<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>@webknjaz's ramblings</title><link href="https://webknjaz.me/prose/" rel="alternate"></link><link href="https://webknjaz.me/feed.xml" rel="self"></link><id>urn:uuid:d007b4c7-be0f-3e68-a60e-ad4b7bea6ad4</id><updated>2025-11-27T01:12:00Z</updated><author><name></name></author><entry><title>Get Off My LAN: Banishing a Google Home Smart Speaker with OpenWRT by Mistake</title><link href="https://webknjaz.me/prose/google-home-in-exile/" rel="alternate"></link><updated>2025-11-27T01:12:00Z</updated><author><name>Sviatoslav</name></author><id>urn:uuid:4afd17d7-ff58-3c6a-bb2c-51ff72d085a8</id><content type="html">&lt;h2&gt;A Story of Exile&lt;/h2&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;em&gt;It's not &lt;a href="https://rfc-annotations.research.icann.org/"&gt;DNS&lt;/a&gt;&lt;/em&gt;&lt;br&gt;
&lt;em&gt;There's no way it's DNS&lt;/em&gt;&lt;br&gt;
&lt;em&gt;&lt;del&gt;It was DNS&lt;/del&gt; but not really, no&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Sometimes you don't mean to exile your smart devices to network purgatory,
but your favorite router operating system gives you the power to do it
anyway. Here's how I accidentally banished all my Google Home speakers
from the internet with a seemingly innocent &lt;a href="https://www.rfc-editor.org/rfc/rfc2131"&gt;DHCP&lt;/a&gt; configuration change.&lt;/p&gt;
&lt;h2&gt;The Unwitting Exile Begins&lt;/h2&gt;
&lt;p&gt;My old Google Home speakers stopped working one day. They'd claim no internet
connectivity on any voice command — as if they'd been cast out from the
network, unable to reach the digital world beyond. Factory resets didn't
help. They'd get stuck during initial setup, insisting they couldn't connect
to Wi-Fi.&lt;/p&gt;
&lt;p&gt;The maddening part? Setting up a mobile hotspot on my phone and pointing
the speakers to it instead of the home network worked just fine.
So the hardware wasn't the culprit. And this also meant that Google hadn't
deprecated them. They just couldn't live on my LAN anymore.&lt;/p&gt;
&lt;h2&gt;Signs of the Banished&lt;/h2&gt;
&lt;p&gt;Initially, I blamed a recent &lt;a href="https://openwrt.org"&gt;OpenWRT&lt;/a&gt; update I installed onto my access
points. The timing seemed suspicious. And the fact that I use my own
&lt;a href="https://openwrt.org/docs/guide-user/additional-software/imagebuilder"&gt;ImageBuilder&lt;/a&gt;-based immutable deployment process wasn't making the
guessing game any easier. But after multiple frustrating troubleshooting
attempts, I noticed something telling in the AP's web interface: the
Wireless page listed the device with its MAC address, but no
corresponding IP or hostname — like a ghost, present but not truly there.&lt;/p&gt;
&lt;p&gt;Yet I had IPs statically assigned in &lt;code&gt;dnsmasq&lt;/code&gt; on the main router. Looking
up the device's MAC address gave me the expected IP, and pinging that IP
worked fine.&lt;/p&gt;
&lt;p&gt;Why would my Google Home speaker think it's air-gapped when I could reach
it over LAN? It was connected yet convinced it wasn't — the perfect
network gaslight.&lt;/p&gt;
&lt;h2&gt;Tracking the Exile Order&lt;/h2&gt;
&lt;p&gt;The AP's logs showed no problems. The device connected to 5 GHz, didn't
like it, disconnected gracefully, and hopped to 2.4 GHz — all normal
behavior for a device trying to find its place.&lt;/p&gt;
&lt;p&gt;Time to intercept the DHCP conversation and see if the speaker was
getting the configuration responses correctly. Here's what I grabbed
with &lt;code&gt;tcpdump&lt;/code&gt; after SSHing into the nearest AP:&lt;/p&gt;
&lt;div class="hll"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;tcpdump&lt;span class="w"&gt; &lt;/span&gt;-i&lt;span class="w"&gt; &lt;/span&gt;br-lan&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;(udp port 67 or port 68 or port 546 or port 547) and ether host f4:f5:de:ad:be:ef&amp;#39;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;-vvv
&lt;span class="go"&gt;01:13:36.373309 IP (tos 0x0, ttl 64, id 13202, offset 0, flags [none], proto UDP (17), length 352)&lt;/span&gt;
&lt;span class="go"&gt;    Google-Home.68 &amp;gt; 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from f4:f5:de:ad:be:ef (oui Unknown), length 324, xid 0xc06f922f, Flags [none] (0x0000)&lt;/span&gt;
&lt;span class="go"&gt;     Client-Ethernet-Address f4:f5:de:ad:be:ef (oui Unknown)&lt;/span&gt;
&lt;span class="go"&gt;     Vendor-rfc1048 Extensions&lt;/span&gt;
&lt;span class="go"&gt;       Magic Cookie 0x63825363&lt;/span&gt;
&lt;span class="go"&gt;       DHCP-Message (53), length 1: Request&lt;/span&gt;
&lt;span class="go"&gt;       Requested-IP (50), length 4: Google-Home&lt;/span&gt;
&lt;span class="go"&gt;       MSZ (57), length 2: 1500&lt;/span&gt;
&lt;span class="go"&gt;       Vendor-Class (60), length 41: &amp;quot;dhcpcd-6.8.2:Linux-3.8.13+:armv7l:Marvell&amp;quot;&lt;/span&gt;
&lt;span class="go"&gt;       Hostname (12), length 11: &amp;quot;Google-Home&amp;quot;&lt;/span&gt;
&lt;span class="go"&gt;       Unknown (145), length 1: 1&lt;/span&gt;
&lt;span class="go"&gt;       Parameter-Request (55), length 9:&lt;/span&gt;
&lt;span class="go"&gt;         Subnet-Mask (1), Static-Route (33), Default-Gateway (3), Domain-Name-Server (6)&lt;/span&gt;
&lt;span class="go"&gt;         Domain-Name (15), BR (28), Lease-Time (51), RN (58)&lt;/span&gt;
&lt;span class="go"&gt;         RB (59)&lt;/span&gt;
&lt;span class="go"&gt;       END (255), length 0&lt;/span&gt;
&lt;span class="go"&gt;01:13:36.377212 IP (tos 0xc0, ttl 64, id 50324, offset 0, flags [none], proto UDP (17), length 373)&lt;/span&gt;
&lt;span class="go"&gt;    turris-omnia-gw.67 &amp;gt; Google-Home.68: [udp sum ok] BOOTP/DHCP, Reply, length 345, xid 0xc06f922f, Flags [none] (0x0000)&lt;/span&gt;
&lt;span class="go"&gt;     Your-IP Google-Home&lt;/span&gt;
&lt;span class="go"&gt;     Server-IP turris-omnia-gw&lt;/span&gt;
&lt;span class="go"&gt;     Client-Ethernet-Address f4:f5:de:ad:be:ef (oui Unknown)&lt;/span&gt;
&lt;span class="go"&gt;     Vendor-rfc1048 Extensions&lt;/span&gt;
&lt;span class="go"&gt;       Magic Cookie 0x63825363&lt;/span&gt;
&lt;span class="go"&gt;       DHCP-Message (53), length 1: ACK&lt;/span&gt;
&lt;span class="go"&gt;       Server-ID (54), length 4: turris-omnia-gw&lt;/span&gt;
&lt;span class="go"&gt;       Lease-Time (51), length 4: 172800&lt;/span&gt;
&lt;span class="go"&gt;       RN (58), length 4: 86400&lt;/span&gt;
&lt;span class="go"&gt;       RB (59), length 4: 151200&lt;/span&gt;
&lt;span class="go"&gt;       Subnet-Mask (1), length 4: 255.255.255.0&lt;/span&gt;
&lt;span class="go"&gt;       BR (28), length 4: 192.168.1.255&lt;/span&gt;
&lt;span class="go"&gt;       Default-Gateway (3), length 4: turris-omnia-gw&lt;/span&gt;
&lt;span class="go"&gt;       Domain-Name-Server (6), length 4: turris-omnia-gw&lt;/span&gt;
&lt;span class="go"&gt;       Domain-Name (15), length 9: &amp;quot;home.lan&amp;quot;&lt;/span&gt;
&lt;span class="go"&gt;       Classless-Static-Route-Microsoft (249), length 9: (10.60.0.1/32:turris-omnia-gw)&lt;/span&gt;
&lt;span class="go"&gt;       Classless-Static-Route (121), length 9: (10.60.0.1/32:turris-omnia-gw)&lt;/span&gt;
&lt;span class="go"&gt;       END (255), length 0&lt;/span&gt;
&lt;span class="go"&gt;^C&lt;/span&gt;
&lt;span class="go"&gt;2 packets captured&lt;/span&gt;
&lt;span class="go"&gt;2 packets received by filter&lt;/span&gt;
&lt;span class="go"&gt;0 packets dropped by kernel&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There's our smoking gun: extra DHCP options &lt;code&gt;121&lt;/code&gt; and &lt;code&gt;249&lt;/code&gt;! I started
remembering that I'd been experimenting with pushing explicit routes to
my ISP's internal network nodes via DHCP — unknowingly drafting the exile
papers for my smart speakers.&lt;/p&gt;
&lt;h2&gt;False Trails in the Wilderness&lt;/h2&gt;
&lt;p&gt;I opened &lt;a href="https://openwrt.org/docs/techref/luci"&gt;LuCI&lt;/a&gt;'s network interfaces page of the router that runs a fork
of OpenWRT and found additional option entries in the advanced DHCP
settings. Deleted them, restarted &lt;code&gt;dnsmasq&lt;/code&gt;, and... nothing happened.
&lt;em&gt;Nothing!&lt;/em&gt; The exile continued.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;121&lt;/code&gt; and &lt;code&gt;249&lt;/code&gt; options were still being sent. The banishment order was
coming from somewhere else.&lt;/p&gt;
&lt;h2&gt;The Hidden Decree&lt;/h2&gt;
&lt;p&gt;Checking the &lt;a href="https://openwrt.org/docs/techref/uci"&gt;UCI&lt;/a&gt; configuration revealed the true source of exile:&lt;/p&gt;
&lt;div class="hll"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;show&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;grep&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;249&lt;/span&gt;
&lt;span class="go"&gt;dhcp.lan.dhcp_option_force=&amp;#39;121,10.60.0.1/32,192.168.1.1&amp;#39; &amp;#39;249,10.60.0.1/32,192.168.1.1&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I'd configured forced DHCP options, apparently — a setting not exposed in
LuCI, only accessible via &lt;code&gt;uci&lt;/code&gt; commands. These were the real culprit,
hidden from the web interface.&lt;/p&gt;
&lt;p&gt;Revoking the exile was simple:&lt;/p&gt;
&lt;div class="hll"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;del&lt;span class="w"&gt; &lt;/span&gt;dhcp.lan.dhcp_option_force
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;dhcp
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;/etc/init.d/dnsmasq&lt;span class="w"&gt; &lt;/span&gt;restart
&lt;/pre&gt;&lt;/div&gt;
&lt;h2&gt;Understanding the Banishment&lt;/h2&gt;
&lt;p&gt;Why did this shut out my Google Homes? &lt;a href="https://datatracker.ietf.org/doc/html/rfc3442#page-5"&gt;RFC 3442&lt;/a&gt; holds the answer under
the DHCP Client Behavior section:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;&lt;em&gt;"If the DHCP server returns both a Classless Static Routes option and
a Router option, the DHCP client MUST ignore the Router option."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;DHCP options &lt;code&gt;121&lt;/code&gt; and &lt;code&gt;249&lt;/code&gt; don't just append routes — they become the
&lt;em&gt;only&lt;/em&gt; routes the client would use if it supports them. By providing
just a path to &lt;code&gt;10.60.0.1/32&lt;/code&gt; without a default route (&lt;code&gt;0.0.0.0/0&lt;/code&gt;), I'd
essentially told my Google Homes: "You can only talk to this one host.
The rest of the internet doesn't exist for you."&lt;/p&gt;
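&lt;p&gt;For the curious, &lt;a href="https://datatracker.ietf.org/doc/html/rfc3442"&gt;RFC 3442&lt;/a&gt; packs each route in the option &lt;code&gt;121&lt;/code&gt;/&lt;code&gt;249&lt;/code&gt; payload as a destination prefix width, the significant octets of the destination, and a four-octet router. Here's a small Python sketch of a decoder (an illustration of the wire format, not code any real DHCP client runs; the sample payload mirrors the route from my capture):&lt;/p&gt;

```python
def decode_classless_routes(data: bytes):
    """Decode a DHCP option 121/249 payload (RFC 3442) into route pairs."""
    routes, i = [], 0
    while i != len(data):
        width = data[i]                       # destination prefix length in bits
        n = (width + 7) // 8                  # significant destination octets
        dest = bytes(data[i + 1:i + 1 + n]).ljust(4, b"\x00")
        router = data[i + 1 + n:i + 5 + n]    # always four octets
        routes.append((".".join(map(str, dest)) + "/%d" % width,
                       ".".join(map(str, router))))
        i += 5 + n
    return routes

# The lone route my router was pushing: 10.60.0.1/32 via 192.168.1.1
print(decode_classless_routes(b"\x20\x0a\x3c\x00\x01\xc0\xa8\x01\x01"))
```

&lt;p&gt;Feeding it a default-route payload (&lt;code&gt;00 c0 a8 01 01&lt;/code&gt;) yields &lt;code&gt;0.0.0.0/0&lt;/code&gt; via &lt;code&gt;192.168.1.1&lt;/code&gt;, the entry my original configuration was missing.&lt;/p&gt;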
&lt;p&gt;It was network isolation through misconfiguration — an accidental digital
exile where the devices were physically connected but logically banished
from the wider network.&lt;/p&gt;
&lt;h2&gt;Granting Amnesty&lt;/h2&gt;
&lt;p&gt;The proper fix maintains the specific route while granting passage back
to the internet:&lt;/p&gt;
&lt;div class="hll"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;add_list&lt;span class="w"&gt; &lt;/span&gt;dhcp.lan.dhcp_option_force&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;121&lt;/span&gt;,10.60.0.1/32,192.168.1.1
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;add_list&lt;span class="w"&gt; &lt;/span&gt;dhcp.lan.dhcp_option_force&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;249&lt;/span&gt;,10.60.0.1/32,192.168.1.1
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;add_list&lt;span class="w"&gt; &lt;/span&gt;dhcp.lan.dhcp_option_force&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;121&lt;/span&gt;,0.0.0.0/0,192.168.1.1
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;add_list&lt;span class="w"&gt; &lt;/span&gt;dhcp.lan.dhcp_option_force&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;249&lt;/span&gt;,0.0.0.0/0,192.168.1.1
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;uci&lt;span class="w"&gt; &lt;/span&gt;commit&lt;span class="w"&gt; &lt;/span&gt;dhcp
&lt;span class="gp"&gt;root@omnia:~# &lt;/span&gt;/etc/init.d/dnsmasq&lt;span class="w"&gt; &lt;/span&gt;restart
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The exile was lifted. My Google Homes could rejoin the digital society
&lt;del&gt;and get back to spying on me 🤪&lt;/del&gt;.&lt;/p&gt;
&lt;h2&gt;Lessons from the Exile&lt;/h2&gt;
&lt;p&gt;OpenWRT gives you the power to banish devices from your network in subtle
ways. DHCP options &lt;code&gt;121&lt;/code&gt;/&lt;code&gt;249&lt;/code&gt; don't supplement routing tables — they replace
them entirely, making them perfect tools for accidental exile. LuCI won't
always show you the full picture; &lt;code&gt;uci show&lt;/code&gt; reveals the hidden decrees.
And while many devices ignored my misconfigured exile order, Google Home's
RFC-compliant DHCP client (&lt;code&gt;dhcpcd-6.8.2&lt;/code&gt;) dutifully accepted its banishment.&lt;/p&gt;
&lt;p&gt;When your smart devices claim they're offline while clearly connected, check
if you've accidentally exiled them. Sometimes the most effective network
isolation is the one you didn't mean to create.&lt;/p&gt;
&lt;p&gt;Well, it wasn't DNS after all — just a routing misconfiguration that prevented
reaching it. Though the haiku's spirit lives on: when troubleshooting network
issues, DNS is always a suspect, even when it's innocent.&lt;/p&gt;
</content></entry><entry><title>ansible-galaxy CLI ❤️ resolvelib</title><link href="https://webknjaz.me/prose/ansible-galaxy-reuses-pips-resolvelib/" rel="alternate"></link><updated>2021-02-17T04:28:00Z</updated><author><name>Sviatoslav Sydorenko</name></author><id>urn:uuid:3ba42a64-327c-3156-b8d9-ababe4ad2582</id><content type="html">&lt;p&gt;Ever since Ansible Collections got introduced, &lt;code&gt;ansible-galaxy
collection install&lt;/code&gt; has had to somehow figure out the whole dependency
tree that it's supposed to download and install. The code we had was
rather entangled.
But things are going to change starting &lt;a href="https://github.com/ansible/ansible"&gt;ansible-core&lt;/a&gt; 2.11.
And here's how.&lt;/p&gt;
&lt;p&gt;One of the &lt;a href="https://docs.ansible.com/ansible-core/devel/roadmap/ROADMAP_2_11.html"&gt;items planned for ansible-core v2.11&lt;/a&gt; was improving &lt;code&gt;ansible-galaxy collection&lt;/code&gt; CLI 💻. The first
thing needed was making it possible to &lt;a href="https://github.com/ansible/ansible/issues/71784"&gt;upgrade collections when using the
&lt;code&gt;install&lt;/code&gt; subcommand without requiring &lt;code&gt;--force&lt;/code&gt; or
&lt;code&gt;--force-with-deps&lt;/code&gt;&lt;/a&gt;. This is something
&lt;a href="https://github.com/ansible/proposals/issues/181"&gt;people have been wanting&lt;/a&gt; for quite a while but
&lt;a href="https://github.com/ansible/proposals/issues/23"&gt;wasn't possible for roles&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Then, we also wanted to introduce an additional &lt;code&gt;ansible-galaxy
collection install [ -U | --upgrade ]&lt;/code&gt; option. And we also considered
working on the new &lt;code&gt;ansible-galaxy collection remove&lt;/code&gt; subcommand but
never had time to complete this stretch goal&lt;sup class="footnote-ref" id="fnref-ansible collection rm"&gt;&lt;a href="#fn-ansible collection rm"&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Another thing on our radar was caching HTTP responses to Galaxy API so
that the dependency resolution process could become dramatically faster.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.bloggingforlogging.com"&gt;Jordan&lt;/a&gt;, &lt;a href="https://github.com/s-hertel"&gt;Sloane&lt;/a&gt; and I formed a feature team to work on this. We
decided we'd try to split the subtasks one per person to spread the
load. &lt;a href="https://www.bloggingforlogging.com"&gt;Jordan&lt;/a&gt; was to work on caching, &lt;a href="https://github.com/s-hertel"&gt;Sloane&lt;/a&gt; was assigned to
do the &lt;code&gt;--upgrade&lt;/code&gt; task and I was supposed to work on updating the bare
&lt;code&gt;install&lt;/code&gt; command to make it not require &lt;code&gt;--force&lt;/code&gt; (and
&lt;code&gt;--force-with-deps&lt;/code&gt; for that matter) when there's a need to update the
already installed collections.&lt;/p&gt;
&lt;p&gt;I was almost unfamiliar with this part of &lt;a href="https://github.com/ansible/ansible"&gt;ansible-core&lt;/a&gt;, so I started
by exploring my colleagues' pointers on which functions would likely
need updates. What could
possibly go wrong? Well, as I was going deeper and deeper down the
rabbit hole, I realized that there was a lot of complexity in the
existing code and we basically had a rather simplistic dependency
resolver that looked like a yarn of leaky abstractions 🤯. It was hard
to reason about what strategies it follows to get all the transitive
dependencies for collections requested to be installed or downloaded.
At the same time, I remembered that there is this other prominent
project in the Python ecosystem — &lt;a href="https://pip.pypa.io"&gt;pip&lt;/a&gt; — that recently got a fresh-out-of-the-oven
dependency resolver based on &lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt;. It's a third-party library
that &lt;a href="https://pip.pypa.io"&gt;pip&lt;/a&gt; bundles but it's also freely available for use via &lt;code&gt;pip
install&lt;/code&gt;. My buddy &lt;a href="https://pradyunsg.me"&gt;Pradyun&lt;/a&gt; has been involved with this effort (the
&lt;a href="https://pip.pypa.io"&gt;pip&lt;/a&gt; one) for about four years so I had somebody I could ask dumb
questions about the dependency resolution :)&lt;/p&gt;
&lt;p&gt;And so the idea to replace the dependency resolver was born. Instead of
patching a few places in the old code here and there, I thought why
don't I &lt;del&gt;overengineer this task and refactor the whole thing&lt;/del&gt; improve
the maintainability of the subpackage dedicated to managing collection
CLI subcommands!&lt;/p&gt;
&lt;p&gt;I must say that my enthusiasm to &lt;del&gt;break all the things&lt;/del&gt; refine a whole
bunch of already working code was met with a lot of suspicion within the
broader Ansible Core Engineering team, at first. This additionally meant
introducing a new runtime dependency — something that we almost never
do. We now have a good mechanism to help OS packagers seamlessly bundle
runtime dependencies, though&lt;sup class="footnote-ref" id="fnref-_vendor/ dir"&gt;&lt;a href="#fn-_vendor/ dir"&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;I was faced with a challenge — &lt;em&gt;I knew&lt;/em&gt; that the idea was good and now I had
to convince others that it's not as crazy as it may seem.&lt;/p&gt;
&lt;p&gt;I switched into the research mode, looked into what interfaces and hooks
&lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt; requires and came up with a tiny 225 LoC long
&lt;a href="https://github.com/webknjaz/ansible-galaxy-collection-resolver/blob/master/__main__.py"&gt;proof-of-concept&lt;/a&gt;. I even &lt;a href="https://github.com/webknjaz/ansible-galaxy-collection-resolver/blob/master/.github/workflows/demo.yml"&gt;wired the
demo into GitHub Actions CI/CD&lt;/a&gt; so people
could see the result instantly. After that, when folks saw how easy it was to
connect &lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt; and delegate the resolution correctness
responsibility to it, the team agreed that this refactoring would be
useful and we should proceed.&lt;/p&gt;
&lt;p&gt;Meanwhile &lt;a href="https://www.bloggingforlogging.com"&gt;Jordan&lt;/a&gt; was working on his caching task. So while I was busy
figuring out where to stick &lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt; into our spaghetti, &lt;a href="https://www.bloggingforlogging.com"&gt;Jordan&lt;/a&gt;
submitted &lt;a href="https://github.com/ansible/ansible/pull/71904"&gt;the API HTTP request caching PR&lt;/a&gt; and
it got merged without any problems.&lt;/p&gt;
&lt;p&gt;The resolver replacement work was so fundamental that it turned out to
block virtually everything else related to our ansible-galaxy CLI UX
improvements. This was no longer just my task. Yes, I was making most of
the design for the new architecture, but I got an enormous amount of
help getting this to the finish line. And I enjoyed this collaboration
so much!&lt;/p&gt;
&lt;p&gt;It wasn't just throwing old code away and adding the new one in place.
One of our main objectives was to keep the behavior as close as possible
to what the old code did. We identified a lot of redundant tests that
could be removed and rewrote some of the unit tests as integration tests.
We've also identified a ton of gaps in the test coverage which we filled
in with many new tests (yaaay! 🙌). &lt;a href="https://github.com/s-hertel"&gt;Sloane&lt;/a&gt; also did a lot of manual
behavior verification and testing 👏.&lt;/p&gt;
&lt;h3&gt;resolvelib and fancy design patterns&lt;/h3&gt;
&lt;p&gt;I mentioned earlier that &lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt; was easy to integrate and even
linked that extremely short PoC. This may create the illusion that it was
a "5-minute patch", but it totally wasn't.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/sarugaku/resolvelib"&gt;resolvelib&lt;/a&gt; requires one to implement an interface they call "provider"
with the following hooks:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;identify(requirement_or_candidate)&lt;/code&gt; — returns a unique identifier for
the package (FQCN in our case)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;get_preference(resolution, candidates, information)&lt;/code&gt; — makes a sort
key determining the "importance" of a certain requirement&lt;/li&gt;
&lt;li&gt;&lt;code&gt;find_matches(requirements)&lt;/code&gt; — returns all candidates matching the
given requirements&lt;/li&gt;
&lt;li&gt;&lt;code&gt;is_satisfied_by(requirement, candidate)&lt;/code&gt; — double-checks the
correctness of the candidates resolver chooses&lt;/li&gt;
&lt;li&gt;&lt;code&gt;get_dependencies(candidate)&lt;/code&gt; — retrieves all the direct requirements
that given candidate has&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This doesn't look too complicated, does it? That's because resolvelib
really doesn't care what your requirements and candidates are, as
long as you keep interfacing with it via the same data structures.&lt;/p&gt;
&lt;p&gt;This also means that the resolver &lt;em&gt;doesn't know where to get the info
about the requirements and the candidates&lt;/em&gt; beyond the data you provide
to it by implementing these hooks.&lt;/p&gt;
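&lt;p&gt;To make the hooks concrete, here's a toy sketch in plain Python. The method names mirror the provider interface listed above, but the collection index, the tuple shapes and the greedy resolution loop are all made up for illustration; real resolvelib does proper backtracking, which this naive walk does not:&lt;/p&gt;

```python
# A made-up index: FQCN mapped to {version: [(dep_fqcn, dep_version), ...]}
INDEX = {
    "community.general": {"1.0.0": [("ansible.netcommon", "1.0.0")]},
    "ansible.netcommon": {"1.0.0": []},
}

class ToyProvider:
    """Implements the five hooks over the toy index above."""

    def identify(self, requirement_or_candidate):
        return requirement_or_candidate[0]  # the FQCN

    def get_preference(self, resolution, candidates, information):
        return len(candidates)  # resolve the most constrained name first

    def find_matches(self, requirements):
        fqcn = requirements[0][0]
        return [(fqcn, version) for version in INDEX.get(fqcn, {})]

    def is_satisfied_by(self, requirement, candidate):
        return requirement[1] in (None, candidate[1])

    def get_dependencies(self, candidate):
        fqcn, version = candidate
        return INDEX[fqcn][version]

def greedy_resolve(provider, requirements):
    """Naive resolution driven purely through the provider hooks."""
    pins, queue = {}, list(requirements)
    while queue:
        requirement = queue.pop(0)
        name = provider.identify(requirement)
        if name in pins:
            continue  # a real resolver would re-check compatibility here
        candidate = next(
            c for c in provider.find_matches([requirement])
            if provider.is_satisfied_by(requirement, c)
        )
        pins[name] = candidate
        queue.extend(provider.get_dependencies(candidate))
    return pins

print(greedy_resolve(ToyProvider(), [("community.general", None)]))
```

&lt;p&gt;In the actual code, these same hooks end up talking to Galaxy-like APIs and other metadata sources instead of an in-memory dict.&lt;/p&gt;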
&lt;p&gt;So we needed to implement talking to the Galaxy API, taking into account more
than one Galaxy-like server as a source for retrieving collections. We
needed to take into account non-Galaxy provided artifacts like direct
URLs to tarballs or Git repos, or local files and folders.&lt;/p&gt;
&lt;p&gt;This all could easily increase the complexity so I introduced the
concepts of a concrete artifacts manager, and a facade for talking to
multiple Galaxy APIs and other metadata sources (including the artifacts
manager). The artifacts manager is responsible for downloading and
caching the artifacts (if they are not local) as well as retrieving (and
caching) their metadata. It also has an alternative constructor that can
clean up the cache directory upon exit.
Both objects are initialized once (at the beginning) and are passed to
the consumers via dependency injection.&lt;/p&gt;
&lt;p&gt;Most of the packaging ecosystems are rather simple. They have packages
with the content of one "atom" inside the artifact. Ansible Collections
are mostly like that but there are additional cases which make
everything substantially more complex. One of the primary use cases
that differs is SCM-based collections — they may have one collection in
the root of the repository but also in a certain (user-defined)
subdirectory. Moreover, SCM targets may have multiple collections inside
the same repository (in a namespace subdir that also can be nested as
defined by the repo creators).
To solve this, we mark Git targets as "virtual collections" during the
dependency resolution. The artifacts manager downloads them into a
temporary directory and marks that directory as a single dependency of
such a "virtual Git collection". If there are subdirs, we do the same "virtual
collection" trick with them (except unpacked dirs don't need to be
copied into the cache; the manager just holds their real paths in memory).
These "virtual collections" are very helpful during the resolution and
are skipped on the install step (after the resolution is complete).&lt;/p&gt;
&lt;h3&gt;Fin.&lt;/h3&gt;
&lt;p&gt;Well, that's about it. 3–4 months into experimentation, development,
testing, polishing and reviews, days before the feature freeze, and the
feature is in devel!&lt;/p&gt;
&lt;p&gt;Based on the refactoring, &lt;a href="https://github.com/s-hertel"&gt;Sloane&lt;/a&gt; was able to complete her work on the
&lt;code&gt;--upgrade&lt;/code&gt; option and it got merged too.&lt;/p&gt;
&lt;h3&gt;Feedback, please 🙏&lt;/h3&gt;
&lt;p&gt;If you are an end-user who uses &lt;code&gt;ansible-galaxy collection
[download|install|list|verify]&lt;/code&gt; subcommands, please make sure to tell us
how well we managed to mix refactoring with the feature development this
time. Hopefully, we've squashed all the bugs already 🤞 but if we missed
anything — &lt;a href="https://github.com/ansible/ansible/issues/new/choose"&gt;let us know&lt;/a&gt;! 🖖&lt;/p&gt;
&lt;div class="footnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;&lt;li id="fn-ansible collection rm"&gt;&lt;p&gt;&lt;a href="https://github.com/ansible/ansible/pull/73464"&gt;The attempt to implement it&lt;/a&gt; revealed that we need more design discussion to define how exactly &lt;code&gt;ansible-galaxy collection uninstall&lt;/code&gt; is supposed to work.&lt;a href="#fnref-ansible collection rm" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-_vendor/ dir"&gt;&lt;p&gt;The downstream packagers can now just drop the external dependencies into &lt;code&gt;lib/ansible/_vendor/&lt;/code&gt; transparently instead of packaging them separately, &lt;a href="https://github.com/ansible/ansible/pull/69850"&gt;starting ansible-base v2.10&lt;/a&gt;.&lt;a href="#fnref-_vendor/ dir" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</content></entry><entry><title>Et Tu Brutè? Use Travis CI for FOSS no more.</title><link href="https://webknjaz.me/prose/et-tu-brute-use-travis-ci-for-foss-no-more/" rel="alternate"></link><updated>2020-11-13T01:51:00Z</updated><author><name>Sviatoslav</name></author><id>urn:uuid:89011e35-9af1-3ddd-9fab-13206867c0cc</id><content type="html">&lt;p&gt;Something sadly&lt;sup class="footnote-ref" id="fnref-kill-foss-sentiment"&gt;&lt;a href="#fn-kill-foss-sentiment"&gt;1&lt;/a&gt;&lt;/sup&gt; expected has happened
at the beginning of the last week. Travis CI — a pioneer
in the field that brought automated testing seamlessly
integrated with GitHub to &lt;del&gt;the masses&lt;/del&gt; many open-source
projects at scale — announced that they were migrating all
of the public projects that previously got the service for
free to a trial plan with a limited amount of toy credits to
use&lt;sup class="footnote-ref" id="fnref-anti-foss"&gt;&lt;a href="#fn-anti-foss"&gt;2&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Ever since they were acquired&lt;sup class="footnote-ref" id="fnref-sinking-idera"&gt;&lt;a href="#fn-sinking-idera"&gt;3&lt;/a&gt;&lt;/sup&gt; by a company
with a rather shady past&lt;sup class="footnote-ref" id="fnref-shady-idera"&gt;&lt;a href="#fn-shady-idera"&gt;4&lt;/a&gt;&lt;/sup&gt;, Travis CI kept
going down this path. They've been shrinking the resources
again and again over the past few years and it seems this
sort of outcome was inevitable, especially once they'd
suddenly laid off a lot of senior engineers&lt;sup class="footnote-ref" id="fnref-layoff-idera"&gt;&lt;a href="#fn-layoff-idera"&gt;5&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Of course, they've made a rather pathetic attempt to assure
everyone that they'll continue to support open source. But
who are they kidding? I bet most of the maintainers have
better things to do than go begging for a bunch of
free credits every now and then just to keep things running.
Folks keep underestimating the &lt;a href="https://twitter.com/di_codes/status/1326952200413786112"&gt;FOSS maintenance effort&lt;/a&gt; and
it even seems like projects using &lt;a href="https://tidelift.com/subscription/pkg/pypi-cheroot?utm_source=pypi-cheroot&amp;amp;utm_medium=referral&amp;amp;utm_campaign=blog"&gt;Tidelift&lt;/a&gt; may be
ineligible&lt;sup class="footnote-ref" id="fnref-travis-uncommercial-foss"&gt;&lt;a href="#fn-travis-uncommercial-foss"&gt;6&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;h3&gt;Now What?&lt;/h3&gt;
&lt;p&gt;Even before the acquisition, there had been signs that Travis
CI wasn't doing well. There are a lot of articles on setting
up other CIs like &lt;a href="https://hynek.me/articles/simple-python-azure-pipelines/"&gt;Azure Pipelines&lt;/a&gt; or &lt;a href="https://hynek.me/articles/python-github-actions/"&gt;GitHub Actions CI/CD
Workflows&lt;/a&gt;. Most of the alternative options provide a
comparable experience but may have slightly different ways
of being set up. There are also even more powerful CIs like
&lt;a href="https://zuul-ci.org"&gt;Zuul&lt;/a&gt; that are available to significant FOSS projects.&lt;/p&gt;
&lt;h3&gt;So can we dump Travis CI yet?&lt;/h3&gt;
&lt;p&gt;Yes, we totally can do that! Should we, though? I'm
personally planning to stop advocating for using Travis CI
if it's not necessary for a given project. Back in the day,
I even contributed a GitHub Pages deployment provider to
their dpl project, so I feel a little nostalgic... I used to
give my mentees Travis-based linting CI setup tasks.
But now, I don't want to advertise a FOSS-unfriendly lock-in
so I'll switch to &lt;a href="https://hynek.me/articles/python-github-actions/"&gt;GitHub Actions CI/CD Workflows&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There's one case in which I may still consider using Travis CI
in addition to other systems — when I'd want to run tests in
environments (architectures) that none of the other CIs provide.
But this will be decided on a per-project basis.&lt;/p&gt;
&lt;p&gt;Huh. I guess that's all I've been wanting to write. 🖖&lt;/p&gt;
&lt;div class="footnotes"&gt;
&lt;hr&gt;
&lt;ol&gt;&lt;li id="fn-kill-foss-sentiment"&gt;&lt;p&gt;A lot of people got upset because of this &lt;a href="https://twitter.com/mitsuhiko/status/1323223738247192576"&gt;Travis CI fuckup&lt;/a&gt;.&lt;a href="#fnref-kill-foss-sentiment" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-anti-foss"&gt;&lt;p&gt;&lt;a href="https://blog.travis-ci.com/2020-11-02-travis-ci-new-billing"&gt;Travis gave everyone 10K credits&lt;/a&gt; and suggested that people would need to switch to a paid plan after that.&lt;a href="#fnref-anti-foss" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-sinking-idera"&gt;&lt;p&gt;Travis &lt;a href="https://blog.travis-ci.com/2019-01-23-travis-ci-joins-idera-inc"&gt;announced the acquisition by Idera&lt;/a&gt; on Jan 23, 2019.&lt;a href="#fnref-sinking-idera" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-shady-idera"&gt;&lt;p&gt;Some people on HN seem to have previous experience with &lt;a href="https://news.ycombinator.com/item?id=18978346"&gt;Idera ruining their acquired businesses&lt;/a&gt;. Also, they've announced all sorts of commitments like keeping the &lt;a href="https://foundation.travis-ci.org"&gt;Travis Foundation&lt;/a&gt; alive and now, almost two years in, that domain is dead and googling doesn't even find any mentions of it.&lt;a href="#fnref-shady-idera" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-layoff-idera"&gt;&lt;p&gt;Idera &lt;a href="https://news.ycombinator.com/item?id=19218036"&gt;removed many essensial employees&lt;/a&gt; without a warning and folk on Twitter call this the &lt;a href="https://twitter.com/kylemh_/status/1323494924306710528"&gt;last nail&lt;/a&gt; in the coffin.&lt;a href="#fnref-layoff-idera" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li id="fn-travis-uncommercial-foss"&gt;&lt;p&gt;It's not yet clear but people on Twitter speculate that &lt;a href="https://twitter.com/hugovk/status/1326935425903185920"&gt;Tidelift-baked FOSS projects may not get free credits&lt;/a&gt;&lt;a href="#fnref-travis-uncommercial-foss" class="footnote"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</content></entry><entry><title>Hello Website. Again.</title><link href="https://webknjaz.me/prose/brave-new-world/" rel="alternate"></link><updated>2020-11-01T23:06:00Z</updated><author><name>Sviatoslav</name></author><id>urn:uuid:d0e0788e-fe1d-3052-b17d-3bb572d381ef</id><content type="html">&lt;p&gt;Once upon a time, I used to have a blog. I lost it at some point.
But here I am again. Starting over... As if this time I won't have
excuses to postpone writing new posts. The &lt;a href="https://throwgrammarfromthetrain.blogspot.com/2010/10/definition-of-insanity.html"&gt;definition of insanity&lt;/a&gt;,
huh?&lt;/p&gt;
&lt;p&gt;Anyway. It's now time to start over. The old blog used &lt;a href="https://wordpress.com"&gt;WordPress&lt;/a&gt; and
thus required a web-server with some software like nginx and php-fpm.
What sounded necessary back in the day seems ridiculous today.
Personal blogs don't need to generate pages on the fly or hit a
separate database. All that's needed is some static site generator.
And so I chose &lt;a href="https://getlektor.com"&gt;Lektor&lt;/a&gt; for this purpose. It's pythonic and highly
customizable — just what one needs to run a blog that can be published
to &lt;a href="https://pages.github.com"&gt;GitHub Pages&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Urgh... That's a pretty long backstory so I'll stop here.&lt;/p&gt;
&lt;p&gt;Welcome to my blog!&lt;/p&gt;
</content></entry></feed>