On 2023-01-09 14:08:06, Daniel Swarbrick wrote:
> Hi Eric,
>
> Thanks for the detailed bug report. As this is something which can
> theoretically affect _any_ apt-based distributed (i.e., derivatives of
> Debian), I feel that it should ideally be reported upstream.

I'm curious here, actually: which upstream are you thinking of? Because
I have the suspicion this is actually a python3-apt bug rather than
something specific to this exporter...

> I personally run this textfile collector on a Debian bookworm system, as
> well as apticron - so this is (I think) a similar scenario where two
> independent processes are periodically updating the apt cache, and I
> wondered whether that was wise or not. I have seen the textfile
> collector block only once so far.

We're seeing repeated problems with this here. We manage a fleet of
about 90 Debian installations, of which 42 have been upgraded to
bookworm and are showing symptoms. Those machines have an hourly legacy
cron job that updates the apt cache for another monitoring system, with
`apt update -qq`. Since we upgraded to bookworm, we have had 95 warnings
from cron, some of them repeating for hours on end. One box in
particular hung on that lock for over *two days*.

> The apt.sh script which apt_info.py replaces only executed "apt-get
> --just-print" - so even if executed as root, it never tried to update
> the apt cache. In fact, unless you had something else like apticron to
> periodically update the apt cache, apt.sh would return stale information.

That does seem suboptimal, that said. :)

> I guess that a simple workaround would be to tweak the systemd service
> so that apt_info.py is executed as an unprivileged user, which would be
> unable to update the cache, and theoretically avoid any potential for a
> deadlock. Perhaps a recommendation to the upstream developer could be
> made, e.g. to add a command-line argument to the script so that it
> wouldn't try to update the cache even when executed as root.

Surely there should be a timeout on this script or something? Why does
it hang in the first place?

I'll investigate a little bit further.
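
To illustrate the kind of timeout meant above, here is a minimal sketch
that bounds how long an external command may run. It is not how
apt_info.py works today, and it does not explain the hang; the helper
name `run_with_timeout` and the five-minute limit in the usage comment
are assumptions made up for this example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return its exit code, or None if it ran longer than timeout_s."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the direct child and waits for it
        # before re-raising, so no zombie is left behind here.
        return None
    return completed.returncode


# Hypothetical usage for the hourly cron job (limit chosen arbitrarily):
#   rc = run_with_timeout(["apt", "update", "-qq"], timeout_s=300)
#   if rc is None:
#       print("apt update timed out", file=sys.stderr)
```

Note that this only kills the direct child process; any helper
processes it spawned would have to be dealt with separately, and a
timeout would only mask whatever is holding the lock.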