Package: aptitude-robot
Severity: grave
Version: 1.3.4-1
Tags: patch

Hi,

since at least the 4th of December 2014 I have noticed hanging
aptitude-robot-session processes on our four machines already running
Jessie. The hangs seem to happen every time there are package updates
available.

The issue has only started recently: we have been running
aptitude-robot-session via cron on Jessie as well as Wheezy systems for
quite a while now. One installation dates back to the 22nd of September
and another one to at least July, when the aptitude-robot version
currently in Jessie was already installed.

Today I was able to reduce the issue to a simple
"yes '' | aptitude install -y" with pending and scheduled updates:

# ps auxwwwf | egrep --color '[a]pt|[d]pkg|[d]ebconf|[y]es'
root      9967  0.0  0.0   8228   656 pts/3  S+   20:28  0:00  |  \_ yes
root      9968 14.3  2.3 212780 76112 pts/3  Sl+  20:28  0:01  |  \_ aptitude install -y
root     10065  2.3  0.0      0     0 ?      Zs   20:28  0:00  |      \_ [dpkg] <defunct>

My test setup for this issue was a Jessie installation, and I used
dh-exec as the package to be upgraded. For that I downloaded the
corresponding dh-exec package for my architecture, one version below
the current version in Jessie, from
http://snapshot.debian.org/package/dh-exec/0.13/. (Jessie currently has
dh-exec 0.14.)

Then I ran the following commands to simulate what makes
aptitude-robot-session hang:

# dpkg -i dh-exec_0.13_amd64.deb
# aptitude install --schedule-only dh-exec
# yes '' | aptitude install -y

If a process hangs as described above, the following suffices to unhang
it:

# ls -l /proc/9968/fd | fgrep /dev/pts
# cat /dev/pts/10

(I.e. check which /dev/pts device aptitude has open and cat that pts.
Both aptitude-robot-session and the cat will exit at about the same
time.)

So the main issue is that aptitude is connected to some terminal and
wants to get rid of some output, but there is nothing which takes that
output.

I'm not yet 100% sure about the correct fix, mostly because I still
have no idea which change made this issue appear.

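The lookup step of the manual workaround above could be scripted
roughly as follows. This is only a sketch: it assumes a Linux /proc,
and "open_targets" is a made-up helper name, not something shipped by
aptitude-robot. It reads the symlinks in /proc/PID/fd instead of
parsing "ls -l" output.

```shell
#!/bin/sh
# Sketch: list what a process has open below a given path prefix by
# reading the symlinks in /proc/PID/fd (Linux-specific).
# "open_targets" is a hypothetical helper, not part of aptitude-robot.
open_targets() {
    pid=$1
    prefix=$2
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails if the descriptor vanished or is not readable
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$prefix"*) printf '%s\n' "$target" ;;
        esac
    done | sort -u
}

# Show the pseudo-terminals held open by the PID given as first
# argument (defaults to the current shell).
open_targets "${1:-$$}" /dev/pts/
```

Called with the PID of the hanging aptitude process, this should print
the /dev/pts device(s) that could then be drained with cat as described
above.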
Besides the manual workaround mentioned above, I've found two other
changes which make the issue vanish in the example above:

1) If I remove the "yes '' |", the issue is gone. But the "yes" is
   necessary for the cases which are not covered by
   --force-conf{def,old}, e.g. first installs with config files already
   present. See commit 169ee18d77a6a80248bdbd1d95cf626638219cb5 and the
   changelog entry for 1.2.15-1.

2) If I add a "< /dev/null" after the aptitude call:

     yes '' | aptitude install -y < /dev/null

   The "< /dev/null" was actually present in aptitude-robot-session
   until the "yes '' |" was added, but it seems that nowadays both are
   necessary to avoid the drawbacks of (1) mentioned above.

So I assume that the following patch will fix the issue:

diff --git a/aptitude-robot-session b/aptitude-robot-session
index 213ce85..39dbfbb 100755
--- a/aptitude-robot-session
+++ b/aptitude-robot-session
@@ -67,7 +67,8 @@ export DEBIAN_PRIORITY
 nice yes '' | \
     /usr/sbin/aptitude-robot -y -q "$@" \
     -o DPkg::Options::=--force-confdef \
-    -o DPkg::Options::=--force-confold
+    -o DPkg::Options::=--force-confold \
+    < /dev/null
 
 if [ -n "$POST_SESSION_HOOK" ]; then
     $POST_SESSION_HOOK
diff --git a/debian/changelog b/debian/changelog
index ca9a7ee..5baf5df 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+aptitude-robot (1.3.4.1-1) UNRELEASED; urgency=medium
+
+  * Fix hanging aptitude-robot-session processes with zombie dpkg children
+    by reintroducing "< /dev/null".
+
+ -- Axel Beckert <a...@debian.org>  Sat, 13 Dec 2014 20:48:10 +0100
+
 aptitude-robot (1.3.4-1) unstable; urgency=low
 
   [ Axel Beckert ]
diff --git a/configure.ac b/configure.ac
index 9325322..c31aef5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1,7 +1,7 @@
 AC_PREREQ([2.67])dnl require version in Debian squeeze (or higher)
 AC_INIT(
     [Aptitude Robot],
-    [1.3.4],
+    [1.3.4.1],
     [el...@heebs.ch],
     [aptitude-robot],
     [https://github.com/elmar/aptitude-robot.git]

But since I've so far only tested it by calling aptitude manually as
shown above, this needs some more testing over the next few days.

Additional things I checked, but which didn't seem to make a
difference:

* Checking all dpkg versions since 1.17.13, because dpkg in
  Testing/Jessie was upgraded from 1.17.13 to 1.17.21 on the 3rd of
  December and the first noticed occurrence of the issue was on the
  morning of the 4th of December.

* Changing needrestart's configuration to $nrconf{restart} = 'a' and
  $nrconf{ui} = 'NeedRestart::UI::stdio';

* Purging needrestart

[Bug report written on a different system than the one where the issue
occurred.]

-- System Information:
Debian Release: 8.0
  APT prefers unstable
  APT policy: (990, 'unstable'), (600, 'testing'), (110, 'experimental'), (109, 'buildd-unstable'), (109, 'buildd-experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 3.17.0-trunk-amd64 (SMP w/4 CPU cores)
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
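
One thing about the "< /dev/null" in the patch above that still needs
checking: as far as I understand POSIX shell semantics, a redirection
on a command in a pipeline is applied after the pipe has been
connected, so the command's stdin would then be /dev/null and the
output of "yes" would not be read at all. A minimal sketch to check
this, with "head -n 1" merely standing in for aptitude (an assumption
for illustration, no real package operation is involved):

```shell
#!/bin/sh
# Which stdin wins when a command is both piped into and redirected?
# "head -n 1" is only a stand-in for aptitude here.
result=$(yes 'from-pipe' | head -n 1 < /dev/null)

if [ -z "$result" ]; then
    echo "stdin was /dev/null; the pipe was not read"
else
    echo "stdin was the pipe: $result"
fi
```

If this reports that the pipe was not read, the patched call would
behave like the variant without "yes" with respect to prompts, so the
config file cases mentioned under (1) should be part of the further
testing.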