[for anyone reading this in the BTS, Ian is responding to http://lists.debian.org/debian-dpkg/2007/08/msg00024.html]
On Tue August 28 2007 05:27:52 am Ian Jackson wrote: > Bruce Sass writes ("start-stop-daemon can't stop process"): > > I have an exim4 daemon (bug #396944) which doesn't stop... > > The exim4 binary on disk has been changed and now the running program > doesn't have the same binary as is on disk. > > This was probably because you upgraded exim4 but the exim4 maintainer > scripts should have arranged to stop the daemon before the upgrade. The exim4 maintainer scripts do stop the daemon before unpacking, s-s-d failed at that point. Here is the debug output of the attempt... Preparing to replace exim4-config 4.67-7 (using .../exim4-config_4.67-8_all.deb) ... Unpacking replacement exim4-config ... now debugging /var/lib/dpkg/info/exim4-config.postrm upgrade 4.67-8 + [ upgrade = purge ] Preparing to replace exim4-base 4.67-7 (using .../exim4-base_4.67-8_i386.deb) ... Unpacking replacement exim4-base ... now debugging /var/lib/dpkg/info/exim4-base.postrm upgrade 4.67-8 + [ upgrade = purge ] Preparing to replace exim4-daemon-light 4.67-7 (using .../exim4-daemon-light_4.67-8_i386.deb) ... now debugging /var/lib/dpkg/info/exim4-daemon-light.prerm upgrade 4.67-8 + [ -x /etc/init.d/exim4 ] + [ -n 1 ] + netstat -tulpen Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name tcp 0 0 127.0.0.1:2208 0.0.0.0:* LISTEN 0 2909928 13195/hpiod tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN 0 2488620 - tcp 0 0 0.0.0.0:24837 0.0.0.0:* LISTEN 1000 5323638 697/knotes tcp 0 0 0.0.0.0:57638 0.0.0.0:* LISTEN 0 2488669 18734/rpc.mountd tcp 0 0 0.0.0.0:5800 0.0.0.0:* LISTEN 1000 5323696 640/kded [kdeinit] tcp 0 0 0.0.0.0:5900 0.0.0.0:* LISTEN 1000 5323702 640/kded [kdeinit] tcp 0 0 127.0.0.1:56204 0.0.0.0:* LISTEN 107 2909964 13198/python tcp 0 0 0.0.0.0:59853 0.0.0.0:* LISTEN 0 2333 - tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 0 1728804 31135/portmap tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 0 3133 1407/boa tcp 0 0 0.0.0.0:6001 0.0.0.0:* LISTEN 0 5322993 589/X tcp 0 0 0.0.0.0:113 0.0.0.0:* LISTEN 0 3575 1740/inetd tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 0 704876 19851/sshd tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 0 3555360 13218/cupsd tcp 0 0 0.0.0.0:39095 0.0.0.0:* LISTEN 0 2487874 18045/rpc.statd tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN 0 3441 1685/exim4 tcp 0 0 0.0.0.0:57146 0.0.0.0:* LISTEN 1000 5323496 685/artsd tcp 0 0 0.0.0.0:7741 0.0.0.0:* LISTEN 0 2790100 4393/lisa udp 0 0 0.0.0.0:2049 0.0.0.0:* 0 2488619 - udp 0 0 0.0.0.0:32770 0.0.0.0:* 0 3503 - udp 0 0 0.0.0.0:517 0.0.0.0:* 0 2790848 1740/inetd udp 0 0 0.0.0.0:518 0.0.0.0:* 0 2790867 1740/inetd udp 0 0 127.0.0.1:161 0.0.0.0:* 0 797911 31565/snmpd udp 0 0 0.0.0.0:177 0.0.0.0:* 0 4207 1940/kdm udp 0 0 0.0.0.0:7741 0.0.0.0:* 0 2790423 4393/lisa udp 0 0 0.0.0.0:60225 0.0.0.0:* 0 2487871 18045/rpc.statd udp 0 0 0.0.0.0:60226 0.0.0.0:* 0 2488664 18734/rpc.mountd udp 0 0 0.0.0.0:68 0.0.0.0:* 0 2210 924/dhclient udp 0 0 0.0.0.0:837 0.0.0.0:* 0 2487862 18045/rpc.statd udp 0 0 0.0.0.0:36704 0.0.0.0:* 114 2607656 9308/avahi-daemon: udp 0 0 0.0.0.0:5353 0.0.0.0:* 114 2607655 9308/avahi-daemon: udp 0 0 127.0.0.1:3049 0.0.0.0:* 0 4073 1881/cfsd udp 0 0 0.0.0.0:111 0.0.0.0:* 0 1728803 31135/portmap udp 0 0 0.0.0.0:631 0.0.0.0:* 0 3555363 13218/cupsd udp 0 0 192.168.0.255:123 0.0.0.0:* 0 3114817 13551/ntpd udp 0 0 127.0.0.1:123 0.0.0.0:* 0 3114777 13551/ntpd udp 0 0 192.168.0.102:123 0.0.0.0:* 0 3114776 13551/ntpd udp 0 0 0.0.0.0:123 0.0.0.0:* 0 3114774 13551/ntpd + ls -al /var/run/exim4/ total 5 drwxr-x--- 2 Debian-exim Debian-exim 72 Jul 26 01:55 . drwxrwxr-x 23 root staff 1192 Aug 24 02:54 .. -rw-r--r-- 1 root Debian-exim 5 Jul 26 01:55 exim.pid + cat /var/run/exim4/exim.pid 1685 + pidof exim4 1685 + command -v invoke-rc.d + invoke-rc.d exim4 stop now debugging /etc/init.d/exim4 stop + ENV=env -i LANG=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11 + QUEUERUNNER=combined + QUEUEINTERVAL=30m + UPEX4OPTS= + PIDFILE=/var/run/exim4/exim.pid + QRPIDFILE=/var/run/exim4/eximqr.pid + [ -f /etc/default/exim4 ] + . /etc/default/exim4 + EX4DEF_VERSION= + QUEUERUNNER=combined + QUEUEINTERVAL=30m + COMMONOPTIONS= + QUEUERUNNEROPTIONS= + QFLAGS= + SMTPLISTENEROPTIONS= + [ -f /etc/inetd.conf ] + grep -E -q ^[[:space:]]*((\*|[[:alnum:].-]+):)?smtp /etc/inetd.conf + DAEMON=/usr/sbin/exim4 + NAME=exim4 + test -x /usr/lib/exim4/exim4 + log_daemon_msg Stopping MTA + [ -z Stopping MTA ] + [ -z ] + echo -n Stopping MTA: Stopping MTA:+ return + stop_exim + [ -f /var/run/exim4/eximqr.pid ] + [ -f /var/run/exim4/exim.pid ] + start-stop-daemon --stop --oknodo --pidfile /var/run/exim4/exim.pid --oknodo --retry 30 --exec /usr/sbin/exim4 No /usr/sbin/exim4 found running; none killed. + log_progress_msg exim4_listener + [ -z exim4_listener ] + echo -n exim4_listener exim4_listener+ rm -f /var/run/exim4/eximqr.pid /var/run/exim4/exim.pid + log_end_msg 0 + [ -z 0 ] + log_use_fancy_output + TPUT=/usr/bin/tput + EXPR=/usr/bin/expr + [ xxterm != xdumb ] + [ -x /usr/bin/tput ] + [ -x /usr/bin/expr ] + /usr/bin/tput hpa 60 + /usr/bin/tput setaf 1 + [ -z ] + FANCYTTY=1 + true + /usr/bin/tput setaf 1 + RED= + /usr/bin/tput op + NORMAL= + [ 0 -eq 0 ] + echo . . + return 0 + warn_paniclog + [ -s /var/log/exim4/paniclog ] + exit 0 + [ -n 1 ] + netstat -tulpen Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name tcp 0 0 127.0.0.1:2208 0.0.0.0:* LISTEN 0 2909928 13195/hpiod tcp 0 0 0.0.0.0:2049 0.0.0.0:* LISTEN 0 2488620 - tcp 0 0 0.0.0.0:24837 0.0.0.0:* LISTEN 1000 5323638 697/knotes tcp 0 0 0.0.0.0:57638 0.0.0.0:* LISTEN 0 2488669 18734/rpc.mountd tcp 0 0 0.0.0.0:5800 0.0.0.0:* LISTEN 1000 5323696 640/kded [kdeinit] tcp 0 0 0.0.0.0:5900 0.0.0.0:* LISTEN 1000 5323702 640/kded [kdeinit] tcp 0 0 127.0.0.1:56204 0.0.0.0:* LISTEN 107 2909964 13198/python tcp 0 0 0.0.0.0:59853 0.0.0.0:* LISTEN 0 2333 - tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 0 1728804 31135/portmap tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 0 3133 1407/boa tcp 0 0 0.0.0.0:6001 0.0.0.0:* LISTEN 0 5322993 589/X tcp 0 0 0.0.0.0:113 0.0.0.0:* LISTEN 0 3575 1740/inetd tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 0 704876 19851/sshd tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 0 3555360 13218/cupsd tcp 0 0 0.0.0.0:39095 0.0.0.0:* LISTEN 0 2487874 18045/rpc.statd tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN 0 3441 1685/exim4 tcp 0 0 0.0.0.0:57146 0.0.0.0:* LISTEN 1000 5323496 685/artsd tcp 0 0 0.0.0.0:7741 0.0.0.0:* LISTEN 0 2790100 4393/lisa udp 0 0 0.0.0.0:2049 0.0.0.0:* 0 2488619 - udp 0 0 0.0.0.0:32770 0.0.0.0:* 0 3503 - udp 0 0 0.0.0.0:517 0.0.0.0:* 0 2790848 1740/inetd udp 0 0 0.0.0.0:518 0.0.0.0:* 0 2790867 1740/inetd udp 0 0 127.0.0.1:161 0.0.0.0:* 0 797911 31565/snmpd udp 0 0 0.0.0.0:177 0.0.0.0:* 0 4207 1940/kdm udp 0 0 0.0.0.0:7741 0.0.0.0:* 0 2790423 4393/lisa udp 0 0 0.0.0.0:60225 0.0.0.0:* 0 2487871 18045/rpc.statd udp 0 0 0.0.0.0:60226 0.0.0.0:* 0 2488664 18734/rpc.mountd udp 0 0 0.0.0.0:68 0.0.0.0:* 0 2210 924/dhclient udp 0 0 0.0.0.0:837 0.0.0.0:* 0 2487862 18045/rpc.statd udp 0 0 0.0.0.0:36704 0.0.0.0:* 114 2607656 9308/avahi-daemon: udp 0 0 0.0.0.0:5353 0.0.0.0:* 114 2607655 9308/avahi-daemon: udp 0 0 127.0.0.1:3049 0.0.0.0:* 0 4073 1881/cfsd udp 0 0 0.0.0.0:111 0.0.0.0:* 0 1728803 31135/portmap udp 0 0 0.0.0.0:631 0.0.0.0:* 0 3555363 13218/cupsd udp 0 0 192.168.0.255:123 0.0.0.0:* 0 3114817 13551/ntpd udp 0 0 127.0.0.1:123 0.0.0.0:* 0 3114777 13551/ntpd udp 0 0 192.168.0.102:123 0.0.0.0:* 0 3114776 13551/ntpd udp 0 0 0.0.0.0:123 0.0.0.0:* 0 3114774 13551/ntpd + ls -al /var/run/exim4/ total 1 drwxr-x--- 2 Debian-exim Debian-exim 48 Aug 24 02:59 . drwxrwxr-x 23 root staff 1192 Aug 24 02:54 .. + cat /var/run/exim4/exim.pid cat: /var/run/exim4/exim.pid: No such file or directory dpkg: warning - old pre-removal script returned error exit status 1 dpkg - trying script from the new package instead ... now debugging /var/lib/dpkg/tmp.ci/prerm failed-upgrade 4.67-7 dpkg: ... it looks like that went OK. Unpacking replacement exim4-daemon-light ... Preparing to replace exim4 4.67-7 (using .../archives/exim4_4.67-8_all.deb) ... Unpacking replacement exim4 ... ...and... $ dpkg -S /usr/sbin/exim4 exim4-daemon-light: /usr/sbin/exim4 Hmm, here's a mystery to me... ps reports the exim4 daemon to be process 1685, and s-s-d finds 1685 to be /usr/sbin/.nfs00035c1d000000a3 [see the debian-dpkg post mentioned at the top of this message] $ ls -l /usr/sbin/.nfs00035c1d000000a3 -rwsr-xr-x 1 root root 672764 Jul 14 01:42 /usr/sbin/.nfs00035c1d000000a3 # zgrep "exim4-daemon-light" dpkg.log* dpkg.log:2007-08-24 02:59:01 upgrade exim4-daemon-light 4.67-7 4.67-8 ... dpkg.log.1:2007-07-17 00:40:24 upgrade exim4-daemon-light 4.67-5 4.67-7 ... dpkg.log.2.gz:2007-06-18 22:15:10 upgrade exim4-daemon-light 4.67-1 4.67-3 ... $ ls -l /var/log/dmesg -rw-r--r-- 1 root adm 14642 Jul 26 01:55 dmesg So, /usr/sbin/exim4 changed on disk prior to the previous (successful, 2007-07-17) upgrade and it survived a reboot. ...wtf? I've never been sure of what the .nfs* files are, but /etc/cron.daily/chkrootkit makes note of those it finds and I've noticed it never finds any immediately after a reboot---so it is a little surprising to discover one apparently older than the last reboot. > Looking at the bug report I think Marc Haber's explanation is > plausible. I don't think there's evidence of a bug in > start-stop-daemon. Ya, Marc asked me if there was any correlation between the lsb-base bug and exim4 upgrade failures. After looking at snapshot.debian.net/lsb-base I was able to say the problem did start after the killproc() code in lsb_3.1-8.tar.gz/init-functions started to change towards buggy. This most recent failure has ocurred with the apparently fixed -24 build of lsb-base installed; which would indicate that either lsb isn't really fixed, or the problem lies elsewhere. How should we go about debugging this... more detailed info about the exim4 process at upgrade time? strace start-stop-daemon in the exim4 maintainer script? something else? - Bruce -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]