Package: bind9
Version: 1:9.7.3.dfsg-1~squeeze2
Severity: normal
Tags: patch


Sometimes when I run /etc/init.d/bind9 stop or restart, the init script loops
forever, printing "waiting for pid ???? to die".

Looking at the system and the logs, I can see that named is still running, and
the logs show this:

Jun 16 10:55:04 ns1 named[1203]: error (unexpected RCODE REFUSED) resolving 'smartadvertising.pt/MX/IN': 188.93.231.1#53
Jun 16 10:55:05 ns1 named[1203]: lame server resolving 'fonsecas.pt' (in 'fonsecas.pt'?): 62.193.206.146#53
Jun 16 10:55:05 ns1 named[1203]: error (network unreachable) resolving 'fonsecas.pt/MX/IN': 2a02:2b8:1:406::724:142#53
Jun 16 10:55:05 ns1 named[1203]: error (network unreachable) resolving 'fonsecas.pt/MX/IN': 2a02:2b8:1:406::724:136#53
Jun 16 10:55:05 ns1 named[1203]: received control channel command 'stop -p'
Jun 16 10:55:05 ns1 named[1203]: shutting down: flushing changes
Jun 16 10:55:05 ns1 named[1203]: stopping command channel on 127.0.0.1#953
Jun 16 10:55:05 ns1 named[1203]: stopping command channel on ::1#953
Jun 16 10:55:05 ns1 named[1203]: no longer listening on ::#53
Jun 16 10:55:05 ns1 named[1203]: no longer listening on 127.0.0.1#53
Jun 16 10:55:05 ns1 named[1203]: no longer listening on 192.168.1.235#53
Jun 16 10:55:05 ns1 named[1203]: lame server resolving 'maxiprojecto.pt' (in 'maxiprojecto.pt'?): 109.71.40.37#53

So named is asked to stop and closes its listening sockets, but the process
never actually exits. Running strace -p $PID shows:

Process 1203 attached - interrupt to quit
futex(0xb6c91bd8, FUTEX_WAIT, 1204, NULL^C <unfinished ...>
Process 1203 detached

My config is nothing unusual, and I have a cloned machine acting as a
secondary/slave DNS that doesn't show this problem (maybe it only applies to a
master DNS?).

This bug might look unimportant, but it breaks any script that tries to
reload the bind9 config and causes a total loss of service.

As a workaround, I suggest that the init script wait at most MAXWAIT seconds
and, if named has not exited by then, kill it with kill -9. This allows the
system to recover...

The patch is very simple: define MAXWAIT and count how many times (one second
each) the init script has waited for the daemon to die:

--- bind9       2009-08-17 14:55:27.000000000 +0100
+++ bind9.new   2011-06-16 11:41:11.000000000 +0100
@@ -19,6 +19,7 @@
 # Don't modify this line, change or create /etc/default/bind9.
 OPTIONS=""
 RESOLVCONF=no
+MAXWAIT=60
 
 test -f /etc/default/bind9 && . /etc/default/bind9
 
@@ -89,9 +90,14 @@
                    --pidfile ${PIDFILE} -- $OPTIONS
        fi
        if [ -n $pid ]; then
+          i=0
          while kill -0 $pid 2>/dev/null; do
            log_progress_msg "waiting for pid $pid to die"
            sleep 1
+            i=$((i+1))
+            if [  $i -gt $MAXWAIT ] ; then
+                kill -9 $pid
+            fi
          done
        fi
        log_end_msg 0
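
For illustration only, the same bounded wait can be written as a small
standalone function. This is a rough sketch, not part of the bind9 init
script: the name wait_or_kill and its return codes are made up here, and it
assumes a plain POSIX /bin/sh (such as dash), where function variables are
global.

```shell
#!/bin/sh
# Hypothetical helper (not in the Debian bind9 package).
# Usage: wait_or_kill PID [MAXWAIT]
# Polls PID once per second. If it is still alive after MAXWAIT seconds
# (default 60), sends SIGKILL.
# Returns 0 if the process went away by itself, 1 if SIGKILL was sent.
wait_or_kill() {
    wk_pid=$1
    wk_maxwait=${2:-60}
    wk_i=0
    # kill -0 sends no signal; it only tests whether the pid still exists.
    while kill -0 "$wk_pid" 2>/dev/null; do
        if [ "$wk_i" -ge "$wk_maxwait" ]; then
            kill -9 "$wk_pid" 2>/dev/null
            return 1
        fi
        sleep 1
        wk_i=$((wk_i + 1))
    done
    return 0
}
```

In the init script this would be called as something like
wait_or_kill "$pid" "$MAXWAIT" in place of the while loop. Note that kill -9
gives named no chance to clean up, so it should stay a last resort after the
normal stop has been tried.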


-- System Information:
Debian Release: 6.0.1
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)

Kernel: Linux 2.6.32-5-686 (SMP w/1 CPU core)
Locale: LANG=pt_PT.UTF-8, LC_CTYPE=pt_PT.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages bind9 depends on:
ii  adduser          3.112+nmu2              add and remove users and groups
ii  bind9utils       1:9.7.3.dfsg-1~squeeze2 Utilities for BIND
ii  debconf [debconf 1.5.36.1                Debian configuration management sy
ii  libbind9-60      1:9.7.3.dfsg-1~squeeze2 BIND9 Shared Library used by BIND
ii  libc6            2.11.2-10               Embedded GNU C Library: Shared lib
ii  libcap2          1:2.19-3                support for getting/setting POSIX.
ii  libdb4.8         4.8.30-2                Berkeley v4.8 Database Libraries [
ii  libdns69         1:9.7.3.dfsg-1~squeeze2 DNS Shared Library used by BIND
ii  libgssapi-krb5-2 1.8.3+dfsg-4            MIT Kerberos runtime libraries - k
ii  libisc62         1:9.7.3.dfsg-1~squeeze2 ISC Shared Library used by BIND
ii  libisccc60       1:9.7.3.dfsg-1~squeeze2 Command Channel Library used by BI
ii  libisccfg62      1:9.7.3.dfsg-1~squeeze2 Config File Handling Library used 
ii  libldap-2.4-2    2.4.23-7                OpenLDAP libraries
ii  liblwres60       1:9.7.3.dfsg-1~squeeze2 Lightweight Resolver Library used 
ii  libssl0.9.8      0.9.8o-4squeeze1        SSL shared libraries
ii  libxml2          2.7.8.dfsg-2+squeeze1   GNOME XML library
ii  lsb-base         3.2-23.2squeeze1        Linux Standard Base 3.2 init scrip
ii  net-tools        1.60-23                 The NET-3 networking toolkit
ii  netbase          4.45                    Basic TCP/IP networking system

bind9 recommends no packages.

Versions of packages bind9 suggests:
ii  bind9-doc        1:9.7.3.dfsg-1~squeeze2 Documentation for BIND
ii  dnsutils         1:9.7.3.dfsg-1~squeeze2 Clients provided with BIND
pn  resolvconf       <none>                  (no description available)
pn  ufw              <none>                  (no description available)

-- debconf information excluded