Very strange: we just had one of our 3-node pdns clusters go down, with a 
different failure on each box. One had segfaulted, one had hung on a futex call 
(according to strace; we were rushing to get the cluster back, so I could not 
look further at the time), and on the third the process had simply vanished.

version: PowerDNS 2.9.21 (C) 2001-2006 PowerDNS.COM BV (Apr  9 2008, 10:37:43, 
gcc 4.2.3 (Ubuntu 4.2.3-2ubuntu7))

Relevant pdns.conf settings for the 'questions waiting' message:
max-queue-length=20000
queue-limit=1500
distributor-threads=10

1 - Has anyone seen a similar signal 11 issue, or does anyone have suggestions 
for what to look into immediately if this happens? I apologize for the lack of 
detail; we were in a crunch to get things back up.

Jul 31 07:30:00 pdns12 pdns[8315]: 20018 questions waiting for database attention. Limit is 20000, respawning
Jul 31 07:30:00 pdns12 pdns[15887]: Our pdns instance exited with code 1
Jul 31 07:30:00 pdns12 pdns[15887]: Respawning
Jul 31 07:30:01 pdns12 pdns[15887]: Got a signal 11, attempting to print trace:
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server [0x479610]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6 [0x7f36b737f100]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6(fgets+0x44) [0x7f36b73b0514]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server [0x478a7b]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server(_ZN11DynListener11theListenerEv+0x501) [0x48a621]
Jul 31 07:30:01 pdns12 pdns[15887]: /usr/sbin/pdns_server(_ZN11DynListener17theListenerHelperEPv+0x9) [0x48b669]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libpthread.so.0 [0x7f36b76b53f7]
Jul 31 07:30:01 pdns12 pdns[15887]: /lib/libc.so.6(clone+0x6d) [0x7f36b7424b2d]
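To spot this sequence sooner next time, the log excerpt above could be scanned 
with something like the following. This is only a minimal sketch: it assumes 
each syslog message arrives on a single unwrapped line in the format shown 
above, and the function name scan_pdns_log is hypothetical, not part of 
PowerDNS.

```python
import re

# Matches the queue-overflow message on one unwrapped syslog line
# (format assumed from the excerpt above).
OVERFLOW_RE = re.compile(
    r"pdns\[(?P<pid>\d+)\]: (?P<waiting>\d+) questions waiting for database "
    r"attention\. Limit is (?P<limit>\d+), respawning"
)
# Matches the crash message that precedes the printed stack trace.
SIGNAL_RE = re.compile(r"pdns\[(?P<pid>\d+)\]: Got a signal (?P<signal>\d+)")


def scan_pdns_log(lines):
    """Return (overflows, signals) found in an iterable of syslog lines.

    overflows is a list of (pid, waiting, limit) tuples;
    signals is a list of (pid, signal_number) tuples.
    """
    overflows, signals = [], []
    for line in lines:
        m = OVERFLOW_RE.search(line)
        if m:
            overflows.append(
                (int(m.group("pid")), int(m.group("waiting")), int(m.group("limit")))
            )
            continue
        m = SIGNAL_RE.search(line)
        if m:
            signals.append((int(m.group("pid")), int(m.group("signal"))))
    return overflows, signals
```

Feeding it the lines from a syslog file (for example via open(path)) would 
give one list of queue overflows and one list of crash signals, which could 
then be sent to whatever alerting is already in place.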

2 - Should we increase the number of distributor threads?

* The load/memory usage remained fairly constant on the servers in question 
(per sar)

* mysql -e "show status like 'Max%'";
+----------------------+-------+
| Variable_name        | Value |
+----------------------+-------+
| Max_used_connections | 16    | 
+----------------------+-------+

which is quite low

* qsize-q still peaked above 20k

Everything I've read indicates that only a relatively low number of distributor 
threads is necessary, but in this case I suspect that more may help?
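In the meantime, one way to watch the queue before it reaches the limit would 
be to sample qsize-q periodically. The sketch below assumes that 
"pdns_control show qsize-q" prints just the current value; the helper names 
and the threshold used in the usage note are hypothetical.

```python
import subprocess
import time


def read_qsize(run=subprocess.check_output):
    """Return the current qsize-q value.

    Assumes `pdns_control show qsize-q` prints a single integer.
    """
    out = run(["pdns_control", "show", "qsize-q"])
    return int(out.decode().strip())


def watch_qsize(threshold, samples, interval=1.0, read=read_qsize, sleep=time.sleep):
    """Take `samples` readings and return those above `threshold`."""
    high = []
    for i in range(samples):
        value = read()
        if value > threshold:
            high.append(value)
        # Pause between readings, but not after the last one.
        if i + 1 < samples:
            sleep(interval)
    return high
```

For example, watch_qsize(15000, 60) would take one reading per second for a 
minute and return any that exceed 15000, an arbitrary warning level chosen 
here to sit below our max-queue-length of 20000.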

Thanks in advance for any insight you guys may have.

Daniel
_______________________________________________
Pdns-users mailing list
Pdns-users@mailman.powerdns.com
http://mailman.powerdns.com/mailman/listinfo/pdns-users
