On Fri, 7 Jun 2013 22:11:30 +0200, Kurt Roeckx <k...@roeckx.be> wrote:
> But you started a new one, which wrote a PID file, and then it
> died because it detected that an other ntpd was still running,
> and you really [only] want 1 running. It probably shouldn't have
> written the pid file in that case.

I now have an instance of the problem occurring naturally on a squeeze
system (so the trigger mechanism isn't Ubuntu-only, one can't blame it
on upstart in this case), and I can confirm that it is associated with
attempts by the system to start two ntpd processes concurrently.
Arranging for the instance that loses the race not to have its PID
written to the file should be very helpful, I think.

Here are some relevant logs about the incident, lightly sanitised:

Jun 7 08:17:18 <HOST> dhclient: DHCPACK from <SERVERIP>
Jun 7 08:17:18 <HOST> ntpd[1576]: ntpd exiting on signal 15
Jun 7 08:17:20 <HOST> ntpd[1904]: ntpd 4.2.6p2@1.2194-o Sun Oct 17 13:35:13 UTC 2010 (1)
Jun 7 08:17:20 <HOST> ntpd[1905]: ntpd 4.2.6p2@1.2194-o Sun Oct 17 13:35:13 UTC 2010 (1)
Jun 7 08:17:20 <HOST> ntpd[1906]: proto: precision = 1.623 usec
Jun 7 08:17:20 <HOST> ntpd[1906]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
Jun 7 08:17:20 <HOST> ntpd[1907]: proto: precision = 1.603 usec
Jun 7 08:17:20 <HOST> ntpd[1907]: unable to bind to wildcard address 0.0.0.0 - another process may be running - EXITING
Jun 7 08:17:20 <HOST> ntpd[1906]: Listen and drop on 1 v6wildcard :: UDP 123
Jun 7 08:17:20 <HOST> ntpd[1906]: Listen normally on 2 lo 127.0.0.1 UDP 123
Jun 7 08:17:20 <HOST> ntpd[1906]: Listen normally on 3 eth0 <HOSTIP> UDP 123
Jun 7 08:17:20 <HOST> ntpd[1906]: Listen normally on 4 lo ::1 UDP 123
Jun 7 08:17:20 <HOST> dhclient: bound to <HOSTIP> -- renewal in 40158 seconds.

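To illustrate the ordering suggested above (take the port first, record the PID only afterwards), here is a minimal Python sketch. It is not ntpd's code: the function name `start_daemon`, its `pid` parameter, the temporary PID file path and the use of an ephemeral loopback port instead of UDP 123 are all assumptions made so that the example runs unprivileged.

```python
import os
import socket
import tempfile


def start_daemon(port, pidfile, pid=None, host="127.0.0.1"):
    """Bind the UDP port first; write the PID file only if the bind worked.

    Hypothetical helper for illustration. Returns the bound socket, or
    None if another process (or socket) already holds the port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Lost the race: leave the existing PID file untouched.
        sock.close()
        return None
    with open(pidfile, "w") as f:
        f.write(f"{os.getpid() if pid is None else pid}\n")
    return sock


# Demonstration with made-up PIDs mirroring the log above.
pidfile = os.path.join(tempfile.mkdtemp(), "ntpd.pid")
winner = start_daemon(0, pidfile, pid=1906)   # port 0: kernel picks a free port
port = winner.getsockname()[1]
loser = start_daemon(port, pidfile, pid=1907)  # same port, so the bind fails

with open(pidfile) as f:
    recorded = f.read().strip()
print("loser started:", loser is not None, "- PID file holds:", recorded)
winner.close()
```

With this ordering the second instance exits without having overwritten the PID of the instance that is actually running.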
Process accounting records show the following:

ntpd |v3| 0.00| 0.00| 1216.00| 105| 107| 38344.00| 0.00| 1576 1|Fri Jun 7 08:17:06 2013
ntpd |v3| 0.00| 0.00| 0.00| 0| 0| 25720.00| 0.00| 1904 1894|Fri Jun 7 08:17:20 2013
ntp |v3| 0.00| 0.00| 2.00| 0| 0| 3956.00| 0.00| 1894 1867|Fri Jun 7 08:17:20 2013
ntp |v3| 0.00| 0.00| 205.00| 0| 0| 3956.00| 0.00| 1867 1846|Fri Jun 7 08:17:18 2013
ntpd |v3| 0.00| 1.00| 0.00| 0| 0| 25720.00| 0.00| 1905 1895|Fri Jun 7 08:17:20 2013
ntp |v3| 0.00| 0.00| 209.00| 0| 0| 3956.00| 0.00| 1846 1814|Fri Jun 7 08:17:18 2013
invoke-rc.d |v3| 0.00| 0.00| 213.00| 0| 0| 3956.00| 0.00| 1814 1783|Fri Jun 7 08:17:18 2013
ntp |v3| 0.00| 0.00| 2.00| 0| 0| 3956.00| 0.00| 1895 1871|Fri Jun 7 08:17:20 2013
ntp |v3| 0.00| 1.00| 204.00| 0| 0| 3956.00| 0.00| 1871 1847|Fri Jun 7 08:17:18 2013
ntp |v3| 0.00| 0.00| 208.00| 0| 0| 3956.00| 0.00| 1847 1815|Fri Jun 7 08:17:18 2013
invoke-rc.d |v3| 0.00| 0.00| 212.00| 0| 0| 3956.00| 0.00| 1815 1782|Fri Jun 7 08:17:18 2013
ntpd |v3| 0.00| 0.00| 0.00| 0| 0| 25720.00| 0.00| 1907 1|Fri Jun 7 08:17:20 2013
dhclient-script |v3| 1.00| 1.00| 223.00| 0| 0| 17616.00| 0.00| 1783 1777|Fri Jun 7 08:17:18 2013
dhclient-script |v3| 0.00| 3.00| 223.00| 0| 0| 17616.00| 0.00| 1782 1246|Fri Jun 7 08:17:18 2013
dhclient |v3| 0.00| 0.00| 878.00| 0| 0| 6756.00| 0.00| 1777 1776|Fri Jun 7 08:17:12 2013
dhclient |v3| 0.00| 0.00| 1666.00| 0| 0| 6756.00| 0.00| 1246 1245|Fri Jun 7 08:17:04 2013
sh |v3| 0.00| 0.00| 879.00| 0| 0| 3956.00| 0.00| 1776 1770|Fri Jun 7 08:17:12 2013
sh |v3| 0.00| 0.00| 1667.00| 0| 0| 3956.00| 0.00| 1245 658|Fri Jun 7 08:17:04 2013
ifup |v3| 0.00| 0.00| 2410.00| 0| 0| 3872.00| 0.00| 658 1|Fri Jun 7 08:16:56 2013
ifup |v3| 0.00| 0.00| 900.00| 0| 0| 3872.00| 0.00| 1770 1655|Fri Jun 7 08:17:11 2013
network-bridge |v3| 1.00| 1.00| 1379.00| 0| 0| 9232.00| 0.00| 1655 1445|Fri Jun 7 08:17:07 2013

(Yes, this happens to be a Xen dom0. The other host I saw this on was
just a vanilla kernel with no hypervisor in sight.
There may be more than one way of getting two ntpd instances started at
the same time. Process 1906 is still running so its accounting record
hasn't been cut yet.)
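For anyone who wants to follow the parentage in the accounting records without doing it by eye, the short Python sketch below walks each short-lived ntpd back through its ancestors. It assumes that the two numbers just before the timestamp in each record are the PID and the parent PID; the tuples are copied from the records above, and the helper name `ancestry` is made up for this example.

```python
# (pid, ppid, command) copied from the accounting records above, on the
# assumption that the two fields before the timestamp are PID and PPID.
RECORDS = [
    (1576, 1, "ntpd"), (1904, 1894, "ntpd"), (1894, 1867, "ntp"),
    (1867, 1846, "ntp"), (1905, 1895, "ntpd"), (1846, 1814, "ntp"),
    (1814, 1783, "invoke-rc.d"), (1895, 1871, "ntp"), (1871, 1847, "ntp"),
    (1847, 1815, "ntp"), (1815, 1782, "invoke-rc.d"), (1907, 1, "ntpd"),
    (1783, 1777, "dhclient-script"), (1782, 1246, "dhclient-script"),
    (1777, 1776, "dhclient"), (1246, 1245, "dhclient"), (1776, 1770, "sh"),
    (1245, 658, "sh"), (658, 1, "ifup"), (1770, 1655, "ifup"),
    (1655, 1445, "network-bridge"),
]


def ancestry(pid, records):
    """Return 'command[pid]' strings from pid up to the oldest recorded ancestor."""
    by_pid = {p: (pp, cmd) for p, pp, cmd in records}
    chain = []
    while pid in by_pid:
        ppid, cmd = by_pid[pid]
        chain.append(f"{cmd}[{pid}]")
        pid = ppid  # stops once the parent has no accounting record
    return chain


for leaf in (1904, 1905):
    print(" <- ".join(ancestry(leaf, RECORDS)))
```

Read this way, the two ntpd starts appear to descend from two separate ifup invocations (PIDs 1770 and 658), each with its own dhclient and dhclient-script, which would be consistent with the concurrent start seen in the logs.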