Package: iputils-ping
Version: 3:20221126-1
Severity: normal

Hi,

After upgrading our monitoring host from bullseye to bookworm, our check_ping plugin suddenly reports that hosts which have been down for months are up again for a few minutes.

I've looked into this, and it seems we are running so many ping checks that the randomly selected id of a ping to host A sometimes matches the id of a ping to host B. Since all ICMP responses are delivered to both ping processes on the system, each process then also sees the replies meant for the other.

I've compared the ping code in bullseye and bookworm, and the bookworm code does notice these wrong packets: it shows the response but marks it as "DIFFERENT ADDRESS!". This is correct, but these packets are then counted in the number of responses received, which makes it seem that a host is up when it isn't:

root@iridium:~# ping 10.89.22.23 -v
ping: sock4.fd: 3 (socktype: SOCK_RAW), sock6.fd: 4 (socktype: SOCK_RAW), hints.ai_family: AF_UNSPEC

ai->ai_family: AF_INET, ai->ai_canonname: '10.89.22.23'
PING 10.89.22.23 (10.89.22.23) 56(84) bytes of data.
64 bytes from 10.89.22.179: icmp_seq=1 ident=47195 ttl=252 time=1.04 ms (DIFFERENT ADDRESS!)
64 bytes from 10.89.22.179: icmp_seq=2 ident=47195 ttl=252 time=5.79 ms (DIFFERENT ADDRESS!)
64 bytes from 10.89.22.179: icmp_seq=3 ident=47195 ttl=252 time=69.8 ms (DIFFERENT ADDRESS!)
64 bytes from 10.89.22.179: icmp_seq=4 ident=47195 ttl=252 time=0.988 ms (DIFFERENT ADDRESS!)
64 bytes from 10.89.22.179: icmp_seq=5 ident=47195 ttl=252 time=0.975 ms (DIFFERENT ADDRESS!)
^C
--- 10.89.22.23 ping statistics ---
1022 packets transmitted, 5 received, 99.5108% packet loss, time 1045246ms
rtt min/avg/max/mdev = 0.975/15.720/69.808/27.107 ms, pipe 994

Since 10.89.22.23 (the host we are pinging) is down, there are no responses from that system. But because the ident (47195 in the example above) matches that of a different ping to 10.89.22.179, the responses to that other ping are also parsed by this one.

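To make the expected behaviour concrete, here is a minimal sketch in Python of the check I would expect before a reply is counted. This is not the iputils code: `build_echo_reply` and `should_count_reply` are hypothetical names made up for illustration, and the sketch assumes a plain unicast ping (replies to a broadcast ping can legitimately come from other addresses, so a real fix would have to allow for that).

```python
import struct

ICMP_ECHO_REPLY = 0  # ICMP type of an echo reply


def build_echo_reply(ident, seq, payload=b""):
    """Build a bare ICMP echo reply header (checksum left at 0 for this sketch)."""
    return struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, 0, ident, seq) + payload


def should_count_reply(icmp_packet, source_addr, expected_ident, target_addr):
    """Return True only if the reply belongs to this ping and comes from its target.

    Hypothetical helper: a raw ICMP socket sees replies meant for other
    ping processes too, so the 16-bit ident alone is not sufficient.
    """
    if len(icmp_packet) < 8:
        return False  # too short to hold an ICMP echo header
    icmp_type, _code, _checksum, ident, _seq = struct.unpack(
        "!BBHHH", icmp_packet[:8]
    )
    if icmp_type != ICMP_ECHO_REPLY:
        return False
    if ident != expected_ident:
        return False  # reply belongs to another ping process
    # Same ident but a different sender: an ident collision, do not count it.
    return source_addr == target_addr


# The situation from the output above: we ping 10.89.22.23 with ident 47195,
# and a reply carrying the same ident arrives from 10.89.22.179.
reply = build_echo_reply(47195, 1)
counted_wrong_host = should_count_reply(reply, "10.89.22.179", 47195, "10.89.22.23")
counted_right_host = should_count_reply(reply, "10.89.22.23", 47195, "10.89.22.23")
```

With a check like this, the five replies from 10.89.22.179 would still be available for display with the "DIFFERENT ADDRESS!" note, but would not be added to the received count.
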
Ping correctly shows that the responses came from a different address, but it still counts the 5 packets as 5 valid received packets, which makes the packet loss 99.5% instead of 100%. Those invalid packets should not count as valid responses, as they are not from the correct host.

I've compared the source code of ping between the bullseye and bookworm versions, and the bullseye version discarded the packets when this occurred. This change results in hosts that are actually down being reported as up in our monitoring system.

Regards,

Rik

-- System Information:
Debian Release: 12.0
  APT prefers stable-security
  APT policy: (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-9-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages iputils-ping depends on:
ii  libc6        2.36-9
ii  libcap2      1:2.66-4
ii  libcap2-bin  1:2.66-4
ii  libidn2-0    2.3.3-1+b1

iputils-ping recommends no packages.

iputils-ping suggests no packages.

-- no debconf information