Hello, I am running 24 smokeping_prober instances in different parts of my datacenter, each located behind a different firewall. All 24 probers ping one specific target. The ping interval is 0.2s, so each prober sends 5 pings per second to the target, and 5 x 24 = 120 pings per second arrive at the target in total.
The target reboots every night, and the reboot takes around 30s. For some unknown reason, none of the 24 smokeping_prober instances has been able to reach this target since a reboot 14 days ago. They did not recover automatically. The only way to recover is to restart the smokeping_prober service.

On one of the smokeping_prober instances I ran tcpdump on the next-hop device and noticed that smokeping_prober is not sending any ICMP requests to this specific target. However, the same smokeping_prober instance pings other targets successfully, and the host it runs on (RHEL 8) answers ICMP requests from other smokeping_prober instances.

TL;DR:
- 1 target is pinged by 24 smokeping_prober 0.8.1 instances on RHEL 8
- The target reboots once a day and is down for around 30s
- Since 14 days ago, none of the 24 instances can reach this target. tcpdump confirms they do NOT send any ICMP requests to this specific target, and only to this target
- systemctl restart smokeping_prober.service recovers the instance, and the target can be reached again

Any ideas why this happens, and how to investigate why 1 target became unreachable for all 24 smokeping_prober instances at the same time?

PS: Of course, running "ping" on the RHEL 8 system where smokeping_prober is installed can reach the target. The issue is that smokeping_prober stopped sending pings to this target for some reason.

