On 2018-12-21 17:32:30 +0300, Sergey B Kirpichev wrote:
> Does it send you alert if you
> remove "if changed link capacity then alert" line too, correct?

Yes, I still get the same messages.

> If so, I suspect that the currect monit's behaviour is correct and I suggest
> you just adding "if failed link then unmonitor"-like line.

No, unmonitoring the service is not OK, since I still want alerts
when the Ethernet cable is plugged back in.

In any case, this cannot be correct, since while the Ethernet cable
is *not* plugged in, I get messages with:

------------------------------------------------------------
Subject: monit alert -- Link up eth0

Link up Service eth0

        Date:        Sat, 22 Dec 2018 02:40:38
        Action:      alert
        Host:        zira
        Description: link data collection succeeded
------------------------------------------------------------

The "Link up eth0" cannot be correct since the cable is not plugged in.
And just after that (at the same second), monit sends a

------------------------------------------------------------
Subject: monit alert -- Link down eth0

Link down Service eth0

        Date:        Sat, 22 Dec 2018 02:40:38
        Action:      alert
        Host:        zira
        Description: link down
------------------------------------------------------------

(I suppose that this is a consequence of the "Link up eth0", which
yields a loop).

So, it seems to be a bug in the monit code that makes it think that
the link is up while it is not. In validate.c, I see:

[...]
    for (LinkStatus_T link = s->linkstatuslist; link; link = link->next) {
            Event_post(s, Event_Link, State_Succeeded, link->action, "link data collection succeeded");
    }

    // State
    if (! Link_getState(s->inf.net->stats)) {
            for (LinkStatus_T link = s->linkstatuslist; link; link = link->next)
                    Event_post(s, Event_Link, State_Failed, link->action, "link down");
            return State_Failed; // Terminate test if the link is down
    } else {
            for (LinkStatus_T link = s->linkstatuslist; link; link = link->next)
                    Event_post(s, Event_Link, State_Succeeded, link->action, "link up");
    }
[...]

This seems suspicious. The "link data collection succeeded" event is
posted before the state is checked, and it is posted as Event_Link
with State_Succeeded, which is why the alert says "Link up eth0".
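
As a rough illustration of the suspected sequence, the following stand-alone C sketch models an event tracker that alerts only when the posted state differs from the previous one. It is not monit's code: `State`, `EventTracker`, `post_event` and `simulate_unplugged` are hypothetical names, and the "alert on every state change" rule is an assumption about how events of one type are tracked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for monit's states; not the real definitions. */
typedef enum { STATE_SUCCEEDED, STATE_FAILED } State;

/* Hypothetical tracker: last state posted for one event type,
 * plus the number of alerts (state changes) seen so far. */
typedef struct {
    State last;
    int alerts;
} EventTracker;

/* Assumed model of posting an event: alert only on a state change. */
static void post_event(EventTracker *t, State state, const char *message) {
    if (state != t->last) {
        t->alerts++;
        printf("alert: %s (%s)\n",
               state == STATE_SUCCEEDED ? "Link up" : "Link down", message);
        t->last = state;
    }
}

/* Run `cycles` checks while the cable stays unplugged and return the
 * number of alerts. When post_collection_success is true, each cycle
 * first posts a success for the data collection under the same event
 * type, as in the quoted excerpt, and then posts the "link down" failure. */
static int simulate_unplugged(int cycles, bool post_collection_success) {
    assert(cycles >= 0);
    EventTracker t = { STATE_SUCCEEDED, 0 };
    for (int i = 0; i < cycles; i++) {
        if (post_collection_success)
            post_event(&t, STATE_SUCCEEDED, "link data collection succeeded");
        post_event(&t, STATE_FAILED, "link down");
    }
    return t.alerts;
}
```

Under these assumptions, `simulate_unplugged(3, true)` counts five alerts (one "Link down" in the first cycle, then a "Link up" / "Link down" pair in each later cycle), whereas `simulate_unplugged(3, false)` counts a single "Link down".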

So, if I understand correctly, this yields a State_Succeeded /
State_Failed loop. That's one point.

I don't understand the code after "// State" either (which may be
another bug): because the "for" loops are under the "if" and the
"else", it seems to consider that *all* links are down or *all*
links are up. (In my case, I monitor only one link, so that's not
an issue, but it may be one in more complex settings.)

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)