I've recently started monitoring a large fleet of hardware devices using a combination of the blackbox, snmp, node, and json exporters. I started out using the *up* metric, but I noticed that with blackbox ping probes, *up* is *always* 1, even when the device is offline. So I plan to switch to *probe_success* instead. However, I'm unsure about the implications of this when blackbox is mixed with other exporters. For example, json-exporter does not offer a *probe_success* metric; instead it returns *up*=0 when the target times out.
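To make the mismatch concrete, here is a rough Python sketch of the rule I have in mind: use *probe_success* when a target exposes it, and fall back to *up* otherwise. The function name `target_healthy` and the sample dictionaries are made up for illustration; this is not how any exporter or Prometheus itself decides health, just the logic I'd like my dashboard and alerts to follow.

```python
from typing import Mapping, Optional


def target_healthy(samples: Mapping[str, float]) -> Optional[bool]:
    """Return True/False for a target's health, or None if no signal exists.

    `samples` maps metric names to their latest value for one target.
    This helper is hypothetical and only illustrates the fallback rule.
    """
    # Prefer probe_success when present (blackbox exposes it).
    if "probe_success" in samples:
        return samples["probe_success"] == 1
    # Otherwise fall back to up (what I see from json-exporter).
    if "up" in samples:
        return samples["up"] == 1
    # Neither metric is available for this target.
    return None


# Hypothetical samples matching the behaviour described above.
blackbox_offline = {"up": 1, "probe_success": 0}  # device down, scrape still OK
json_timeout = {"up": 0}                          # target timed out
node_ok = {"up": 1}                               # plain exporter, reachable

for name, samples in [
    ("blackbox_offline", blackbox_offline),
    ("json_timeout", json_timeout),
    ("node_ok", node_ok),
]:
    print(name, target_healthy(samples))
```

Looking only at *up* would report the offline blackbox target as healthy, which is exactly the problem I ran into.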
My goal is to build a Grafana dashboard and alerts that monitor a combination of blackbox and other exporters. For context, when certain devices crash, they remain pingable but report their failed state via their REST API, so I'm pointing json-exporter at an HTTP endpoint on the target. I'm struggling to come up with a unified way of monitoring all these different device types.

