Hello all,
I have a rule which is trying to count time series that match a certain
regexp and spot when this changes, to raise an alert more or less
immediately (i.e. no for clause). This is counting a custom socket count
metric that we need to catch any changes in.
- alert: outboundSocketCountChange
expr: (count({__name__=~"tcpsocket(.+)Inbound"} offset 30s) -
count({__name__=~"tcpsocket(.+)Inbound"})) != bool 0
labels:
severity: critical
annotations:
summary: OB socket count has changed
It triggers fine when the value changes but it appears to then be stuck in
firing, rather than resolving when the next evaluation window completes.
Graphing the promQL shows exactly what I would expect - a single spike to 1
when the value changes and then back to zero. I would expect the alert to
clear when it hits that zero.
Scrape and evaluation intervals are both set to 15s. Prom v2.45.
Am I missing something here?
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion visit
https://groups.google.com/d/msgid/prometheus-users/0902f0b4-aac2-40d9-bd0a-1b1667501012n%40googlegroups.com.