The query syntax would not change drastically (though I understand there 
is still a change). The main benefit would be a 75% reduction in the number 
of series these metrics generate (one series per check instead of four), 
which means less to store and compute.
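
To make the arithmetic concrete, here is a minimal Python sketch. The check 
count of 10,000 is a made-up figure, and series_count and reduction are 
hypothetical helper names, not anything in consul_exporter. The only input 
taken from the proposal is that each check currently produces one series per 
status.

```python
# The four statuses the exporter currently attaches to every health check.
STATUSES = ("maintenance", "warning", "critical", "passing")


def series_count(checks: int, per_status: bool) -> int:
    """Series emitted for `checks` health checks.

    per_status=True  -> current scheme: one series per (check, status) pair.
    per_status=False -> proposed scheme: one series per check.
    """
    return checks * len(STATUSES) if per_status else checks


def reduction(checks: int) -> float:
    """Fraction of series saved by moving to the proposed scheme."""
    current = series_count(checks, per_status=True)
    proposed = series_count(checks, per_status=False)
    return 1 - proposed / current


if __name__ == "__main__":
    checks = 10_000  # hypothetical environment size
    print("current series: ", series_count(checks, per_status=True))
    print("proposed series:", series_count(checks, per_status=False))
    print("reduction:      ", reduction(checks))
```

The ratio does not depend on the number of checks, since every check drops 
from four series to one.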

*> "what fraction of nodes is down"*

Current:
    count (consul_health_node_status{status!="passing"} == 1)
    /
    count (consul_health_node_status{status="passing"})

Proposed:
    count (consul_health_node_status != 1)
    /
    count (consul_health_node_status)

*> "which nodes have multiple services down?"*

Current:
    count by (node) (consul_health_service_status{status!="passing"} == 1) > 1

Proposed:
    count by (node) (consul_health_service_status != 1) > 1

*> What service checks are critical?*

Current:
    consul_health_service_status{status="critical"} == 1

Proposed:
    consul_health_service_status == 0
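
For anyone who wants to sanity-check the proposed form without a Prometheus 
server, below is a rough Python sketch. It is not exporter code: it models 
samples as plain dicts, and every function name and sample check in it is 
hypothetical. The value mapping is the one from the original proposal quoted 
below (-2=maintenance, -1=warning, 0=critical, 1=passing).

```python
from collections import Counter

# Value mapping from the proposal.
STATUS_VALUE = {"maintenance": -2, "warning": -1, "critical": 0, "passing": 1}


def expand(node, check, status):
    """Build the four current-style samples for one check (demo helper)."""
    return {(node, check, s): int(s == status) for s in STATUS_VALUE}


def to_proposed(current):
    """Collapse current-style samples {(node, check, status): 0|1}
    into proposed-style samples {(node, check): mapped value}."""
    proposed = {}
    for (node, check, status), flag in current.items():
        if flag == 1:
            proposed[(node, check)] = STATUS_VALUE[status]
    return proposed


def critical_checks(proposed):
    """Rough analogue of `consul_health_service_status == 0`."""
    return sorted(key for key, value in proposed.items() if value == 0)


def nodes_with_multiple_critical(proposed):
    """Rough analogue of `count by (node) (... == 0) > 1`."""
    per_node = Counter(
        node for (node, _check), value in proposed.items() if value == 0
    )
    return sorted(node for node, n in per_node.items() if n > 1)


if __name__ == "__main__":
    current = {}
    current.update(expand("node_a", "service:web", "critical"))
    current.update(expand("node_a", "service:db", "critical"))
    current.update(expand("node_b", "service:web", "passing"))
    current.update(expand("node_b", "service:db", "warning"))

    proposed = to_proposed(current)
    print(len(current), "series ->", len(proposed), "series")
    print("critical checks:", critical_checks(proposed))
    print("nodes with >1 critical:", nodes_with_multiple_critical(proposed))
```

Note that `== 0` selects critical checks only. Matching everything that is 
not passing (warning and maintenance included) would need `!= 1` instead.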


On Tuesday, August 17, 2021 at 5:01:27 AM UTC-4 [email protected] wrote:

> What would some common queries be that this affects, and how would they 
> look in the future? For example, "what fraction of nodes is down" "which 
> nodes have multiple services down?"
>
> /MR
>
> On Mon, Aug 16, 2021, 22:31 Matt Russi <[email protected]> wrote:
>
>> Currently, the consul_exporter exposes 4 series per health_node and 
>> health_service status check. Each with a label indicating the status 
>> (maintenance, warning, critical, or passing). In larger environments, this 
>> creates quite a few extra series. 
>>
>> As somewhat of a precedent, the status is already being mapped to a value 
>> for the consul_serf_lan_member_status metric (as Consul's API provides this 
>> mapping).
>> # HELP consul_serf_lan_member_status Status of member in the cluster. 
>> 1=Alive, 2=Leaving, 3=Left, 4=Failed.
>>
>> I wanted to get some thoughts around this before pursuing a PR.
>>
>> In my example, I used -2=maintenance, -1=warning, 0=critical, and 
>> 1=passing to fall in line with the Prometheus paradigm of up=0 (down) and 
>> up=1 (up). Since we have two additional values, the negative numbers play 
>> more nicely when trying to do a value mapping in Grafana. Not married to 
>> the values themselves though. :) 
>>
>> Present Example:
>> consul_health_node_status{check="serfHealth",node="example_node",status="critical"} 0
>> consul_health_node_status{check="serfHealth",node="example_node",status="maintenance"} 0
>> consul_health_node_status{check="serfHealth",node="example_node",status="passing"} 1
>> consul_health_node_status{check="serfHealth",node="example_node",status="warning"} 0
>>
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="critical"} 0
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="maintenance"} 0
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="passing"} 1
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service",status="warning"} 0
>>
>> Proposed Example:
>> # HELP consul_health_node_status Status of health checks associated with 
>> a node. -2=maintenance, -1=warning, 0=critical, 1=passing
>> consul_health_node_status{check="serfHealth",node="example_node"} 1
>>
>> # HELP consul_health_service_status Status of health checks associated 
>> with a service. -2=maintenance, -1=warning, 0=critical, 1=passing 
>> consul_health_service_status{check="service:10.0.0.1_443",node="example_node",service_id="10.0.0.1_443",service_name="auth_service"} 1
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Prometheus Developers" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/prometheus-developers/9bb6b446-728d-47d9-8a08-355dec88d572n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/prometheus-developers/9bb6b446-728d-47d9-8a08-355dec88d572n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

