Hey.
On Sat, 2024-07-20 at 10:26 -0700, 'Brian Candler' via Prometheus Users
wrote:
>
> If the label stays constant, then the amount of extra space required
> is tiny. There is an internal mapping between a bag of labels and a
> timeseries ID.
Is it the same if one uses a metric (like for the RPMs from below) whose
value never changes? I mean, is that equally efficient?
> But if any label changes, that generates a completely new timeseries.
> This is not something you want to happen too often (a.k.a "timeseries
> churn"), but moderate amounts are OK.
Why exactly wouldn't one want this? I mean, especially with respect to
such _info metrics.
Graphing _info time series doesn't make sense anyway... so it's not as
if one would get some usable time series/graph (like a temperature)
interrupted if, e.g., the state changes for a while from OK to
degraded.
It's of course clear why one doesn't want that for "normal" time
series.
> For example, if a drive changes from "OK" to "degraded" then that
> would be reasonable, but putting the drive temperature in a label
> would not.
The latter is anyway clear. :-)
> Some people prefer to enumerate a separate timeseries per state,
> which can make certain types of query easier since you don't have to
> worry about staleness or timeseries appearing and disappearing. e.g.
>
> foo{state="OK"} 1
> foo{state="degraded"} 0
> foo{state="absent"} 0
I guess with appearing/disappearing you mean that one has to take into
account that e.g. pd_info{state="OK",pd_name="foo"} won't exist while
"foo" is failed... and thus, e.g. when graphing the OK-times of a
device, it would by default show nothing during that time rather than a
value of zero?
But other than that... are there any bigger consequences?
I mean, I wonder with which kinds of queries this would cause trouble,
because for simply counting time series with some specific value (like
state="OK") there should still always be exactly one series per (e.g.)
PD (respectively per combination of the "primary key" labels).
What are the differences with respect to staleness?
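If I understand it correctly, the difference would show up e.g. in
something like this (metric names made up):

```promql
# Enumerated states: foo{state="OK"} is always present (0 or 1), so this
# directly gives the fraction of time the drive was OK:
avg_over_time(foo{state="OK"}[1d])

# With a single appearing/disappearing series, the series goes stale a few
# minutes after samples stop, and avg_over_time() silently averages only
# over the samples that do exist - the failed periods just drop out.
```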
> It's much easier to alert on foo{state="OK"} == 0, than on the
> absence of timeseries foo{state="OK"}.
Why? Can't I just do something like
count( foo{state!="OK"} ) > 0
?
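(Though I guess I'd also have to filter on the value, since in the
enumerated model the selector alone matches the 0-valued series as
well?) I.e. something like:

```promql
# count() counts matching series regardless of value, so presumably the
# counting approach needs a value filter:
count( foo{state!="OK"} == 1 ) > 0

# vs. the suggested per-PD alert, which needs no counting at all:
foo{state="OK"} == 0
```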
> However, as you observe, you need to know in advance what all the
> possible states are.
>
> The other option, if the state values are integer enumerations at
> source (e.g. as from SNMP), is to store the raw numeric value:
>
> foo 3
>
> That means the querier has to know the meaning of these values.
> (Grafana can map specific values to textual labels and/or colours
> though).
But that would also require me to use a label, like in
enum_metric{value="3"}, or to construct metric names dynamically (which
I could also have done for the symbolic name), which however seems
discouraged (and I'd say for good reasons)?
In my case there's no numeric value anyway ;-)
> > 2) Metrics like:
> > - smartraid_physical_drive_size_bytes
> > - smartraid_physical_drive_rotational_speed_rpm
> > can in principle not change (again: unless the PD is replaced with
> > another one of the same name).
> >
> > So should they rather be labels, despite being numbers?
> >
> > OTOH, labels like:
> > - smartraid_logical_drive_size_bytes
> > - smartraid_logical_drive_chunk_size_bytes
> > - smartraid_logical_drive_data_stripe_size_bytes
> > *can* in principle change (namely if the RAID is converted).
>
> IMO those are all fine as labels. Creation or modification of a
> logical volume is a rare thing to do, and arguably changing such
> fundamental parameters is making a new logical volume.
I'm still unsure what to do ;-)
I mean, if both label and metric are equally efficient (in terms of
storage)... then using a metric would still have the advantage of
allowing things like:
smartraid_logical_drive_chunk_size_bytes > (256*1024)
i.e. selecting those LDs that use a chunk size > 256 KiB... which I
cannot (as easily) do if it's in a label.
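For comparison, doing the same on a label would need something like
(hypothetical _info metric and label name):

```promql
# As a metric: a plain numeric comparison:
smartraid_logical_drive_chunk_size_bytes > (256 * 1024)

# As a label: regexp "magic" that only approximates "> 262144" by
# matching values with 7 or more digits:
smartraid_logical_drive_info{chunk_size_bytes=~"[0-9]{7,}"}
```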
> If you ever wanted to do *arithmetic* on those values - like divide
> the physical drive size by the sum of logical drive sizes - then
> you'd want them as metrics.
Ah... here you go ^^
> Also, filtering on labels can be awkward (e.g. "show me all drives
> with speed greater than 7200rpm" requires a bit of regexp magic,
> although "show me all drives with speed not 7200rpm" is easy).
>
> But I don't think those are common use cases. Rather it's just about
> collecting secondary identifying information and characteristics.
Well... yes and no.
What I would actually like is to be able to use my _info data for
selecting groups (not really sure whether it's possible the way I want
it): taking, for example, some metric from node_exporter about device
IO rates, I'd like to select all those devices that have e.g.
chunk size = 1 MiB in one graph, and those with say 256 KiB in another.
I had hoped one could do that with some "… and on (…) …" magic.
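I.e. something like this (label/metric names partly made up, and
assuming the _info metric shares a device label with node_exporter):

```promql
# _info metrics have the value 1, so multiplying acts as a filter: keep
# only the IO rates of devices whose LD uses a 1 MiB chunk size:
rate(node_disk_written_bytes_total[5m])
  * on(device) group_left()
    smartraid_logical_drive_info{chunk_size="1MiB"}
```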
>
> > I went now for the approach to have a dedicated metric for those
> > where there's a dedicated property in the RAID tool output, like:
> > - smartraid_controller_temperature_celsius
>
> Yes: something that's continuously variable (and likely to vary
> frequently), and/or that you might want to draw a graph of or alert
> on, is definitely its own metric value, not a label.
My point here was rather:
Should I have made e.g. only one metric
smartraid_temperature{type="bla"}
(or perhaps with a bit more than just "type"), with "bla" being e.g.
controller, capacitor, cache_module or sensor?
I.e. putting all temperatures into one metric, rather than the 4
different ones I have now (which have no "type" label).
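I.e. exposing something like (names made up):

```promql
# One metric, with the sensor location as a label:
smartraid_temperature_celsius{type="controller"}
smartraid_temperature_celsius{type="capacitor"}
smartraid_temperature_celsius{type="cache_module"}

# which would e.g. allow alerting on all of them at once:
smartraid_temperature_celsius > 70
```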
Thanks,
Chris.
--
You received this message because you are subscribed to the Google Groups
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/ec767e65f7c71de2c11cabcf5da686ee9bd428e9.camel%40gmail.com.