> If one adds a label to a metric, which then stays mostly constant, does
> this add any considerable amount of space needed for storing it?
If the label stays constant, then the amount of extra space required is
tiny. There is an internal mapping between a bag of labels and a
timeseries ID, so the label strings are stored once per timeseries rather
than once per sample.
But if any label changes, that generates a completely new timeseries. This
is known as "timeseries churn". It's not something you want to happen too
often, but moderate amounts are OK. For example, if a drive changes from
"OK" to "degraded" then that would be reasonable, but putting the drive
temperature in a label would not.
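To illustrate the point, here is a toy model in Python. It is only a sketch of the idea, not how the Prometheus TSDB is actually implemented; `SeriesIndex`, `series_id` and the metric name `drive_status` are made up for the example. Each distinct label set is stored once and given an ID, so a constant label costs nothing extra per sample, while a label whose value keeps changing creates a new entry each time.

```python
class SeriesIndex:
    """Toy model: maps each distinct label set to a small integer ID."""

    def __init__(self):
        self._ids = {}

    def series_id(self, name, labels):
        # The metric name plus the full label set identifies the series.
        # Sorting makes the key independent of label order.
        key = (name, tuple(sorted(labels.items())))
        if key not in self._ids:
            self._ids[key] = len(self._ids)
        return self._ids[key]

    def __len__(self):
        return len(self._ids)


index = SeriesIndex()

# A constant extra label: every scrape hits the same series.
for _ in range(1000):
    index.series_id(
        "drive_status",
        {"drive": "1I:1:1", "serial_number": "000D", "state": "OK"},
    )
stable_series = len(index)

# The state label changes once: one additional series, which is fine.
index.series_id(
    "drive_status",
    {"drive": "1I:1:1", "serial_number": "000D", "state": "degraded"},
)
after_state_change = len(index)

# Temperature in a label: a new series for every distinct reading.
for temp in range(30, 60):
    index.series_id("drive_status", {"drive": "1I:1:1", "temperature": str(temp)})
after_temperature_labels = len(index)

print(stable_series, after_state_change, after_temperature_labels)
```

A thousand samples with the same labels still count as one series, the state change adds one more, and thirty different temperature readings add thirty.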
Some people prefer to enumerate a separate timeseries per state, which can
make certain types of query easier since you don't have to worry about
staleness or timeseries appearing and disappearing, e.g.:
foo{state="OK"} 1
foo{state="degraded"} 0
foo{state="absent"} 0
It's much easier to alert on foo{state="OK"} == 0, than on the absence of
timeseries foo{state="OK"}. However, as you observe, you need to know in
advance what all the possible states are.
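On the exporter side, that could look roughly like the sketch below. It is plain Python with no client library; `state_series` and `KNOWN_STATES` are names I made up, the list of states is an assumption, and label values are not escaped. It emits one line per known state and refuses a state it has never heard of, which is exactly the limitation mentioned above.

```python
def state_series(metric, labels, current_state, known_states):
    """Return one exposition line per known state: 1 for the current state, else 0."""
    if current_state not in known_states:
        # The weak point of this approach: unknown states cannot be represented.
        raise ValueError(f"unexpected state {current_state!r}")
    lines = []
    for state in known_states:
        all_labels = dict(labels, state=state)
        label_text = ",".join(f'{k}="{v}"' for k, v in sorted(all_labels.items()))
        value = 1 if state == current_state else 0
        lines.append(f"{metric}{{{label_text}}} {value}")
    return lines


KNOWN_STATES = ["OK", "degraded", "absent"]  # assumption: the full set of states

lines = state_series("foo", {}, "degraded", KNOWN_STATES)
print("\n".join(lines))
```

Because every state always has a series, an alert can compare a value with 0 instead of testing whether a series exists.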
The other option, if the state values are integer enumerations at source
(e.g. as from SNMP), is to store the raw numeric value:
foo 3
That means whoever writes the query has to know the meaning of these
values (although Grafana can map specific values to textual labels and/or
colours).
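For anything outside Grafana, the consumer needs its own copy of the mapping. A minimal Python sketch, where the `DriveState` codes are invented for illustration and do not come from any real MIB or tool:

```python
from enum import IntEnum


class DriveState(IntEnum):
    # Hypothetical codes; a real source (e.g. an SNMP MIB) defines its own.
    OK = 1
    DEGRADED = 2
    ABSENT = 3


def describe(value):
    """Translate a raw sample value into a state name, tolerating unknown codes."""
    code = int(value)
    try:
        return DriveState(code).name
    except ValueError:
        return f"unknown({code})"


print(describe(3.0))
print(describe(7))
```

Unlike the one-series-per-state approach, a code the source adds later still gets stored; it just shows up as unknown until the mapping is updated.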
> 2) Metrics like:
> - smartraid_physical_drive_size_bytes
> - smartraid_physical_drive_rotational_speed_rpm
> can in principle not change (again: unless the PD is replaced with
> another one of the same name).
>
> So should they rather be labels, despite being numbers?
>
> OTOH, labels like:
> - smartraid_logical_drive_size_bytes
> - smartraid_logical_drive_chunk_size_bytes
> - smartraid_logical_drive_data_stripe_size_bytes
> *can* in principle change (namely if the RAID is converted).
IMO those are all fine as labels. Creation or modification of a logical
volume is a rare thing to do, and arguably changing such fundamental
parameters is making a new logical volume.
If you ever wanted to do *arithmetic* on those values - like divide the
physical drive size by the sum of logical drive sizes - then you'd want
them as metrics. Also, filtering on labels can be awkward (e.g. "show me
all drives with speed greater than 7200rpm" requires a bit of regexp magic,
although "show me all drives with speed not 7200rpm" is easy).
But I don't think those are common use cases. Rather it's just about
collecting secondary identifying information and characteristics.
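To show what I mean by "regexp magic": with the speed kept as a metric you would simply write `smartraid_physical_drive_rotational_speed_rpm > 7200`, and with a label "not 7200" is just a `!="7200"` matcher. "Greater than 7200" on a label needs a pattern along these lines. The Python below is only a way to try the pattern out; it assumes the label holds a plain decimal integer, and `is_faster_than_7200` is a made-up helper.

```python
import re

# Matches decimal integers strictly greater than 7200 (no leading zeros).
FASTER_THAN_7200 = re.compile(
    r"72(0[1-9]|[1-9][0-9])"  # 7201-7299
    r"|7[3-9][0-9]{2}"        # 7300-7999
    r"|[89][0-9]{3}"          # 8000-9999
    r"|[1-9][0-9]{4,}"        # 10000 and above
)


def is_faster_than_7200(label_value):
    # Prometheus regex matchers are fully anchored, hence fullmatch().
    return FASTER_THAN_7200.fullmatch(label_value) is not None


speeds = ["5400", "7200", "7201", "10000", "15000"]
faster = [s for s in speeds if is_faster_than_7200(s)]
print(faster)
```

It works, but nobody wants to maintain that for every threshold, which is why values you intend to compare belong in metrics.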
> I went now for the approach to have a dedicated metric for those
> where there's a dedicated property in the RAID tool output, like:
> - smartraid_controller_temperature_celsius
Yes: something that's continuously variable (and likely to vary
frequently), and/or that you might want to draw a graph of or alert on, is
definitely its own metric value, not a label.
On Saturday 20 July 2024 at 16:00:51 UTC+1 Christoph Anton Mitterer wrote:
> Hey Ben and Chris.
>
> Thanks for your replies!
>
> On Fri, 2024-07-19 at 09:17 +0200, Ben Kochie wrote:
> > This is one of those tricky situations where there's not a strict
> > correct answer.
>
> Indeed.
>
>
> > For power-on-hours I would probably go with a gauge.
> > * You don't really have a "perfect" monotonic counter here.
>
> Why not?
> I mean there's what Chris said about some drives that may overflow the
> number - which in principle sounds unlikely though I must admit that
> especially with NVMe SMART data I have seen unreasonably low numbers,
> too (but in those cases, I've never seen high numbers for these drives,
> so it may also just be some other issue).
>
>
> > * I would also include the serial number label as well, just for
> > uniqueness identification sake.
>
> If one adds a label to a metric, which then stays mostly constant, does
> this add any considerable amount of space needed for storing it?
>
> But more on that below.
>
>
> > * Power-on-hours doesn't really have a lot of use as a counter. Do
> > actually want to display a counter like `rate(power_on_hours[1h])`?
>
> No, not particularly. It's just a number that should at least in theory
> only increase, and I wanted to do it right.
>
>
> Perhaps I should describe things a bit more, because actually I would
> have some more cases where it's not clear to me how to map perfectly
> into metrics.
>
> I had already asked at
> https://discuss.prometheus.io/t/how-to-design-metrics-labels/2337
> and one further thread there (which was however swallowed by the anti-
> spam, and would need some admin to approve it).
>
>
> My exporter parses the RAID CLI tools output, which results in a structure
> like this (as JSON):
> {
> "controllers": {
> "0": {
> "properties": {
> "slot": "0",
> "serial_number": "000A",
> "controller_status": "OK",
> "hardware_revision": "A",
> "firmware_version": "6.52",
> "rebuild_priority": "High",
> "cache_status": "OK",
> "battery_capacitor_status": "OK",
> "controller_temperature_celsius": 55.0,
> "model": "HPE Smart Array P816i-a SR Gen10"
> },
> "sensors": {
> "0": {
> "properties": {
> "location": "Inlet Ambient",
> "temperature_celsius": 43.0
> }
> },
> "1": {
> "properties": {
> "location": "ASIC",
> "temperature_celsius": 55.0
> }
> },
> "2": {
> "properties": {
> "location": "Top",
> "temperature_celsius": 41.0
> }
> }
> },
> "arrays": {
> "A": {
> "properties": {
> "unused_space_bytes": 0.0,
> "used_space_bytes": 960139939020.8,
> "status": "OK",
> "multidomain_status": "OK"
> },
> "logical_drives": {
> "1": {
> "properties": {
> "size_bytes": 480069969510.4,
> "raid_level": "1",
> "chunk_size_bytes": 262144,
> "data_stripe_size_bytes": 262144,
> "status": "OK",
> "unrecoverable_media_errors": "None",
> "multidomain_status": "OK",
> "caching": false,
> "device": "/dev/sda",
> "logical_drive_label": "system"
> }
> }
> },
> "physical_drives": {
> "3I:1:15": {
> "properties": {
> "port": "3I",
> "box": "1",
> "bay": "15",
> "status": "OK",
> "drive_role": "Data",
> "interface_type": "Solid State SATA",
> "size_bytes": 480000000000,
> "firmware_version": "HPG1",
> "serial_number": "000B",
> "wwn": "00001",
> "model": "ATA VK000480GXAWK",
> "temperature_celsius": 35.0,
> "usage_remaining_percent": 99.6,
> "power_on_hours": 25391.0,
> "life_remaining_based_on_workload_to_date_days": 263431.0,
> "shingled_magnetic_recording_support": "None"
> }
> },
> "3I:1:16": {
> "properties": {
> "port": "3I",
> "box": "1",
> "bay": "16",
> "status": "OK",
> "drive_role": "Data",
> "interface_type": "Solid State SATA",
> "size_bytes": 480000000000,
> "firmware_version": "HPG1",
> "serial_number": "000C",
> "wwn": "00002",
> "model": "ATA VK000480GXAWK",
> "temperature_celsius": 32.0,
> "usage_remaining_percent": 99.6,
> "power_on_hours": 25391.0,
> "life_remaining_based_on_workload_to_date_days": 263431.0,
> "shingled_magnetic_recording_support": "None"
> }
> }
> }
> },
> "B": {
> "properties": {
> "unused_space_bytes": 0.0,
> "used_space_bytes": 224014499043082.25,
> "status": "OK",
> "multidomain_status": "OK"
> },
> "logical_drives": {
> "2": {
> "properties": {
> "size_bytes": 17592186044416.0,
> "raid_level": "6",
> "chunk_size_bytes": 524288,
> "data_stripe_size_bytes": 6291456,
> "status": "OK",
> "unrecoverable_media_errors": "None",
> "multidomain_status": "OK",
> "caching": true,
> "parity_initialization_status": "Initialization Completed",
> "device": "/dev/sde",
> "logical_drive_label": "data-a-0"
> }
> },
> "3": {
> "properties": {
> "size_bytes": 17592186044416.0,
> "raid_level": "6",
> "chunk_size_bytes": 524288,
> "data_stripe_size_bytes": 6291456,
> "status": "OK",
> "unrecoverable_media_errors": "None",
> "multidomain_status": "OK",
> "caching": true,
> "parity_initialization_status": "Initialization Completed",
> "device": "/dev/sdb",
> "logical_drive_label": "data-a-1"
> }
> },
> "physical_drives": {
> "1I:1:1": {
> "properties": {
> "port": "1I",
> "box": "1",
> "bay": "1",
> "status": "OK",
> "drive_role": "Data",
> "interface_type": "SAS",
> "size_bytes": 16000000000000,
> "rotational_speed_rpm": 7200,
> "firmware_version": "HPD4",
> "serial_number": "000D",
> "wwn": "00003",
> "model": "HPE MB016000JWZHE",
> "temperature_celsius": 30.0,
> "shingled_magnetic_recording_support": "None"
> }
> },
> "1I:1:2": {
> "properties": {
> "port": "1I",
> "box": "1",
> "bay": "2",
> "status": "OK",
> "drive_role": "Data",
> "interface_type": "SAS",
> "size_bytes": 16000000000000,
> "rotational_speed_rpm": 7200,
> "firmware_version": "HPD4",
> "serial_number": "000E",
> "wwn": "00004",
> "model": "HPE MB016000JWZHE",
> "temperature_celsius": 33.0,
> "shingled_magnetic_recording_support": "None"
> }
> },
> "3I:1:14": {
> "properties": {
> "port": "3I",
> "box": "1",
> "bay": "14",
> "status": "OK",
> "drive_role": "Data",
> "interface_type": "SAS",
> "size_bytes": 16000000000000,
> "rotational_speed_rpm": 7200,
> "firmware_version": "HPD4",
> "serial_number": "000F",
> "wwn": "00005",
> "model": "HPE MB016000JWZHE",
> "temperature_celsius": 37.0,
> "shingled_magnetic_recording_support": "None"
> }
> }
> }
> }
> },
> "unassigned_physical_drives": {}
> }
> }
> }
>
> (The above output is a bit shortened, simply for readability)
>
> In short:
> - there can be multiple controllers
> - controllers have sensors and arrays
> - arrays have logical drives
> - physical drives are assigned to arrays, too,
> but may also be shared spares (which cannot be properly deduced from
> the RAID tool) or unassigned drives (and thus not belong to an array)
>
>
> Right now, I'd map that as follows:
> # HELP smartraid_controller_cache_module_temperature_celsius Temperature of the cache module of a SmartRAID controller in celsius.
> # TYPE smartraid_controller_cache_module_temperature_celsius gauge
> # HELP smartraid_controller_capacitor_temperature_celsius Temperature of the capacitor of a SmartRAID controller in celsius.
> # TYPE smartraid_controller_capacitor_temperature_celsius gauge
> # HELP smartraid_controller_info Information about a SmartRAID controller.
> # TYPE smartraid_controller_info gauge
> smartraid_controller_info{battery_capacitor_status="OK",cache_status="OK",controller_name="0",controller_status="OK",firmware_version="6.52",hardware_revision="A",model="HPE Smart Array P816i-a SR Gen10",rebuild_priority="High",serial_number="000A"} 1.0
> # HELP smartraid_controller_temperature_celsius Temperature of a SmartRAID controller in celsius.
> # TYPE smartraid_controller_temperature_celsius gauge
> smartraid_controller_temperature_celsius{controller_name="0"} 55.0
> # HELP smartraid_controller_sensor_info Information about a SmartRAID controller sensor.
> # TYPE smartraid_controller_sensor_info gauge
> smartraid_controller_sensor_info{controller_name="0",location="Inlet Ambient",sensor_name="0"} 1.0
> smartraid_controller_sensor_info{controller_name="0",location="ASIC",sensor_name="1"} 1.0
> smartraid_controller_sensor_info{controller_name="0",location="Top",sensor_name="2"} 1.0
> # HELP smartraid_controller_sensor_temperature_celsius Temperature of a SmartRAID controller sensor in celsius.
> # TYPE smartraid_controller_sensor_temperature_celsius gauge
> smartraid_controller_sensor_temperature_celsius{controller_name="0",sensor_name="0"} 43.0
> smartraid_controller_sensor_temperature_celsius{controller_name="0",sensor_name="1"} 55.0
> smartraid_controller_sensor_temperature_celsius{controller_name="0",sensor_name="2"} 41.0
> # HELP smartraid_array_info Information about a SmartRAID array.
> # TYPE smartraid_array_info gauge
> smartraid_array_info{array_name="A",controller_name="0",multidomain_status="OK",status="OK"} 1.0
> smartraid_array_info{array_name="B",controller_name="0",multidomain_status="OK",status="OK"} 1.0
> # HELP smartraid_array_unused_space_bytes Unused space of a SmartRAID array in bytes.
> # TYPE smartraid_array_unused_space_bytes gauge
> smartraid_array_unused_space_bytes{array_name="A",controller_name="0"} 0.0
> smartraid_array_unused_space_bytes{array_name="B",controller_name="0"} 0.0
> # HELP smartraid_array_used_space_bytes Used space of a SmartRAID array in bytes.
> # TYPE smartraid_array_used_space_bytes gauge
> smartraid_array_used_space_bytes{array_name="A",controller_name="0"} 9.601399390208e+011
> smartraid_array_used_space_bytes{array_name="B",controller_name="0"} 2.2401449904308225e+014
> # HELP smartraid_logical_drive_chunk_size_bytes Chunk size of a SmartRAID logical drive in bytes. A chunk is a set of consecutive bytes per physical drive.
> # TYPE smartraid_logical_drive_chunk_size_bytes gauge
> smartraid_logical_drive_chunk_size_bytes{array_name="A",controller_name="0",logical_drive_name="1"} 262144.0
> smartraid_logical_drive_chunk_size_bytes{array_name="B",controller_name="0",logical_drive_name="2"} 524288.0
> smartraid_logical_drive_chunk_size_bytes{array_name="B",controller_name="0",logical_drive_name="3"} 524288.0
> # HELP smartraid_logical_drive_data_stripe_size_bytes Data stripe size of a SmartRAID logical drive in bytes. A data stripe is a row of data (but not parity) chunks over the physical drives.
> # TYPE smartraid_logical_drive_data_stripe_size_bytes gauge
> smartraid_logical_drive_data_stripe_size_bytes{array_name="A",controller_name="0",logical_drive_name="1"} 262144.0
> smartraid_logical_drive_data_stripe_size_bytes{array_name="B",controller_name="0",logical_drive_name="2"} 6.291456e+06
> smartraid_logical_drive_data_stripe_size_bytes{array_name="B",controller_name="0",logical_drive_name="3"} 6.291456e+06
> # HELP smartraid_logical_drive_info Information about a SmartRAID logical drive.
> # TYPE smartraid_logical_drive_info gauge
> smartraid_logical_drive_info{array_name="A",caching="0",controller_name="0",device="/dev/sda",logical_drive_label="system",logical_drive_name="1",multidomain_status="OK",parity_initialization_status="",raid_level="1",status="OK",unrecoverable_media_errors="None"} 1.0
> smartraid_logical_drive_info{array_name="B",caching="1",controller_name="0",device="/dev/sde",logical_drive_label="data-a-0",logical_drive_name="2",multidomain_status="OK",parity_initialization_status="Initialization Completed",raid_level="6",status="OK",unrecoverable_media_errors="None"} 1.0
> smartraid_logical_drive_info{array_name="B",caching="1",controller_name="0",device="/dev/sdb",logical_drive_label="data-a-1",logical_drive_name="3",multidomain_status="OK",parity_initialization_status="Initialization Completed",raid_level="6",status="OK",unrecoverable_media_errors="None"} 1.0
> # HELP smartraid_logical_drive_size_bytes Size of a SmartRAID logical drive in bytes.
> # TYPE smartraid_logical_drive_size_bytes gauge
> smartraid_logical_drive_size_bytes{array_name="A",controller_name="0",logical_drive_name="1"} 4.800699695104e+011
> smartraid_logical_drive_size_bytes{array_name="B",controller_name="0",logical_drive_name="2"} 1.7592186044416e+013
> smartraid_logical_drive_size_bytes{array_name="B",controller_name="0",logical_drive_name="3"} 1.7592186044416e+013
> # HELP smartraid_physical_drive_info Information about a SmartRAID physical drive.
> # TYPE smartraid_physical_drive_info gauge
> smartraid_physical_drive_info{array_name="A",bay="15",box="1",controller_name="0",drive_role="Data",firmware_version="HPG1",interface_type="Solid State SATA",model="ATA VK000480GXAWK",multi_actuator_drive="",physical_drive_name="3I:1:15",port="3I",serial_number="000B",shingled_magnetic_recording_support="None",status="OK",wwn="00001"} 1.0
> smartraid_physical_drive_info{array_name="A",bay="16",box="1",controller_name="0",drive_role="Data",firmware_version="HPG1",interface_type="Solid State SATA",model="ATA VK000480GXAWK",multi_actuator_drive="",physical_drive_name="3I:1:16",port="3I",serial_number="000C",shingled_magnetic_recording_support="None",status="OK",wwn="00002"} 1.0
> smartraid_physical_drive_info{array_name="B",bay="1",box="1",controller_name="0",drive_role="Data",firmware_version="HPD4",interface_type="SAS",model="HPE MB016000JWZHE",multi_actuator_drive="",physical_drive_name="1I:1:1",port="1I",serial_number="000D",shingled_magnetic_recording_support="None",status="OK",wwn="00003"} 1.0
> smartraid_physical_drive_info{array_name="B",bay="2",box="1",controller_name="0",drive_role="Data",firmware_version="HPD4",interface_type="SAS",model="HPE MB016000JWZHE",multi_actuator_drive="",physical_drive_name="1I:1:2",port="1I",serial_number="000E",shingled_magnetic_recording_support="None",status="OK",wwn="00004"} 1.0
> smartraid_physical_drive_info{array_name="B",bay="14",box="1",controller_name="0",drive_role="Data",firmware_version="HPD4",interface_type="SAS",model="HPE MB016000JWZHE",multi_actuator_drive="",physical_drive_name="3I:1:14",port="3I",serial_number="000F",shingled_magnetic_recording_support="None",status="OK",wwn="00005"} 1.0
> # HELP smartraid_physical_drive_life_remaining_based_on_workload_to_date_days Remaining lifetime (estimated based on the workload to date) of a SmartRAID physical drive in days.
> # TYPE smartraid_physical_drive_life_remaining_based_on_workload_to_date_days gauge
> smartraid_physical_drive_life_remaining_based_on_workload_to_date_days{controller_name="0",physical_drive_name="3I:1:15"} 263431.0
> smartraid_physical_drive_life_remaining_based_on_workload_to_date_days{controller_name="0",physical_drive_name="3I:1:16"} 263431.0
> # HELP smartraid_physical_drive_power_on_hours_total Power-on time of a SmartRAID physical drive in hours.
> # TYPE smartraid_physical_drive_power_on_hours_total counter
> smartraid_physical_drive_power_on_hours_total{controller_name="0",physical_drive_name="3I:1:15"} 25391.0
> smartraid_physical_drive_power_on_hours_total{controller_name="0",physical_drive_name="3I:1:16"} 25391.0
> # HELP smartraid_physical_drive_rotational_speed_rpm Rotational speed of a SmartRAID physical drive in revolutions per minute.
> # TYPE smartraid_physical_drive_rotational_speed_rpm gauge
> smartraid_physical_drive_rotational_speed_rpm{controller_name="0",physical_drive_name="1I:1:1"} 7200.0
> smartraid_physical_drive_rotational_speed_rpm{controller_name="0",physical_drive_name="1I:1:2"} 7200.0
> smartraid_physical_drive_rotational_speed_rpm{controller_name="0",physical_drive_name="3I:1:14"} 7200.0
> # HELP smartraid_physical_drive_size_bytes Size of a SmartRAID physical drive in bytes.
> # TYPE smartraid_physical_drive_size_bytes gauge
> smartraid_physical_drive_size_bytes{controller_name="0",physical_drive_name="3I:1:15"} 4.8e+011
> smartraid_physical_drive_size_bytes{controller_name="0",physical_drive_name="3I:1:16"} 4.8e+011
> smartraid_physical_drive_size_bytes{controller_name="0",physical_drive_name="1I:1:1"} 1.6e+013
> smartraid_physical_drive_size_bytes{controller_name="0",physical_drive_name="1I:1:2"} 1.6e+013
> smartraid_physical_drive_size_bytes{controller_name="0",physical_drive_name="3I:1:14"} 1.6e+013
> # HELP smartraid_physical_drive_temperature_celsius Temperature of a SmartRAID physical drive in celsius.
> # TYPE smartraid_physical_drive_temperature_celsius gauge
> smartraid_physical_drive_temperature_celsius{controller_name="0",physical_drive_name="3I:1:15"} 35.0
> smartraid_physical_drive_temperature_celsius{controller_name="0",physical_drive_name="3I:1:16"} 32.0
> smartraid_physical_drive_temperature_celsius{controller_name="0",physical_drive_name="1I:1:1"} 30.0
> smartraid_physical_drive_temperature_celsius{controller_name="0",physical_drive_name="1I:1:2"} 33.0
> smartraid_physical_drive_temperature_celsius{controller_name="0",physical_drive_name="3I:1:14"} 37.0
> # HELP smartraid_physical_drive_usage_remaining_ratio Remaining usage (in terms of durability) of a SmartRAID physical drive as ratio.
> # TYPE smartraid_physical_drive_usage_remaining_ratio gauge
> smartraid_physical_drive_usage_remaining_ratio{controller_name="0",physical_drive_name="3I:1:15"} 0.996
> smartraid_physical_drive_usage_remaining_ratio{controller_name="0",physical_drive_name="3I:1:16"} 0.996
>
>
> What I did is the following:
> - Any property from the JSON which is a true number (like temperatures,
> byte sizes, etc.), did get their own metric.
>
> - Booleans became a label, too. Storing them as a metric value would of
> course be an alternative (see the question below).
>
> - Anything that's a string, became a label of an _info metric.
> I do know about enums[0], but the main problem with enum candidates
> like `status` is that I don't know all possible values (and that the
> closed upstream tool may at any time add/change them)
>
> - There are labels that act like primary keys:
> for controllers: `controller_name`
> for sensors: `controller_name`, `sensor_name`
> for arrays: `controller_name`, `array_name`
> for LDs: `controller_name`, `array_name`, `logical_drive_name`
> for PDs:`controller_name`, `physical_drive_name`
>
> For PDs, the _info metric does also contain `array_name` but
> there it's not like a primary key as it may e.g. be empty for
> unassigned drives.
>
> The idea is that with operators like `… and on(…) …` one should e.g.
> be able to use the _info metric to select all drives that have say
> status==OK ... and intersect that with e.g. the temperatures.
>
> And it should be possible to make proper alerts like
> count(smartraid_physical_drive_info{label!="OK"}) != 0
> to get any drives which are not OK.
>
> - If some property doesn't appear (like HDDs have no
> smartraid_physical_drive_usage_remaining_ratio), the metric doesn't
> appear for the respective labels. Similar, if a property that is
> mapped to a label doesn't appear, it's left empty.
>
>
> Now there are quite a few things that could have been done differently
> from how I did them in the first draft:
>
> 1) The "primary key" labels, don't truly uniquely identify the
> respective object over all times, but only for a given time.
>
> This is basically the point we've discussed before, with the PDs and
> the serial number.
>
> There may be two different controllers with name `0`, not at the
> same time, but when one breaks and is replaced by another.
> Array A may be deleted and replaced with another one of the same
> name, same for LDs.
> And most obvious of course for PDs.
>
> I could of course use further labels like WWN or serial numbers, but
> there wouldn't be any such means for arrays and LDs.
>
> Also, while in the case of PDs it may make sense to include the
> serial, so that one can more easily tell that e.g. some
> power-on-hours are those of another drive... it makes IMO less sense
> for e.g. the controller, when one relates that to a PD.
> If the controller changes (but still has name `0`) and the PDs all
> stay... one would not want the PDs, or LDs or arrays to be
> considered different ones from those with the old controller.
>
> So the question is:
> *If* one actually includes things like serial number: where?
> *Or* should one leave that exercise to the query (cause the info,
> which serial number a drive name had at a given time is in principle
> there, namely in the _info metric) - but that could make queries
> quite complex.
>
> I should further add, that there is no heavenly law that e.g serial
> numbers are actually different. In fact I have seen cases where
> devices effectively get the same serial number - the kernel does not
> enforce them to be different in any way.
>
>
> 2) Metrics like:
> - smartraid_physical_drive_size_bytes
> - smartraid_physical_drive_rotational_speed_rpm
> can in principle not change (again: unless the PD is replaced with
> another one of the same name).
>
> So should they rather be labels, despite being numbers?
>
> OTOH, labels like:
> - smartraid_logical_drive_size_bytes
> - smartraid_logical_drive_chunk_size_bytes
> - smartraid_logical_drive_data_stripe_size_bytes
> *can* in principle change (namely if the RAID is converted).
>
>
> 3) Is there a better way to map the various `status` properties?
> I made them now labels, as described above.
> And remember that I don't know all possible values such enums may
> take.
>
>
> 4) The way I mapped the temperatures...
> I described the alternative I see in
> https://discuss.prometheus.io/t/how-to-design-metrics-labels/2337
> I went now for the approach to have a dedicated metric for those
> where there's a dedicated property in the RAID tool output, like:
> - smartraid_controller_temperature_celsius
> - smartraid_controller_cache_module_temperature_celsius
> - smartraid_controller_capacitor_temperature_celsius
> - smartraid_physical_drive_temperature_celsius
> and a single metric with "type" label (named `location`) for those
> where the output is variable:
> - smartraid_controller_sensor_info{controller_name="0",location="Inlet Ambient",sensor_name="0"} 1.0
> - smartraid_controller_sensor_info{controller_name="0",location="ASIC",sensor_name="1"} 1.0
> - smartraid_controller_sensor_info{controller_name="0",location="Top",sensor_name="2"} 1.0
> (with the corresponding
> smartraid_controller_sensor_temperature_celsius)
>
>
> 5) Booleans
> There are properties like:
> - multi_actuator_drive (PDs)
> which will never change (for a given PD)
> like:
> - shingled_magnetic_recording_support (PDs)
> which might theoretically change (if it actually tells whether it
> does SMR, not whether it can do, and if it can be disabled, which
> I think is possible with zoned drives)
> like:
> - caching (LDs)
> which may actually change at any time (e.g. when the battery
> fails)
>
> I mapped them now to labels, rather than metrics.
> This question is similar to (2) above... should it be labels or
> metrics.
> For multi_actuator_drive I'd say clearly label.
> For `caching` it might even be better to have a metric.
>
> But not sure.
>
>
> 6) *If* one maps booleans to labels (like I did), is there any
> recommended values?
> I simply took 0/1, because they're the shortest. But of course there
> are true/false, yes/no, etc. pp..
>
>
> 7) I've also played with the thought of splitting up my current
> *_info metrics:
> - one which contains those labels that are truly static per object,
> like serial number and hardware revision
> - one like *_status_info that contains those, which may change, like
> `status`, `firmware_version`, etc.
> But I didn't do so, as most can actually change (at least in
> principle).
>
>
> Thanks,
> Chris.
>
> [0] https://prometheus.github.io/client_python/instrumenting/enum/
>
>