https://github.com/prometheus/snmp_exporter/tree/main/generator

Alexander Wilke wrote on Saturday, 16 March 2024 at 09:08:44 UTC+1:
> Check the File Format example.
>
> Timeout, retries, max_repetitions.
>
> I use max_repetitions of 50 or 100 with Cisco, retries 0, and a timeout
> 1s or 500ms below the Prometheus timeout.
>
> Ben Kochie wrote on Saturday, 16 March 2024 at 06:31:17 UTC+1:
>
>> This is very likely a problem with counter resets or some other kind of
>> duplicate data.
>>
>> The best way to figure this out is to perform the query, but without the
>> `rate()` function.
>>
>> This can be done via the Prometheus UI (harder to do in Grafana) in the
>> "Table" view.
>>
>> Here is an example demo query
>> <https://prometheus.demo.do.prometheus.io/graph?g0.expr=process_cpu_seconds_total%7Bjob%3D%22prometheus%22%7D%5B2m%5D&g0.tab=1&g0.display_mode=lines&g0.show_exemplars=0&g0.range_input=1h>
>>
>> The result is a list of the raw samples that are needed to debug.
>>
>> On Fri, Mar 15, 2024 at 11:41 PM Nick Carlton <[email protected]>
>> wrote:
>>
>>> Hello Everyone,
>>>
>>> I have just seen something weird in my environment where I saw interface
>>> bandwidth on a gigabit switch reach about 1 Tbps on some of the
>>> interfaces.
>>>
>>> Here is the query I'm using:
>>>
>>> rate(ifHCInOctets{ifHCInOctetsIntfName=~".*.\\/.*.",instance="<device-name>"}[2m]) * 8
>>>
>>> Which I've never had a problem with. Here is an image of the graph
>>> showing the massive increase in bandwidth and then the decrease back to
>>> normal:
>>>
>>> [image: Screenshot 2024-03-15 222353.png]
>>>
>>> When I did some more investigation into what could have happened, I
>>> could see that the 'snmp_scrape_duration_seconds' metric increased to
>>> around 20s at the time. So the Cisco switch is taking 20 seconds to
>>> respond to the SNMP request.
>>>
>>> [image: Screenshot 2024-03-15 222244.png]
>>>
>>> I'm a bit confused as to how this could cause the rate query to give
>>> completely false data. Could the delay in data have caused Prometheus to
>>> think there was more bandwidth on the interface? The switch certainly
>>> cannot do the speeds the graph is claiming!
>>>
>>> I'm on v0.25.0 of the SNMP exporter and it's normally sat around 2s for
>>> the scrapes. I'm not blaming the exporter for the high response times,
>>> that's probably the switch. Just wondering if in some way the high
>>> response time could cause the rate query to give incorrect data. The
>>> fact that the graph went back to normal after the high response times
>>> makes me think it wasn't the switch giving duff data.
>>>
>>> Has anyone seen this before, and is there any way to mitigate it? Happy
>>> to provide more info if required :)
>>>
>>> Thanks
>>> Nick
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Prometheus Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/prometheus-users/6fd3dca6-2013-47ad-af8f-3344e79954a7n%40googlegroups.com
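
To make the "counter resets" explanation above concrete, here is a rough Python sketch of how a single backwards step in a counter can turn into a huge apparent rate. It is a simplified model of counter-reset handling (any drop is treated as a restart from zero, so the whole new value is counted as increase) and it leaves out the window extrapolation that the real `rate()` does. The function name `counter_increase` and all sample values are made up for illustration; they are not taken from Nick's switch, and only the raw samples from the "Table" view would show whether this is what actually happened.

```python
def counter_increase(samples):
    """Total increase over a list of raw counter samples.

    Simplified model: a sample lower than its predecessor is treated as a
    counter reset to zero, so the full new value is added as increase.
    No extrapolation to the window edges is done here.
    """
    total = 0
    prev = samples[0]
    for value in samples[1:]:
        if value < prev:
            # Counter went backwards: assume it restarted from 0.
            total += value
        else:
            total += value - prev
        prev = value
    return total


# Hypothetical ifHCInOctets samples, one every 30 seconds (90 s in total).
WINDOW_SECONDS = 90

healthy = [
    9_000_000_000_000,
    9_000_003_000_000,
    9_000_006_000_000,
    9_000_009_000_000,
]

# Same series, but the third sample is slightly lower than the second
# (for example a stale or duplicated value arriving late).
glitched = [
    9_000_000_000_000,
    9_000_003_000_000,
    9_000_001_000_000,
    9_000_009_000_000,
]

for name, series in (("healthy", healthy), ("glitched", glitched)):
    bits_per_second = counter_increase(series) * 8 / WINDOW_SECONDS
    print(f"{name}: {bits_per_second:,.0f} bit/s")
```

With these invented numbers the healthy series works out to well under 1 Mbit/s, while the glitched one lands in the region of 0.8 Tbit/s, because the apparent reset adds almost the entire counter value to the increase. That is the kind of spike a gigabit port cannot physically produce, which is why looking at the raw samples without `rate()` is the useful next step.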

