The variable change is for the smokeping_prober, not Prometheus. I don't know how you run your services.
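[Editor's note: a likely explanation for the empty "Runtime & Build Information" field below is that `export GOGC=200` in a login shell never reaches a daemon started by systemd or another supervisor; GOGC must be in the target process's own environment at startup. A minimal sketch, assuming smokeping_prober runs as a systemd service (the unit name is a guess):]

```shell
# GOGC is read by the Go runtime from the process's own environment at
# startup. For a systemd-managed service, set it in the unit, e.g. via
# "systemctl edit smokeping_prober" (unit name assumed):
#
#   [Service]
#   Environment=GOGC=200
#
# then: systemctl daemon-reload && systemctl restart smokeping_prober
#
# On Linux you can confirm what a running process was actually started
# with by reading /proc/<pid>/environ. Demonstrated here with a child
# shell standing in for the prober:
GOGC=200 sh -c 'tr "\0" "\n" < /proc/$$/environ | grep "^GOGC="'
# prints: GOGC=200
```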
On Tue, Feb 27, 2024, 20:12 Alexander Wilke <[email protected]> wrote:

> Hello Ben,
>
> I googled a little bit and found this:
> https://github.com/prometheus/prometheus/issues/2665#issuecomment-342149607
>
> As far as I understand, this variable is no longer working, or no longer
> used!?
> I tried it in a test environment:
>
> export GOGC=200
>
> and then restarted (not reloaded) Prometheus, but in the UI under
> "Runtime & Build Information" the GOGC field is still empty.
>
> 1.) Is this environment variable set correctly?
> 2.) Is the variable still working?
> 3.) If it is still working, can I apply it only to smokeping_prober and
> not to other services like Prometheus? It sounds like a higher GOGC has
> trade-offs for queries in the Prometheus TSDB?
>
> Ben Kochie wrote on Tuesday, February 27, 2024 at 16:59:31 UTC+1:
>
>> Interesting, thanks for the data. It does seem like the process is
>> spending a lot of time doing GC, like I thought.
>>
>> One trick you could try is to increase the memory allocated to the
>> prober, which would reduce the time spent on GC.
>>
>> The default setting is GOGC=100.
>>
>> You could try increasing this by setting the GOGC environment variable.
>>
>> Try something like GOGC=200 or GOGC=300.
>>
>> This will make the process use more memory, but it should reduce the
>> CPU time spent.
>>
>> On Sun, Feb 25, 2024 at 11:18 PM Alexander Wilke <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> I attached a few screenshots showing the results and graphs for 1h
>>> and 6h.
>>> In addition, I added a screenshot of the node_exporter metrics to
>>> give you an overview of the system itself.
>>> On the same system there are prometheus, grafana, snmp_exporter
>>> (200-800% CPU), smokeping_prober, node_exporter, and blackbox_exporter.
>>> The main CPU consumers are snmp_exporter and smokeping_prober.
>>>
>>> Ben Kochie wrote on Sunday, February 25, 2024 at 19:22:35 UTC+1:
>>>
>>>> Looking at the CPU profile, I'm seeing almost all the time spent in
>>>> the Go runtime.
>>>> Mostly it's the ICMP packet receiving code and garbage collection.
>>>> I'm not sure there's a lot we can optimize here, as it's core Go
>>>> code for ICMP packet handling.
>>>>
>>>> Can you also post me a graph of a few metrics queries?
>>>>
>>>> rate(process_cpu_seconds_total{job="smokeping_prober"}[30s])
>>>> rate(go_gc_duration_seconds_count{job="smokeping_prober"}[5m])
>>>> rate(go_gc_duration_seconds_sum{job="smokeping_prober"}[5m])
>>>>
>>>> On Sun, Feb 25, 2024 at 7:08 PM Alexander Wilke <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>> any chance to investigate the reports? Any suggestions?
>>>>>
>>>>> Alexander Wilke wrote on Thursday, February 22, 2024 at 12:40:09
>>>>> UTC+1:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Sorry for the delay, here are the results. To be honest, I do not
>>>>>> understand any of it.
>>>>>>
>>>>>> smokeping_prober heap:
>>>>>>
>>>>>> https://pprof.me/a1e7400d32859dbc217e2182398485df/?profileType=profile%3Aalloc_objects%3Acount%3Aspace%3Abytes&dashboard_items=icicle
>>>>>>
>>>>>> smokeping_prober 30s CPU profile:
>>>>>>
>>>>>> https://pprof.me/340674b335e114e4b0df6b4582f0644e/?profileType=profile%3Asamples%3Acount%3Acpu%3Ananoseconds%3Adelta
>>>>>>
>>>>>> Ben Kochie wrote on Tuesday, February 20, 2024 at 10:27:10 UTC+1:
>>>>>>
>>>>>>> The best thing you can do is capture some pprof data. That will
>>>>>>> show you what it's spending the time on.
>>>>>>>
>>>>>>> :9374/debug/pprof/heap
>>>>>>> :9374/debug/pprof/profile?seconds=30
>>>>>>>
>>>>>>> You can post the results to https://pprof.me/ for sharing.
>>>>>>>
>>>>>>> On Tue, Feb 20, 2024 at 6:22 AM Alexander Wilke <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>> I am running smokeping_prober from one VM to monitor around 500
>>>>>>>> destinations.
>>>>>>>> Around 30 devices are monitored with a 0.2s interval and the
>>>>>>>> others with a 1.65s interval.
>>>>>>>>
>>>>>>>> Prometheus scrapes every 5s.
>>>>>>>>
>>>>>>>> So there are roughly 600 ICMP IPv4 24-byte pings per second.
>>>>>>>> CPU usage jumps between 700-1200% in "top".
>>>>>>>>
>>>>>>>> What else, apart from reducing the interval or the host count,
>>>>>>>> could help to reduce CPU usage?
>>>>>>>> Is the UDP socket "better", or is there any other optimization
>>>>>>>> that could be relevant for this type of traffic? Running on RHEL8.
>>>>>>>>
>>>>>>>> Is anyone else seeing similar CPU usage at this rate of pings per
>>>>>>>> second? Maybe others ping 6,000 destinations every 10s?
>>>>>>>>
>>>>>>>> --
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "Prometheus Users" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> To view this discussion on the web visit
>>>>>>>> https://groups.google.com/d/msgid/prometheus-users/d803c1a2-64ee-48d1-8513-b864856f53c8n%40googlegroups.com
>>>>>>>> .
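[Editor's note: the pprof capture step suggested in the thread can be scripted. A sketch, assuming smokeping_prober's default listen port of 9374; adjust the base URL to your deployment:]

```shell
# Fetch the two pprof profiles requested in the thread.
# The base URL is an assumption (smokeping_prober defaults to :9374).
fetch_pprof() {
  base="${1:-http://localhost:9374}"
  # Heap profile (instantaneous snapshot):
  curl -fsS -o heap.pprof "$base/debug/pprof/heap" &&
  # CPU profile, sampled over 30 seconds:
  curl -fsS -o cpu.pprof "$base/debug/pprof/profile?seconds=30"
}

# Usage against a live prober, then upload the files to https://pprof.me/
# or inspect locally with the Go toolchain if it is installed:
#   fetch_pprof
#   go tool pprof -top cpu.pprof
```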

