Re: [prometheus-users] OOM error for Prometheus

Adso Castro Thu, 16 Jul 2020 11:56:15 -0700

@Martin
 Just a ping about this issue: how did you identify which services were 
causing you trouble with too many metrics? I'm asking because I'm facing a 
similar problem at the moment.


Thank you.
On Monday, 13 April 2020 at 06:53:13 UTC-3, Martin Man wrote:

> Hi Nishant,
>
> I’m also new to Prometheus and faced a similar scenario recently.
>
> What helped me was to add a job to monitor the Prometheus instance itself, 
> then import a Prometheus 2.0 Grafana dashboard and watch Prometheus memory 
> consumption and samples appended per second while defining new 
> ServiceMonitors. In the end this helped me stabilise the memory usage as 
> well as identify the services that generated far too many metrics and were 
> responsible for the huge memory consumption.
>
> HTH,
> Martin
>
>
> > On 13 Apr 2020, at 11:03, Nishant Ketu <[email protected]> wrote:
> > 
> > We have deployed Prometheus through Helm, and after around 2 months of 
> > use we get an OOM error and the pods fail to restart. We have manually 
> > cleaned up /data to get the pod running again. I have used the retention 
> > flag, but it doesn't seem to apply to the wal folder under /data. Any 
> > help with this would be appreciated. Thanks
> > 
>
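
For anyone landing on this thread with the same question: below is a minimal
sketch of one way to rank scrape jobs by how many samples they expose, which
is roughly what the dashboard approach described above surfaces. It is not
Martin's actual setup. It assumes Prometheus is reachable over its HTTP API
and relies on the built-in scrape_samples_scraped metric; the helper names
(rank_jobs, query_prometheus) and the address used in the note below are
hypothetical.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

# PromQL: samples each job exposed in its most recent scrape, summed per job.
# scrape_samples_scraped is recorded by Prometheus for every target it scrapes.
SAMPLES_PER_JOB = "sum by (job) (scrape_samples_scraped)"


def rank_jobs(response: dict) -> list[tuple[str, float]]:
    """Turn an instant-query response into (job, value) pairs, largest first."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    rows = [
        # Each sample looks like {"metric": {...labels}, "value": [ts, "number"]}.
        (sample["metric"].get("job", "<no job label>"), float(sample["value"][1]))
        for sample in response["data"]["result"]
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def query_prometheus(base_url: str, promql: str, timeout: float = 10.0) -> dict:
    """Run an instant query against /api/v1/query and return the decoded JSON."""
    params = urllib.parse.urlencode({"query": promql})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With a port-forward to the Prometheus pod (address assumed), something like
rank_jobs(query_prometheus("http://localhost:9090", SAMPLES_PER_JOB)) should
list the heaviest jobs first; those are the ServiceMonitors to look at before
raising memory limits.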

