To start, NFS is not supported. Prometheus supports only local disk storage.

Q1: Prometheus today can scale to about 100M series, but operates a bit better below 50M series.

Q2: Infinite. The 2.0 TSDB has no practical storage limit.

Q3: I've heard about instances with upwards of 50k targets.

Q4: Query performance can start to degrade when you query over about 10M series or a large time range. A good proxy for this is how many samples are loaded to evaluate a single query. The --query.max-samples flag, which defaults to 50M, cancels queries that are too large. This is enough to query 1400 metrics over a year time range with 15s scrape intervals. Of course, this can easily be increased depending on your infra.

We run benchmarks with every Prometheus release using https://github.com/prometheus/test-infra

On Tue, Jun 11, 2024 at 7:24 AM mohan garden <[email protected]> wrote:
> Hi All,
> I am reaching out to gather some quantitative insights and experiences
> regarding the scalability of a single Prometheus instance. I understand that
> performance and scalability can vary significantly based on different
> aspects of the infrastructure, such as whether the backend storage is local
> disk or NFS, network bandwidth, number of targets and metrics per target,
> scrape interval, etc.
> For the following questions you can assume ideal conditions, i.e. optimal
> hardware, maximum network bandwidth, local disk (say SSD). Or, if you can
> share information on the hardware and the performance metrics, that would
> help.
>
> Here are the questions:
> Q1. What is the practical limit on the number of active series which
> Prometheus can handle, or what is the maximum number of active series to
> which Prometheus can scale?
>
> https://www.robustperception.io/scaling-and-federating-prometheus
> This article was written in 2015 and mentions that a single instance
> can scale to at minimum 1M series (1000 servers x 1000 metrics).
>
> https://prometheus.io/docs/introduction/faq/#i-was-told-prometheus-doesnt-scale
> Here I assume that, under ideal conditions, it can scale to between 20M
> and 90M.
>
> Q2: What are the practical limits for storage and data retention in a
> single instance?
>
> Q3: What is the highest number of targets and total metrics per target
> that can be efficiently scraped by a single instance?
>
> Q4: How does query performance (latency) scale with an increasing number of
> metrics and targets?
>
> Any shared experiences, benchmarks, or references to relevant
> documentation would help.
>
> Thank you
> Regards,
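
As a rough aid for the --query.max-samples discussion in the Q4 answer, here is a minimal Python sketch of a sample-budget estimate. It is not how Prometheus itself accounts for samples: it naively counts every raw scraped sample in the time range, whereas the engine's real accounting depends on the query shape and step, so the result will not necessarily match figures quoted in this thread. The function names are hypothetical and exist only for illustration.

```python
# Naive estimate: assumes one raw sample per series per scrape interval
# and that every raw sample in the range counts against the budget.
# This is an illustrative assumption, not Prometheus's actual accounting.

def samples_per_series(range_seconds: int, scrape_interval_seconds: int) -> int:
    """Raw samples one series accumulates over the given time range."""
    if scrape_interval_seconds <= 0:
        raise ValueError("scrape interval must be positive")
    return range_seconds // scrape_interval_seconds


def max_series_within_budget(
    sample_budget: int, range_seconds: int, scrape_interval_seconds: int
) -> int:
    """How many series fit in the budget if all raw samples are loaded."""
    per_series = samples_per_series(range_seconds, scrape_interval_seconds)
    if per_series == 0:
        raise ValueError("time range is shorter than one scrape interval")
    return sample_budget // per_series


if __name__ == "__main__":
    budget = 50_000_000          # default --query.max-samples from the reply
    one_day = 24 * 3600
    one_year = 365 * one_day
    for label, seconds in (("1 day", one_day), ("1 year", one_year)):
        series = max_series_within_budget(budget, seconds, 15)
        print(f"{label} at 15s scrape interval: about {series} series")
```

Plugging in your own scrape interval and time range gives a quick lower bound on how many series a single query can touch before the limit needs raising.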

