Thanks, everyone, for the replies! The official message seems to be: use a separate Prometheus instance per tenant/priority if you want to have multiple tenants in your environment.
Kind regards,
Lili

On Thursday, 30 July 2020 10:44:59 UTC+2, Ben Kochie wrote:
>
> I'm with Brian and Julien on this.
>
> Multi-tenancy is not really something we want to solve in Prometheus. This
> is a concern for higher-level systems like Kubernetes. Prometheus is
> designed to be distributed. If you have targets with different needs, they
> need to have separate Prometheus instances.
>
> This is also why we have things like Thanos and Cortex as aggregation
> layers.
>
> Similar to why we have said we don't plan to implement IO limits, this is
> a scheduling concern, out of scope for Prometheus.
>
> On Thu, Jul 30, 2020, 10:31 Frederic Branczyk <[email protected]> wrote:
>
>> That's only effective in limiting the number of targets. The point here
>> is to selectively scrape those with a higher priority, based on
>> backpressure of the system as a whole.
>>
>> On Wed, 22 Jul 2020 at 17:00, Julien Pivotto <[email protected]> wrote:
>>
>>> On 22 Jul 16:47, Frederic Branczyk wrote:
>>> > In practice even that can still be problematic. You only know that
>>> > Prometheus has a problem when everything fails; the point is to keep
>>> > things alive well enough for more critical components.
>>> >
>>> > On Wed, 22 Jul 2020 at 16:38, Julien Pivotto <[email protected]> wrote:
>>> >
>>> > > On 22 Jul 16:36, Frederic Branczyk wrote:
>>> > > > It's unclear how that helps, can you help me understand?
>>> > >
>>> > > - job: highprio
>>> > >   relabel_configs:
>>> > >   - target_label: job
>>> > >     replacement: pods
>>> > >   - source_labels: [__meta_pod_priority]
>>> > >     regex: high
>>> > >     action: keep
>>>
>>> highprio job will always be scraped.
>>>
>>> > > - job: lowprio
>>> > >   relabel_configs:
>>> > >   - target_label: job
>>> > >     replacement: pods
>>> > >   - source_labels: [__meta_pod_priority]
>>> > >     regex: high
>>> > >     action: drop
>>> > >   target_limit: 1000
>>> > >
>>> > > > On Wed, 22 Jul 2020 at 16:34, Julien Pivotto <[email protected]> wrote:
>>> > > >
>>> > > > > On 22 Jul 16:32, Frederic Branczyk wrote:
>>> > > > > > Can you explain what you mean by two jobs? Do you mean two
>>> > > > > > scrape configs?
>>> > > > >
>>> > > > > Yes.
>>> > > > >
>>> > > > > > On Wed, 22 Jul 2020 at 11:40, Julien Pivotto <[email protected]> wrote:
>>> > > > > >
>>> > > > > > > On 22 Jul 02:35, Lili Cosic wrote:
>>> > > > > > > > On Wednesday, 22 July 2020 11:23:00 UTC+2, Brian Brazil wrote:
>>> > > > > > > > >
>>> > > > > > > > > On Wed, 22 Jul 2020 at 10:18, Julien Pivotto <[email protected]> wrote:
>>> > > > > > > > >
>>> > > > > > > > >> On 22 Jul 02:14, Lili Cosic wrote:
>>> > > > > > > > >> > I have only now seen in the docs that I am supposed to start any
>>> > > > > > > > >> > discussion here first before opening an issue, sorry about that! :)
>>> > > > > > > > >> >
>>> > > > > > > > >> > Currently there is no way for a target to have a higher scrape
>>> > > > > > > > >> > priority than another. Even if you set target limits and sample
>>> > > > > > > > >> > limits, you can still overestimate your setup, and you would still
>>> > > > > > > > >> > want higher-priority targets to keep being scraped rather than have
>>> > > > > > > > >> > the entire Prometheus fail.
>>> > > > > > > > >> > It would need to be based on the inability to ingest into the TSDB
>>> > > > > > > > >> > at the current scrape rate. If that is hit, the priority class would
>>> > > > > > > > >> > take effect and only the highest-priority targets would be scraped,
>>> > > > > > > > >> > in preference to lower-priority ones. Another option, which might be
>>> > > > > > > > >> > simpler, would be to have a global limit on how much Prometheus can
>>> > > > > > > > >> > handle, based on perf testing.
>>> > > > > > > > >> >
>>> > > > > > > > >> > This would be treated as a last resort, and there would definitely
>>> > > > > > > > >> > be a need for a high-severity alert to inform the admin that
>>> > > > > > > > >> > something went terribly wrong. But because we would still be able to
>>> > > > > > > > >> > ingest Prometheus metrics, for example, if they are in a higher
>>> > > > > > > > >> > priority class, alerting would be possible.
>>> > > > > > > > >>
>>> > > > > > > > >> Hi,
>>> > > > > > > > >>
>>> > > > > > > > >> I think that limiting the number of targets you scrape is already a
>>> > > > > > > > >> last resort. I don't think we would need a second line of defense.
>>> > > > > > > > >
>>> > > > > > > > > I agree with Julien here. If you've gotten to this point you're
>>> > > > > > > > > already seriously overloaded, and prioritising individual targets is
>>> > > > > > > > > just rearranging the deckchairs at that point.
>>> > > > > > > > >
>>> > > > > > > > >> You can achieve this priority by setting up 2 jobs, one which is
>>> > > > > > > > >> limited and one which is not, and using relabeling to decide which
>>> > > > > > > > >> target goes in which job.
>>> > > > > > > > >
>>> > > > > > > > > Or more generally, one Prometheus for the important targets and
>>> > > > > > > > > another for the less important and riskier targets.
>>> > > > > > > >
>>> > > > > > > > I get your point completely, Brian, and agree to some degree, but
>>> > > > > > > > people are still going to be setting up a multi-tenant Prometheus,
>>> > > > > > > > which then causes the problems I mentioned above. Even within the
>>> > > > > > > > riskier targets there will be some that are more important than others
>>> > > > > > > > for users. I think we should still strive to make a single shared
>>> > > > > > > > Prometheus as safe as possible. If this is not the priority class I
>>> > > > > > > > suggested, I am open to other ideas!
>>> > > > > > >
>>> > > > > > > Then 2 jobs are the answer, one unlimited and one limited.
>>> > > > > > >
>>> > > > > > > The target_limit is already a pretty advanced use case.
>>> > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > Brian
>>> > > > > > > > >
>>> > > > > > > > >> >
>>> > > > > > > > >> > We could model this on something like PriorityClass
>>> > > > > > > > >> > <https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass>
>>> > > > > > > > >> > from Kubernetes, but I am open to other suggestions.
>>> > > > > > > > >>
>>> > > > > > > > >> That could be used in relabeling as I said.
>>> > > > > > > > >>
>>> > > > > > > > >> >
>>> > > > > > > > >> > I am open to other suggestions, or maybe there is something like
>>> > > > > > > > >> > this but I missed it. The main purpose is to ensure there are
>>> > > > > > > > >> > protection mechanisms in place, so any ideas and suggestions welcome!
>>> > > > > > > > >> >
>>> > > > > > > > >>
>>> > > > > > > > >> regards,
>>> > > > > > > > >>
>>> > > > > > > > >> > Thanks and kind regards,
>>> > > > > > > > >> > Lili
>>> > > > > > > > >> >
>>> > > > > > > > >> > --
>>> > > > > > > > >> > You received this message because you are subscribed to the Google
>>> > > > > > > > >> > Groups "Prometheus Developers" group.
>>> > > > > > > > >> > To unsubscribe from this group and stop receiving emails from it,
>>> > > > > > > > >> > send an email to [email protected].
>>> > > > > > > > >> > To view this discussion on the web visit
>>> > > > > > > > >> > https://groups.google.com/d/msgid/prometheus-developers/30df615e-5420-4bdf-9cb7-2790ef19d520o%40googlegroups.com.
>>> > > > > > > > >>
>>> > > > > > > > >> --
>>> > > > > > > > >> Julien Pivotto
>>> > > > > > > > >> @roidelapluie
--
You received this message because you are subscribed to the Google Groups "Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-developers/4e4786ba-2ecd-497d-b900-18c8a30e9c75o%40googlegroups.com.
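For anyone reading this thread later: the two-job approach Julien sketched above might look roughly like the following as a full scrape configuration. This is only a sketch under assumptions, not a tested setup. The quoted snippet uses `job:` and a `__meta_pod_priority` label as shorthand; here I assume Kubernetes pod discovery and a hypothetical pod label named `priority`, which service discovery would expose as `__meta_kubernetes_pod_label_priority`. The limit of 1000 is taken from the quoted example and is not a recommendation.

```yaml
scrape_configs:
  # High-priority pods: no target_limit, so this job is never cut off by it.
  - job_name: highprio
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Assumption: pods carry a label "priority"; keep only priority=high.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: keep
      # Give both jobs the same "job" label so queries see one set of pods.
      - target_label: job
        replacement: pods

  # Everything else: capped by target_limit.
  - job_name: lowprio
    target_limit: 1000
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Drop the high-priority pods, which the job above already covers.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: drop
      - target_label: job
        replacement: pods
```

As far as I understand `target_limit`, when the lowprio job ends up with more targets than the limit after relabeling, its targets are marked as failed rather than scraped, while the highprio job is unaffected. As Frederic points out in the thread, this limits the number of targets; it does not react to ingestion backpressure.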

