Re: [prometheus-developers] Introduce the concept of scrape Priority for Targets

Lili Cosic Thu, 30 Jul 2020 02:02:18 -0700

Thanks, everyone, for the replies! The official message seems to be: use a 
separate Prometheus instance per tenant/priority if you have multiple 
tenants in your environment.


Kind regards,
Lili

On Thursday, 30 July 2020 10:44:59 UTC+2, Ben Kochie wrote:
>
> I'm with Brian and Julien on this.
>
> Multi-tenancy is not really something we want to solve in Prometheus. This 
> is a concern for higher level systems like Kubernetes. Prometheus is 
> designed to be distributed. If you have targets with different needs, they 
> need to have separate Prometheus instances.
>
> This is also why we have things like Thanos and Cortex as aggregation 
> layers.
>
> Similar to why we have said we don't plan to implement IO limits, this is 
> a scheduling concern, out of scope for Prometheus.
>
> On Thu, Jul 30, 2020, 10:31 Frederic Branczyk <[email protected]> wrote:
>
>> That's only effective in limiting the number of targets; the point here 
>> is to selectively scrape those with a higher priority based on 
>> backpressure of the system as a whole.
>>
>> On Wed, 22 Jul 2020 at 17:00, Julien Pivotto <[email protected]> wrote:
>>
>>> On 22 Jul 16:47, Frederic Branczyk wrote:
>>> > In practice even that can still be problematic. You only know that
>>> > Prometheus has a problem when everything fails, the point is to keep
>>> > things alive well enough for more critical components.
>>> > 
>>> > On Wed, 22 Jul 2020 at 16:38, Julien Pivotto <[email protected]> wrote:
>>> > 
>>> > > On 22 Jul 16:36, Frederic Branczyk wrote:
>>> > > > It's unclear how that helps, can you help me understand?
>>> > >
>>> > > - job_name: highprio
>>> > >   relabel_configs:
>>> > >   - target_label: job
>>> > >     replacement: pods
>>> > >   - source_labels: [__meta_pod_priority]
>>> > >     regex: high
>>> > >     action: keep
>>>
>>> highprio job will always be scraped.
>>>
>>> > > - job_name: lowprio
>>> > >   relabel_configs:
>>> > >   - target_label: job
>>> > >     replacement: pods
>>> > >   - source_labels: [__meta_pod_priority]
>>> > >     regex: high
>>> > >     action: drop
>>> > >   target_limit: 1000
>>> > >
>>> > > >
>>> > > > On Wed, 22 Jul 2020 at 16:34, Julien Pivotto <[email protected]> wrote:
>>> > > >
>>> > > > > On 22 Jul 16:32, Frederic Branczyk wrote:
>>> > > > > > Can you explain what you mean by two jobs? Do you mean two scrape
>>> > > > > > configs?
>>> > > > >
>>> > > > > Yes.
>>> > > > >
>>> > > > > >
>>> > > > > > On Wed, 22 Jul 2020 at 11:40, Julien Pivotto <[email protected]> wrote:
>>> > > > > >
>>> > > > > > > On 22 Jul 02:35, Lili Cosic wrote:
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > On Wednesday, 22 July 2020 11:23:00 UTC+2, Brian Brazil wrote:
>>> > > > > > > > >
>>> > > > > > > > > On Wed, 22 Jul 2020 at 10:18, Julien Pivotto <[email protected]> wrote:
>>> > > > > > > > >
>>> > > > > > > > >> On 22 Jul 02:14, Lili Cosic wrote:
>>> > > > > > > > >> > I have only now seen in the docs that I am supposed to start any
>>> > > > > > > > >> > discussion here first before opening an issue, sorry about that! :)
>>> > > > > > > > >> >
>>> > > > > > > > >> > Currently there is no way for a target to have a higher scrape
>>> > > > > > > > >> > priority than another. Even if you set target limits and sample
>>> > > > > > > > >> > limits, you can still overestimate what your setup can handle, and
>>> > > > > > > > >> > in that case you want higher-priority targets to keep being scraped
>>> > > > > > > > >> > rather than have the entire Prometheus fail. The trigger would need
>>> > > > > > > > >> > to be based on the inability to ingest into the TSDB at the current
>>> > > > > > > > >> > scrape rate: once that is hit, the priority class would take effect
>>> > > > > > > > >> > and only the highest-priority targets would be scraped, at the
>>> > > > > > > > >> > expense of lower-priority ones. Another option, which might be
>>> > > > > > > > >> > simpler, would be a global limit on how much Prometheus can handle,
>>> > > > > > > > >> > based on perf testing.
>>> > > > > > > > >> >
>>> > > > > > > > >> > This would be treated as a last resort, and there would definitely
>>> > > > > > > > >> > be a need for a high-severity alert to inform the admin that
>>> > > > > > > > >> > something went terribly wrong. But because we would still be able
>>> > > > > > > > >> > to ingest, for example, Prometheus's own metrics if they are in a
>>> > > > > > > > >> > higher priority class, alerting would still be possible.
>>> > > > > > > > >>
>>> > > > > > > > >> Hi,
>>> > > > > > > > >>
>>> > > > > > > > >> I think that limiting the number of targets you scrape is already a
>>> > > > > > > > >> last resort. I don't think we would need a second line of defense.
>>> > > > > > > > >>
>>> > > > > > > > >
>>> > > > > > > > > I agree with Julien here. If you've gotten to this point you're
>>> > > > > > > > > already seriously overloaded, and prioritising individual targets
>>> > > > > > > > > is just rearranging the deckchairs at that point.
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > >>
>>> > > > > > > > >> You can achieve this priority by setting up 2 jobs, one which is
>>> > > > > > > > >> limited and one which is not, and using relabeling to decide which
>>> > > > > > > > >> target goes into which job.
>>> > > > > > > > >>
>>> > > > > > > > >
>>> > > > > > > > > Or more generally, one Prometheus for the important targets and
>>> > > > > > > > > another for the less important and riskier targets.
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > > > I get your point completely, Brian, and agree to some degree, but
>>> > > > > > > > people are still going to set up a multi-tenant Prometheus, which
>>> > > > > > > > then causes the problems I mentioned above. Even within the riskier
>>> > > > > > > > targets, some will be more important to users than others. I think
>>> > > > > > > > we should still strive to make a single shared Prometheus as safe as
>>> > > > > > > > possible; if that is not via the priority class I suggested, I am
>>> > > > > > > > open to other ideas!
>>> > > > > > >
>>> > > > > > > Then 2 jobs are the answer, one unlimited and one limited.
>>> > > > > > >
>>> > > > > > > The target_limit is already a pretty advanced use case.
>>> > > > > > >
>>> > > > > > > >
>>> > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > Brian
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > >>
>>> > > > > > > > >> >
>>> > > > > > > > >> > We could model this on something like PriorityClass
>>> > > > > > > > >> > <https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass>
>>> > > > > > > > >> > from Kubernetes, but I am open to other suggestions.
>>> > > > > > > > >>
>>> > > > > > > > >> That could be used in relabeling as I said.
>>> > > > > > > > >>
>>> > > > > > > > >> >
>>> > > > > > > > >> > I am open to other suggestions, or maybe there is something like
>>> > > > > > > > >> > this but I missed it. The main purpose is to ensure there are
>>> > > > > > > > >> > protection mechanisms in place, so any ideas and suggestions
>>> > > > > > > > >> > welcome!
>>> > > > > > > > >> >
>>> > > > > > > > >>
>>> > > > > > > > >> regards,
>>> > > > > > > > >>
>>> > > > > > > > >> > Thanks and kind regards,
>>> > > > > > > > >> > Lili
>>> > > > > > > > >> >
>>> > > > > > > > >> > --
>>> > > > > > > > >> > You received this message because you are subscribed to the Google
>>> > > > > > > > >> > Groups "Prometheus Developers" group.
>>> > > > > > > > >> > To unsubscribe from this group and stop receiving emails from it,
>>> > > > > > > > >> > send an email to [email protected].
>>> > > > > > > > >> > To view this discussion on the web visit
>>> > > > > > > > >> > https://groups.google.com/d/msgid/prometheus-developers/30df615e-5420-4bdf-9cb7-2790ef19d520o%40googlegroups.com.
>>> > > > > > > > >>
>>> > > > > > > > >>
>>> > > > > > > > >> --
>>> > > > > > > > >> Julien Pivotto
>>> > > > > > > > >> @roidelapluie
>>> > > > > > > > >>
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > --
>>> > > > > > > > > Brian Brazil
>>> > > > > > > > > www.robustperception.io
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > > --
>>> > > > > > > Julien Pivotto
>>> > > > > > > @roidelapluie
>>> > > > > > >
>>> > > > > >
>>> > > > > --
>>> > > > > Julien Pivotto
>>> > > > > @roidelapluie
>>> > > > >
>>> > > >
>>> > > --
>>> > > Julien Pivotto
>>> > > @roidelapluie
>>> > >
>>> > 
>>>
>>> -- 
>>> Julien Pivotto
>>> @roidelapluie
>>>
>>
>
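
The two-job approach Julien describes in the quoted thread can be written out as a fuller scrape configuration. This is only a sketch under stated assumptions: the thread's `__meta_pod_priority` label is a placeholder, so the version here assumes Kubernetes pod discovery and a pod label named `priority` (neither comes from the thread). It also uses `job_name`, which is the key Prometheus expects for a scrape config. As far as I know, when a job exceeds `target_limit`, Prometheus marks all of that job's targets as failed rather than scraping a subset of them, which is what leaves the unlimited job unaffected.

```yaml
# Sketch only: assumes Kubernetes pod discovery and a pod label "priority"
# (both are assumptions, not part of the original thread).
scrape_configs:
  # High-priority pods: no target_limit, so this job is always scraped.
  - job_name: highprio
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods labelled priority=high.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: keep
      # Report both jobs under the same job label, as in the thread.
      - target_label: job
        replacement: pods

  # Everything else: capped by target_limit.
  - job_name: lowprio
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Drop the pods already covered by the highprio job.
      - source_labels: [__meta_kubernetes_pod_label_priority]
        regex: high
        action: drop
      - target_label: job
        replacement: pods
    # If more than 1000 targets remain after relabeling, this job's targets
    # are marked as failed instead of being scraped.
    target_limit: 1000
```

This does not give the backpressure-based prioritisation Frederic asks for; it only caps the lower-priority job statically.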

