What properties would an ideal OpenMetrics push receiver have? In particular, I am wondering:
- What tradeoff would it make when metric ingestion is slower than metric production: backpressure, or dropping data?
- What are the semantics of pushing a counter?
- Where would the data move from there, and how?
- How many of these receivers would you typically run? How much coordination is necessary between them?

From observing the use of the statsd exporter, I see a few cases where it covers ground that is not very compatible with the in-process aggregation implied by the pull model. It has the downside of mapping through a different metrics model, and its tradeoffs are informed by the ones statsd made 10+ years ago. I wonder what it would look like, remade in 2022 starting from OpenMetrics.

/MR

On Sat, 27 Nov 2021, 12:50 Rob Skillington, <[email protected]> wrote:

> Here's the documentation for using M3 Coordinator (with or without M3
> Aggregator) with a backend that has a Prometheus Remote Write receiver:
> https://m3db.io/docs/how_to/any_remote_storage/
>
> Would be more than happy to do a call some time on this topic. The more
> we've looked at this, the clearer it is that this is primarily a client
> library issue, well before you consider the backend/receiver aspect
> (there are options out there, and those are fairly mechanical to
> overcome). The client library concerns have a lot of ergonomic and
> practical issues, especially in a serverless environment where you may
> need to wait for publishing before finishing your request. Perhaps an
> async process is an option: publish a message to a local serverless
> message queue such as SQS, and have a reader consume it and use another
> client library to push that data out. That would be more type safe and
> probably less lossy than writing logs and then reading and publishing
> them, but it would need good client library support for both the
> serverless producers and the readers/pushers.
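Rob's async-publish idea above (enqueue from the serverless function, push from a separate reader) could be sketched roughly as follows. This is a minimal sketch under stated assumptions: a `queue.Queue` stands in for a real queue like SQS, and every function name and the JSON sample format are invented for illustration, not a real SQS or metrics-client API.

```python
# Sketch: the serverless function enqueues metric samples instead of
# blocking the request on a push; a separate reader drains the queue in
# batches and hands them to a push client (stubbed here).
import json
import queue
import time

metric_queue = queue.Queue()  # stand-in for a local serverless queue (e.g. SQS)

def record_counter(name, labels, delta):
    """Producer side: enqueue a sample instead of pushing synchronously."""
    metric_queue.put(json.dumps({
        "name": name,
        "labels": labels,
        "delta": delta,
        "ts_ms": int(time.time() * 1000),  # timestamp aids later deduplication
    }))

def drain_and_push(push_fn, max_batch=100):
    """Reader side: batch queued samples and hand them to a push client."""
    batch = []
    while not metric_queue.empty() and len(batch) < max_batch:
        batch.append(json.loads(metric_queue.get()))
    if batch:
        push_fn(batch)
    return len(batch)

# Usage: in a real deployment push_fn would wrap a remote-write client;
# here a plain list collects the batch so the flow is observable.
record_counter("requests_total", {"handler": "index"}, 1)
record_counter("requests_total", {"handler": "index"}, 1)
pushed = []
drain_and_push(pushed.extend)
```

Because the reader owns the actual push, the request path only pays the cost of a local enqueue, which is the property Rob is after for serverless producers.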
>
> Rob
>
> On Sat, Nov 27, 2021 at 1:41 AM Rob Skillington <[email protected]> wrote:
>
>> FWIW we have been experimenting with users pushing OpenMetrics protobuf
>> payloads quite successfully, but only sophisticated exporters that can
>> guarantee no collisions of time series, generate their own monotonic
>> counters, etc. are using this at this time.
>>
>> If you're looking for a solution that also involves aggregation support,
>> M3 Coordinator (either standalone or combined with M3 Aggregator)
>> supports Remote Write as a backend (and is thus compatible with Thanos,
>> Cortex and of course Prometheus itself, due to the PRW receiver).
>>
>> M3 Coordinator, however, does not have any nice support for publishing
>> to it from a serverless environment (since the primary protocol it
>> supports is Prometheus Remote Write, which has no metrics clients, etc.,
>> I would assume).
>>
>> Rob
>>
>>
>> On Mon, Nov 15, 2021 at 9:54 PM Bartłomiej Płotka <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I would love to resurrect this thread. I think we are missing a good
>>> push-gateway-like product that would ideally live in Prometheus
>>> (repo/binary, or could be recommended by us) and convert events to
>>> metrics in a cheap way, because that is what this amounts to when we
>>> talk about short-lived containers and serverless functions. What's the
>>> latest, Rob? I would be interested in some call for this if that is
>>> still on the table. (:
>>>
>>> I think we have some new options on the table, like supporting Otel
>>> metrics as such a potential high-cardinality event push, given there
>>> are more and more clients for that API. Potentially the Otel collector
>>> can work as such a "push gateway" proxy, but at this point it's
>>> extremely generic, so we might want to consider something more
>>> focused/efficient/easier to maintain. Let's see (: The other problem is
>>> that Otel metrics is yet another protocol.
>>> Users might want to use the push gateway API, remote write, or
>>> logs/traces as per @Tobias Schmidt <[email protected]>'s idea:
>>>
>>>> Another service "loggateway" (or otherwise named) would then stream
>>>> the logs, aggregate them and either expose them on the common /metrics
>>>> endpoint or push them with remote write right away to a Prometheus
>>>> instance hosted somewhere (like Grafana Cloud).
>>>
>>>
>>> Kind Regards,
>>> Bartek Płotka (@bwplotka)
>>>
>>>
>>> On Fri, Jun 25, 2021 at 6:11 AM Rob Skillington <[email protected]> wrote:
>>>
>>>> With respect to OpenMetrics push, we had something very similar at
>>>> $prevco that pushed something that looked very much like the protobuf
>>>> payload of OpenMetrics (but was a Thrift snapshot of an aggregated set
>>>> of metrics from in-process) and was used by short-running tasks (for
>>>> Jenkins, Flink jobs, etc.).
>>>>
>>>> I definitely agree it's not ideal, and ideally the platform provider
>>>> can supply a collection point. (There is something for Jenkins, a
>>>> plug-in that can do this, but custom metrics are very hard / nigh
>>>> impossible to make work with it, and that is a non-cloud-provider
>>>> environment that's actually possible to make work; just no one has
>>>> made it seamless.)
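The "loggateway" approach quoted above (containers log metric increments to stdout, a small gateway aggregates the lines and exposes cumulative series) might look roughly like this minimal sketch. The log-line format and all names are invented here for illustration; they are not an existing convention.

```python
# Sketch: aggregate counter increments emitted as log lines into
# cumulative series, rendered in Prometheus text-exposition style
# (which the gateway could expose on /metrics or forward via remote write).
import re
from collections import defaultdict

# Invented line format, e.g.: metric requests_total{job="f1"} 1
LINE_RE = re.compile(r'^metric (\w+)\{(.*)\} ([0-9.]+)$')

counters = defaultdict(float)  # (name, labels) -> cumulative value

def ingest(line):
    """Parse one log line and add its increment to the running total."""
    m = LINE_RE.match(line.strip())
    if m:
        name, labels, value = m.groups()
        counters[(name, labels)] += float(value)

def render():
    """Render the aggregate in Prometheus text-exposition style."""
    return "\n".join(
        f'{name}{{{labels}}} {value}'
        for (name, labels), value in sorted(counters.items())
    )

# Usage: two increments for the same series are summed into one counter.
for l in ['metric requests_total{job="f1"} 1',
          'metric requests_total{job="f1"} 1',
          'metric errors_total{job="f1"} 1']:
    ingest(l)
```

The gateway, not the short-lived container, owns the cumulative state, which is what makes counters workable in this model.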
>>>>
>>>> I agree with Richi that something that could push to a Prometheus
>>>> Agent-like target that supports OpenMetrics push could be a good
>>>> middle ground with the right support / guidelines:
>>>> - A way to specify multiple Prometheus Agent targets and quickly fail
>>>> over from one to another if one is not responding within $X ms (you
>>>> could imagine a 5 ms budget for each with at most 3 tried, introducing
>>>> at worst 15 ms of overhead when all are down in 3 local availability
>>>> zones, but in general this is a disaster case)
>>>> - A deduplication ability so that a retried push is not
>>>> double-counted; this might mean timestamping the metrics (so if
>>>> written twice, only the first record is kept, etc.)
>>>>
>>>> I think it should, like the Push Gateway, generally be a last-resort
>>>> kind of option with clear limitations, so that pull remains the clear
>>>> choice for anything but these environments.
>>>>
>>>> Is there any interest in discussing this on a call some time?
>>>>
>>>> Rob
>>>>
>>>> On Thu, Jun 24, 2021 at 5:09 PM Bjoern Rabenstein <[email protected]> wrote:
>>>>
>>>>> On 22.06.21 11:26, Tobias Schmidt wrote:
>>>>> >
>>>>> > Last night I was wondering if there are any other common interfaces
>>>>> > available in serverless environments and noticed that all products
>>>>> > by AWS (Lambda) and GCP (Functions, Run) at least provide the
>>>>> > option to handle log streams, sometimes even log files on disk. I'm
>>>>> > currently thinking about experimenting with an approach where
>>>>> > containers log metrics to stdout / some file, get picked up by the
>>>>> > serverless runtime and written to some log stream. Another service
>>>>> > "loggateway" (or otherwise named) would then stream the logs,
>>>>> > aggregate them and either expose them on the common /metrics
>>>>> > endpoint or push them with remote write right away to a Prometheus
>>>>> > instance hosted somewhere (like Grafana Cloud).
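Rob's first bullet above (multiple agent targets, a small per-attempt budget, at most three tries) could be sketched roughly like this; the function and target names are illustrative assumptions, and the push call is a stub where a real client would POST an OpenMetrics payload.

```python
# Sketch: try each Prometheus Agent-like target in turn with a small
# per-attempt timeout budget, so a full outage adds only bounded overhead.
def push_with_failover(targets, push_fn, budget_s=0.005, max_attempts=3):
    """Return the first target that accepts the push, or None.

    Worst case (all targets down) costs roughly budget_s * max_attempts,
    e.g. 3 x 5 ms = 15 ms across three local availability zones.
    """
    for target in targets[:max_attempts]:
        try:
            push_fn(target, timeout=budget_s)  # a real client would POST here
            return target
        except TimeoutError:
            continue  # target unresponsive within its budget; fail over
    return None  # disaster case: every target was down

# Usage with a stub push function where only "target-b" responds:
def stub_push(target, timeout):
    if target != "target-b":
        raise TimeoutError(target)

chosen = push_with_failover(["target-a", "target-b", "target-c"], stub_push)
```

Rob's second bullet (timestamped samples so a retried push is kept only once) would then apply on the receiver side, where the first record for a given series and timestamp wins.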
>>>>>
>>>>> Perhaps I'm missing something, but isn't that
>>>>> https://github.com/google/mtail ?
>>>>>
>>>>> --
>>>>> Björn Rabenstein
>>>>> [PGP-ID] 0x851C3DA17D748D03
>>>>> [email] [email protected]
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Prometheus Developers" group.
>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>> send an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/prometheus-developers/20210624210908.GB11559%40jahnn
>>>>> .

