Re: [prometheus-developers] Requirements / Best Practices to use Prometheus Metrics for Serverless environments

Tobias Schmidt Tue, 22 Jun 2021 07:32:32 -0700

On Tue, Jun 22, 2021 at 12:09 PM Richard Hartmann <
[email protected]> wrote:


> This has come up in the context of OM, OTel, and TAG Observability. My
> own thinking largely mirrors beorn's & grobie's: In a perfect world
> the orchestration layer has all the information and interfaces
> required, and billing knows about the required datapaths. NB:
> Monitoring usually has higher speed and lower reliability requirements
> than billing. Still, for doability, lock-in, convenience, and velocity
> reasons, it's enticing to bypass the ideal solution and do something
> that works-ish now. If someone incurs ~100% overhead for monitoring
> lightweight functions but gets their job done, they are still
> getting their job done and can optimize later if they so choose.
>
> Pushing might appear hamfisted here, and arguably is, but it's largely
> under the control of the dev; as such, they can do it with less
> coordination. This might get us near to using the Prometheus Agent as
> a Collector to reduce latency and blast radius. Far from ideal, but...
>
> An in-between would be what grobie said: To speak in Prometheus terms,
> the orchestrator is node_exporter, the serverless functions write out
> something which the textfile collector can ingest.
>

There is not much overlap between the node_exporter and the functionality
needed here. It would need something which can read common log streams from
major cloud providers / serverless runtimes, aggregate the logs, and then
expose them. Only the last part is somewhat available in the node_exporter
and the rest doesn't really make sense there. Google's mtail would be a bit
closer conceptually, but as we have full control over the clients and wire
format there is no need for a full-fledged log parsing engine, and the
cloud provider log reading part is still missing.
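
To make the shape of such a component a bit more concrete, here is a rough
Python sketch of the "aggregate the logs, and then expose them" part only. It
assumes a made-up line format (`METRIC <name> [k=v,...] <increment>`) that the
functions would print to their log stream; that format and the names
`parse_line`, `aggregate`, and `render` are hypothetical, not an agreed wire
format or an existing tool. Reading the provider log stream is left out
entirely, and label values are not escaped.

```python
from collections import defaultdict

PREFIX = "METRIC"  # hypothetical marker for metric lines in the log stream


def parse_line(line):
    """Return (name, labels, value) for a metric line, or None for anything else."""
    parts = line.split()
    if len(parts) not in (3, 4) or parts[0] != PREFIX:
        return None
    name, raw_value = parts[1], parts[-1]
    labels = ()
    if len(parts) == 4:
        pairs = [tuple(p.split("=", 1)) for p in parts[2].split(",")]
        if any(len(pair) != 2 for pair in pairs):
            return None
        # Sort so the same label set always maps to the same series key.
        labels = tuple(sorted(pairs))
    try:
        return name, labels, float(raw_value)
    except ValueError:
        return None


def aggregate(lines):
    """Sum counter increments per (name, labels) series, skipping unrelated log lines."""
    totals = defaultdict(float)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            name, labels, value = parsed
            totals[(name, labels)] += value
    return dict(totals)


def render(totals):
    """Render the aggregated counters in the Prometheus text exposition format."""
    out = []
    seen = set()
    for (name, labels), value in sorted(totals.items()):
        if name not in seen:
            out.append(f"# TYPE {name} counter")
            seen.add(name)
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        series = f"{name}{{{label_str}}}" if labels else name
        out.append(f"{series} {value}")
    return "\n".join(out) + "\n"
```

The output of `render` could then be served on a scrape endpoint or written to
a file for the textfile collector; which of the two makes more sense depends on
where the aggregator ends up running.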

> OpenMetrics deliberately supports push, but this approach creates
> issues with `up` and staleness handling. OTel is currently facing
> similar issues, maybe there's room for cooperation. Also see
>
> https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#supporting-target-metadata-in-both-push-based-and-pull-based-systems
> and
> https://docs.google.com/document/d/1hn-u6WKLHxIsqYT1_u6eh94lyQeXrFaAouMshJcQFXs/edit#heading=h.e4p9f543e7i2
>
>
> I strongly believe that we should be particular about the wire format;
> in a future in which orchestrators have a collector component, it
> would be nice to be able to simply expose the metrics for pulling or
> use PRW code and wire format.
>
>
> Best,
> Richard
>
> --
> You received this message because you are subscribed to the Google Groups
> "Prometheus Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/CAD77%2BgSiKWVrnoGydB2hBVkeX87NejCht93JPVvaY%2BQ-Y%3DGvoQ%40mail.gmail.com
> .
>

