Oof, just noticed that the images do not load in some email clients 😬

The proposal can also be seen at this Pull Request
<https://github.com/prometheus-operator/prometheus-operator/pull/5497>,
with the images :)

On Fri, Apr 14, 2023 at 10:00, Arthur Silva Sens <
[email protected]> wrote:

> Hi everybody, I'm Arthur from the Prometheus-Operator team.
>
> We've recently added support for running Prometheus in Agent mode with
> Prometheus-Operator and we've started to brainstorm new Deployment Patterns
> that could be explored with the Agent, e.g. as DaemonSets or sidecars.
>
> Right now I'm drafting what things could look like if Prometheus Agent
> runs as a Pod sidecar, and I would love to hear the community's opinion.
> I'm particularly interested in whether there is appetite for such a
> deployment pattern and whether you can spot new failure modes with that
> approach.
>
> Here is the proposal:
>
> Agent Deployment Pattern: Sidecar Injection
>
> Summary
>
> With Prometheus-Operator finally supporting Prometheus in Agent mode, we
> can start thinking about different deployment patterns that can be
> explored with this minimal container. This document continues the work
> started by the original Prometheus Agent design document
> <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md>,
> focusing on how Prometheus-Operator can deploy PrometheusAgents as
> sidecars running alongside the pods that a user wants to monitor.
>
> Background
>
> At the time of writing, Prometheus-Operator can deploy Prometheus in Agent
> mode, but only using a pattern similar to the original implementation of
> Prometheus Server: StatefulSets. The original design document
> <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md>
> for Prometheus Agent already mentions that different deployment patterns
> are desired. However, to speed up the initial implementation, it was
> decided to reuse the existing logic and start with the Agent running as
> StatefulSets.
>
> Also to speed up implementation, this document focuses on only one new
> deployment pattern: Sidecar Injection.
>
> In the traditional deployment model, we have a single Prometheus (or an HA
> setup) per cluster or namespace, responsible for scraping all containers
> under its scope. Prometheus-Operator relies on ServiceMonitor, PodMonitor,
> and Probe CRs to configure Prometheus, which eventually uses Kubernetes
> service discovery to find the endpoints that need to be scraped.
>
> Depending on the cluster's scale and how often Prometheus hits the
> Kubernetes API, Prometheus service discovery can significantly increase
> the load on the API and affect the overall functionality of the cluster.
>
> Another problem is that one or more containers can be updated to a
> problematic version that causes a cardinality spike
> <https://grafana.com/blog/2022/02/15/what-are-cardinality-spikes-and-why-do-they-matter/>.
> Depending on the size of the spike, a single container could crash the
> monitoring system of the whole cluster.
>
> [image: Traditional Deployment Pattern]
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/traditional-deployment-pattern.png>
>
> Proposal
>
> This document proposes a new deployment model where Prometheus-Operator
> injects a Prometheus Agent sidecar container (along with the Prometheus
> config reloader) into pods that need to be scraped. With a sidecar, we
> tackle both problems mentioned above:
>
>    - Service discovery no longer puts load on the Kubernetes API, because
>    it is not needed anymore. Prometheus scrapes containers from the same
>    pod through their shared network interface, and the scrape
>    configuration can be declared via pod annotations.
>    - A sudden cardinality spike will not affect the whole monitoring
>    system. In the worst case, it will cause a single pod to fail.
>
> A common pattern with Prometheus's Kubernetes service discovery is to use
> annotations to declaratively tell Prometheus which endpoints need to be
> scraped
> <https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/>.
> A code search on GitHub
> <https://github.com/search?q=prometheus.io%2Fscrape%3A+%22true%22&type=code>
> for prometheus.io/scrape: "true" shows that this approach already has good
> adoption. To avoid conflicting with the commonly used annotation, we can
> start with our own, following a very similar approach:
>
>   apiVersion: v1
>   kind: Pod
>   metadata:
>     name: example
>     annotations:
>       prometheus.operator.io/scrape: "true"
>       prometheus.operator.io/path: "/metrics"
>       prometheus.operator.io/port: "8080"
>       prometheus.operator.io/scrape-interval: "60s"
>   spec:
>     ...
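
To make the annotation contract above concrete, here is a minimal sketch of how the annotations could be turned into scrape settings. Only the annotation keys come from the proposal; the function name, the ScrapeTarget type, and the fallback defaults are assumptions made for illustration and are not part of Prometheus-Operator.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Annotation prefix taken from the proposal; everything else is hypothetical.
PREFIX = "prometheus.operator.io/"


@dataclass(frozen=True)
class ScrapeTarget:
    path: str
    port: int
    scrape_interval: str


def scrape_target_from_annotations(
    annotations: Mapping[str, str],
) -> Optional[ScrapeTarget]:
    """Return the scrape settings for a pod, or None if the pod did not opt in."""
    if annotations.get(PREFIX + "scrape") != "true":
        return None
    # The defaults below are assumptions; the proposal does not define any.
    path = annotations.get(PREFIX + "path", "/metrics")
    port = int(annotations.get(PREFIX + "port", "8080"))
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    interval = annotations.get(PREFIX + "scrape-interval", "60s")
    return ScrapeTarget(path=path, port=port, scrape_interval=interval)


pod_annotations = {
    "prometheus.operator.io/scrape": "true",
    "prometheus.operator.io/path": "/metrics",
    "prometheus.operator.io/port": "8080",
    "prometheus.operator.io/scrape-interval": "60s",
}
target = scrape_target_from_annotations(pod_annotations)
print(target)
```

A pod without the scrape annotation yields None, which in this sketch would simply mean that no sidecar is injected.
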
>
> The existing PrometheusAgent CRD would be extended with a new field called
> mode, which accepts one of two values (for now): statefulset or sidecar,
> with statefulset as the default. If mode is set to sidecar,
> Prometheus-Operator won't deploy any Prometheus agents initially. Instead,
> it will watch for Pod updates and inject the Prometheus Agent as a sidecar
> into pods that carry the predetermined annotations.
>
> In addition to selecting the deployment model, the Agent CR will be the
> source of truth for the remote-write configuration, such as URL and
> authentication. A change to the remote-write configuration would still
> require a hot reload of potentially millions of agent sidecar containers,
> but keeping the remote-write configuration out of pod annotations at least
> avoids having to update the Pod manifests as well.
>
> If different sets of pods require different remote-write configurations,
> then multiple PrometheusAgent CRs are needed. This means that the pod also
> needs to specify which Agent CR will inject the sidecar:
>
>   apiVersion: v1
>   kind: Pod
>   metadata:
>     name: example
>     annotations:
>       prometheus.operator.io/scrape: "true"
>       prometheus.operator.io/path: "/metrics"
>       prometheus.operator.io/port: "8080"
>       prometheus.operator.io/scrape-interval: "60s"
>       prometheus.operator.io/agent-selector: "monitoring/agent-example"
>   spec:
>     ...
>   ---
>   apiVersion: monitoring.coreos.com/v1alpha1
>   kind: PrometheusAgent
>   metadata:
>     name: agent-example
>     namespace: monitoring
>   spec:
>     mode: sidecar
>     remoteWrite:
>       - url: https://example.com
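
As a rough illustration of the lookup implied by the agent-selector annotation, the sketch below splits the "namespace/name" value and resolves it against a set of known Agent CRs. The AgentSpec type, the function names, and the in-memory dictionary are hypothetical stand-ins for whatever the operator would keep in its cache. The strict "namespace/name" format is also an assumption based on the single example in the proposal.

```python
from typing import Mapping, NamedTuple, Optional, Tuple

SELECTOR_ANNOTATION = "prometheus.operator.io/agent-selector"


class AgentSpec(NamedTuple):
    """Hypothetical, trimmed-down view of a PrometheusAgent CR."""
    mode: str
    remote_write_urls: Tuple[str, ...]


def parse_agent_selector(value: str) -> Tuple[str, str]:
    """Split a 'namespace/name' selector into its two parts."""
    namespace, sep, name = value.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected 'namespace/name', got {value!r}")
    return namespace, name


def agent_for_pod(
    annotations: Mapping[str, str],
    agents: Mapping[Tuple[str, str], AgentSpec],
) -> Optional[AgentSpec]:
    """Return the sidecar-mode Agent CR a pod points at, if any."""
    selector = annotations.get(SELECTOR_ANNOTATION)
    if selector is None:
        return None
    agent = agents.get(parse_agent_selector(selector))
    # Only Agent CRs in sidecar mode are eligible for injection.
    if agent is None or agent.mode != "sidecar":
        return None
    return agent


known_agents = {
    ("monitoring", "agent-example"): AgentSpec("sidecar", ("https://example.com",)),
}
pod = {SELECTOR_ANNOTATION: "monitoring/agent-example"}
print(agent_for_pod(pod, known_agents))
```
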
>
> With a visualization:
>
> [image: Sidecar Deployment Pattern]
> <https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/sidecar-deployment-pattern.png>
>
> What to do with ServiceMonitor, PodMonitor, and Probe selectors?
>
> With the sidecar approach, our goal is to scale Prometheus horizontally
> while avoiding impact on the Kubernetes API. It wouldn't make sense for a
> sidecar to also scrape metrics from other pods.
>
> If mode is set to sidecar, a validating webhook would reject
> PrometheusAgent CRs that are created or updated with any of the following
> fields:
>
>    - serviceMonitorSelector
>    - serviceMonitorNamespaceSelector
>    - podMonitorSelector
>    - podMonitorNamespaceSelector
>    - probeSelector
>    - probeNamespaceSelector
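
The validation rule above could look roughly like the following. This is only a sketch of the check itself, operating on a plain dictionary that mirrors the CR spec. The function name and error messages are made up for illustration, and a real validating webhook would wrap this logic in an admission review handler.

```python
from typing import Any, List, Mapping

# Field names taken from the list in the proposal.
FORBIDDEN_IN_SIDECAR_MODE = (
    "serviceMonitorSelector",
    "serviceMonitorNamespaceSelector",
    "podMonitorSelector",
    "podMonitorNamespaceSelector",
    "probeSelector",
    "probeNamespaceSelector",
)

ALLOWED_MODES = ("statefulset", "sidecar")


def validate_agent_spec(spec: Mapping[str, Any]) -> List[str]:
    """Return a list of validation errors; an empty list means the spec is accepted."""
    mode = spec.get("mode", "statefulset")  # statefulset is the proposed default
    if mode not in ALLOWED_MODES:
        return [f"spec.mode: unsupported value {mode!r}"]
    if mode != "sidecar":
        return []
    # An empty selector ({}) still counts as set, so test for presence only.
    return [
        f"spec.{field}: must not be set when mode is sidecar"
        for field in FORBIDDEN_IN_SIDECAR_MODE
        if spec.get(field) is not None
    ]


print(validate_agent_spec({"mode": "sidecar", "podMonitorSelector": {}}))
```
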
>
> Caveats
>
> Config Hot Reload
>
> There will now be two ways to change the Prometheus configuration: 1) by
> changing annotations on the pod, and 2) by changing the remote-write field
> in the PrometheusAgent CR. The first only triggers a hot reload for the
> affected pod, but the second has the potential to trigger millions of hot
> reloads, depending on the scale of the cluster.
>
> Although the efficiency of the config-reloader has not been studied, this
> particular container might become problematic in huge-scale environments.
>
> WAL not optimized for small environments
>
> The Prometheus write-ahead log (WAL) is stored as a sequence of numbered
> files of 128MiB each by default. This means that, by default, at least
> 128MiB is needed to run Prometheus Agent, even if we ignore every other
> part of Prometheus. With a sidecar, we are optimizing for horizontal scale,
> and 128MiB might be much more than necessary to store metrics from a
> single Pod.
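
To put the 128MiB figure in perspective, here is a back-of-the-envelope calculation. It takes the reasoning above at face value (one default-sized WAL segment per sidecar) and multiplies it by the number of instrumented pods. The pod counts are arbitrary examples, not measurements.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB
DEFAULT_WAL_SEGMENT_BYTES = 128 * MIB  # default segment size quoted above


def wal_floor_bytes(pods: int, segment_bytes: int = DEFAULT_WAL_SEGMENT_BYTES) -> int:
    """Cluster-wide storage floor if every sidecar holds one full WAL segment."""
    if pods < 0:
        raise ValueError("pods must be non-negative")
    return pods * segment_bytes


# Arbitrary example cluster sizes, chosen only to show how the floor scales.
for pods in (1, 1_000, 10_000):
    print(f"{pods:>6} pods -> {wal_floor_bytes(pods) / GIB:,.1f} GiB")
```
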
>
> Lack of High-Availability setup
>
> Given that Prometheus is not optimized for very small environments,
> injecting 2 sidecars per Pod sounds like a big waste of resources.
> However, with only 1 sidecar, HA Prometheus won't be an option.
>
> That said, HA seems more critical in the traditional deployment pattern
> than in the sidecar approach: if Prometheus fails in the former, we lose
> the monitoring stack for the whole cluster, while in the latter we only
> lose metrics from a single pod.
>
> References
>
>    - [1] https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md
>    - [2] https://opentelemetry.io/docs/collector/scaling/
>    - [3] https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/
>    - [4] https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Prometheus Developers" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/prometheus-developers/JHmnU8IVGMc/unsubscribe
> .
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/prometheus-developers/d3e4d7c7-d79e-494a-bdcc-32ce2d04a88dn%40googlegroups.com
> <https://groups.google.com/d/msgid/prometheus-developers/d3e4d7c7-d79e-494a-bdcc-32ce2d04a88dn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
