Oof, just noticed that the images do not load in some email clients 😬 The proposal can also be seen at this Pull Request <https://github.com/prometheus-operator/prometheus-operator/pull/5497>, with the images :)
On Fri, 14 Apr 2023 at 10:00, Arthur Silva Sens <[email protected]> wrote:

Hi everybody, I'm Arthur from the Prometheus-Operator team.

We've recently added support for running Prometheus in Agent mode with Prometheus-Operator, and we've started to brainstorm new deployment patterns that could be explored with the Agent, e.g. as DaemonSets or sidecars.

At this point, I'm drafting how things could look if Prometheus Agent runs as a pod sidecar, and I would love to hear the community's opinion on it. I'm particularly interested to know whether there is an appetite in the community for such a deployment pattern, and whether you see new failure modes with this approach.

Here is the proposal:

Agent Deployment Pattern: Sidecar Injection

Summary
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#summary>

With Prometheus-Operator finally supporting running Prometheus in Agent mode, we can start thinking about different deployment patterns that can be explored with this minimal container. This document aims to continue the work started by the Prometheus Agent design document <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md>, focusing on how Prometheus-Operator can deploy Prometheus Agents as sidecars running alongside the pods that a user wants to monitor.

Background
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#background>

At the time this document was written, Prometheus-Operator could deploy Prometheus in Agent mode, but only using a pattern similar to the original implementation of Prometheus Server: StatefulSets.
The original design document for Prometheus Agent <https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md> already mentions that different deployment patterns are desired; however, to speed up the initial implementation, it was decided to reuse the existing logic and start with the Agent running as StatefulSets.

Also for the sake of a faster implementation, this document won't cover several new deployment patterns, but only one: sidecar injection.

In the traditional deployment model, we have a single Prometheus (or an HA setup) per cluster or namespace, responsible for scraping all containers in its scope. Prometheus-Operator relies on the ServiceMonitor, PodMonitor, and Probe CRs to configure Prometheus, which in turn uses Kubernetes service discovery to find the endpoints that need to be scraped.

Depending on the cluster's scale and on how often Prometheus hits the Kubernetes API, Prometheus service discovery can significantly increase the load on the API and affect the overall functionality of the cluster.

Another problem is that one or more containers can be updated to a problematic version that causes a cardinality spike <https://grafana.com/blog/2022/02/15/what-are-cardinality-spikes-and-why-do-they-matter/>. Depending on the size of the spike, a single container could single-handedly crash the monitoring system of the whole cluster.

[image: Traditional Deployment Pattern]
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/traditional-deployment-pattern.png>
Proposal
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#proposal>

This document proposes a new deployment model in which Prometheus-Operator injects a Prometheus Agent sidecar container (plus the Prometheus config reloader) into pods that need to be scraped. With a sidecar, we tackle both problems mentioned above:

- The load on the Kubernetes API disappears, since service discovery is no longer needed. Prometheus scrapes containers in the same pod through their shared network interface, and the scrape configuration can be declared via pod annotations.
- A sudden cardinality spike will not affect the whole monitoring system. In the worst case, it takes down a single pod.

A common pattern used with Prometheus's Kubernetes service discovery is to use annotations to declaratively tell Prometheus which endpoints need to be scraped <https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/>. From a GitHub code search for prometheus.io/scrape: "true" <https://github.com/search?q=prometheus.io%2Fscrape%3A+%22true%22&type=code>, we can tell that this approach already has good adoption. To avoid conflicting with the commonly used annotations, we can start with our own, very similar set:

apiVersion: v1
kind: Pod
metadata:
  name: example
  annotations:
    prometheus.operator.io/scrape: "true"
    prometheus.operator.io/path: "/metrics"
    prometheus.operator.io/port: "8080"
    prometheus.operator.io/scrape-interval: "60s"
spec:
  ...

The existing PrometheusAgent CRD would be extended with a new field called mode, which can take one of two values (for now), [statefulset, sidecar], with statefulset as the default. If mode is set to sidecar, Prometheus-Operator won't deploy any Prometheus Agents initially.
Instead, it will watch for Pod updates and inject the Prometheus Agent as a sidecar into pods that carry the pre-determined annotations.

Besides selecting the deployment model, the Agent CR remains the source of truth for the remote-write configuration, such as the URL and authentication. A change to the remote-write configuration would still require a hot reload of potentially millions of agent sidecar containers, but by keeping the remote-write configuration out of pod annotations we at least avoid requiring the Pod manifests to be updated as well.

If different sets of pods require different remote-write configurations, multiple PrometheusAgent CRs are needed. In that case, a pod also needs to specify which Agent CR injects its sidecar:

apiVersion: v1
kind: Pod
metadata:
  name: example
  annotations:
    prometheus.operator.io/scrape: "true"
    prometheus.operator.io/path: "/metrics"
    prometheus.operator.io/port: "8080"
    prometheus.operator.io/scrape-interval: "60s"
    prometheus.operator.io/agent-selector: "monitoring/agent-example"
spec:
  ...
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: PrometheusAgent
metadata:
  name: agent-example
  namespace: monitoring
spec:
  mode: sidecar
  remoteWrite:
  - url: https://example.com

With a visualization:

[image: Sidecar Deployment Pattern]
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/assets/agent-deployment-pattern-sidecar/sidecar-deployment-pattern.png>

What to do with ServiceMonitor, PodMonitor, and Probe selectors?
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#what-to-do-with-servicemonitor-podmonitor-and-probe-selectors>

With the sidecar approach, our goal is to scale Prometheus horizontally while avoiding impact on the Kubernetes API.
It wouldn't make sense for a sidecar to also scrape metrics from other pods.

If mode is set to sidecar, a validating webhook would forbid PrometheusAgent CRs from being created or updated with the following fields:

- serviceMonitorSelector
- serviceMonitorNamespaceSelector
- podMonitorSelector
- podMonitorNamespaceSelector
- probeSelector
- probeNamespaceSelector

Caveats
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#caveats>

Config Hot Reload
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#config-hot-reload>

There will now be two ways to change the Prometheus configuration: 1) by changing annotations on the pod, and 2) by changing the remote-write field in the PrometheusAgent CR. The first only triggers a hot reload for the pod involved, but the latter can trigger millions of hot reloads, depending on the scale of the cluster.

While there is no research on the config-reloader's efficiency, this particular container might become problematic in huge-scale environments.

WAL not optimized for small environments
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#wal-not-optimized-for-small-environments>

The Prometheus write-ahead log (WAL) is stored as a sequence of numbered files of 128MiB each by default. This means that, by default, at least 128MiB of storage is needed to run Prometheus Agent, even if we ignore every other part of Prometheus. With a sidecar, we're optimizing for horizontal scale, and 128MiB might be far more than necessary to store metrics from a single pod.
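The validating-webhook rule described above (rejecting the six selector fields when mode is sidecar) could be sketched as a plain validation function. The agentSpec struct here is a trimmed-down, hypothetical stand-in for the PrometheusAgent spec, not the operator's real type; only the fields relevant to the rule appear.

```go
package main

import "fmt"

// agentSpec is a hypothetical, minimal stand-in for the PrometheusAgent spec.
// Pointers model "field is set" vs. "field is absent".
type agentSpec struct {
	Mode                            string // "statefulset" (default) or "sidecar"
	ServiceMonitorSelector          *struct{}
	ServiceMonitorNamespaceSelector *struct{}
	PodMonitorSelector              *struct{}
	PodMonitorNamespaceSelector     *struct{}
	ProbeSelector                   *struct{}
	ProbeNamespaceSelector          *struct{}
}

// validateSidecarMode mirrors the proposed webhook rule: in sidecar mode the
// agent must not select ServiceMonitors, PodMonitors, or Probes.
func validateSidecarMode(s agentSpec) error {
	if s.Mode != "sidecar" {
		return nil
	}
	checks := []struct {
		name string
		set  bool
	}{
		{"serviceMonitorSelector", s.ServiceMonitorSelector != nil},
		{"serviceMonitorNamespaceSelector", s.ServiceMonitorNamespaceSelector != nil},
		{"podMonitorSelector", s.PodMonitorSelector != nil},
		{"podMonitorNamespaceSelector", s.PodMonitorNamespaceSelector != nil},
		{"probeSelector", s.ProbeSelector != nil},
		{"probeNamespaceSelector", s.ProbeNamespaceSelector != nil},
	}
	for _, c := range checks {
		if c.set {
			return fmt.Errorf("%s must not be set when mode is sidecar", c.name)
		}
	}
	return nil
}

func main() {
	// Rejected: sidecar mode with a selector set.
	fmt.Println(validateSidecarMode(agentSpec{Mode: "sidecar", PodMonitorSelector: &struct{}{}}))
	// Accepted: statefulset mode keeps today's behavior.
	fmt.Println(validateSidecarMode(agentSpec{Mode: "statefulset", PodMonitorSelector: &struct{}{}}))
}
```

In a real admission webhook this check would run on CREATE and UPDATE of PrometheusAgent objects and return the error in the admission response.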
Lack of High-Availability setup
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#lack-of-high-availability-setup>

Given that Prometheus is not optimized for very small environments, injecting two sidecars per pod sounds like a big waste of resources. However, with only one sidecar, an HA Prometheus setup won't be an option.

That said, an HA Prometheus seems more critical in the traditional deployment pattern than in the sidecar approach: when Prometheus fails in the former, we lose the monitoring stack for the whole cluster, while in the latter we only lose metrics from a single pod.

References
<https://github.com/prometheus-operator/prometheus-operator/blob/803a331736a6b05274bf07862c6550d053735a19/Documentation/designs/agent-deployment-pattern-sidecar.md#references>

- [1] https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/designs/prometheus-agent.md
- [2] https://opentelemetry.io/docs/collector/scaling/
- [3] https://www.acagroup.be/en/blog/auto-discovery-of-kubernetes-endpoint-services-prometheus/
- [4] https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint/

