xBis7 opened a new pull request, #64902:
URL: https://github.com/apache/airflow/pull/64902

   This patch adds `otel-collector`, `jaeger`, `grafana` and `prometheus` 
services to the Helm chart. Unlike `statsd`, which always starts, these 
services are disabled by default, and the user has to enable them explicitly.
   
   I've added two separate flags for enabling traces and metrics. When the 
user enables the OTel metrics, `statsd` is disabled in the Airflow config so 
that OTel is used instead.
   
   ## Testing
   
   I tested the changes manually as follows:
   
   ```
   > breeze k8s setup-env
   
   > breeze k8s create-cluster
   
   > breeze k8s configure-cluster
   
   > breeze k8s build-k8s-image
   
   > breeze k8s upload-k8s-image
   
   > breeze k8s deploy-airflow
   ```
   
   Then I connected to the breeze k8s shell and checked that the new services 
had not been started:
   ```
   > breeze k8s shell
   
   (kind-airflow-python-3.10-v1.30.13:KubernetesExecutor)> kubectl get pods -n airflow
   NAME                                     READY   STATUS    RESTARTS   AGE
   airflow-api-server-6447c4c878-cp2xm      1/1     Running   0          12m
   airflow-dag-processor-857fb7df66-8v8lr   2/2     Running   0          12m
   airflow-postgresql-0                     1/1     Running   0          12m
   airflow-scheduler-b4b75cddd-6xp68        2/2     Running   0          12m
   airflow-statsd-694c7b7497-smmtw          1/1     Running   0          12m
   airflow-triggerer-0                      2/2     Running   0          12m
   (kind-airflow-python-3.10-v1.30.13:KubernetesExecutor)> exit
   ```
   
   Next, I redeployed Airflow with OTel traces and metrics enabled:
   ```
   > breeze k8s deploy-airflow --upgrade --set otelCollector.enabled=true --set otelCollector.tracesEnabled=true --set otelCollector.metricsEnabled=true --set prometheus.enabled=true --set grafana.enabled=true --set jaeger.enabled=true
   ```
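
The same overrides can also be kept in a values file instead of a chain of `--set` flags. This is a minimal sketch: the keys are the ones used in the command above, while the file name `otel-values.yaml` is a hypothetical choice.

```shell
# Write the overrides from the --set flags above into a values file.
# The file name is hypothetical; the keys mirror the command above.
cat > otel-values.yaml <<'EOF'
otelCollector:
  enabled: true
  tracesEnabled: true
  metricsEnabled: true
prometheus:
  enabled: true
grafana:
  enabled: true
jaeger:
  enabled: true
EOF
```

With plain Helm, a file like this is passed to `helm upgrade` through `--values`. Whether `breeze k8s deploy-airflow` forwards that option the same way it forwards `--set` has not been verified here.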
   
   The new services were now running:
   ```
   Entering interactive k8s shell.
   
   (kind-airflow-python-3.10-v1.30.13:KubernetesExecutor)> kubectl get pods -n airflow
   NAME                                      READY   STATUS    RESTARTS   AGE
   airflow-api-server-65c764b69d-c6hck       1/1     Running   0          88s
   airflow-dag-processor-764d594dc-5rn7h     2/2     Running   0          88s
   airflow-grafana-6995b86fbf-4ggl8          1/1     Running   0          88s
   airflow-jaeger-c6fdd9f5b-sm5wr            1/1     Running   0          88s
   airflow-otel-collector-566b6f84bf-9xkzq   1/1     Running   0          88s
   airflow-postgresql-0                      1/1     Running   0          14m
   airflow-prometheus-7bd946496d-wvtnj       1/1     Running   0          88s
   airflow-scheduler-6b4bf7967b-hksvw        2/2     Running   0          87s
   airflow-statsd-694c7b7497-smmtw           1/1     Running   0          14m
   airflow-triggerer-0                       2/2     Running   0          80s
   (kind-airflow-python-3.10-v1.30.13:KubernetesExecutor)> exit
   ```
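
The visual check of the pod listing can be scripted. The following is only a sketch: `check_pods_running` is a hypothetical helper, and it assumes pod names start with `airflow-<component>-`, as in the listing above.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and verify
# that every named component has at least one pod in the Running state.
# Assumes pod names start with "airflow-<component>-", as in the listing above.
check_pods_running() {
  listing=$(cat)
  status=0
  for component in "$@"; do
    if ! printf '%s\n' "$listing" |
      awk -v prefix="airflow-$component-" \
        'index($1, prefix) == 1 && $3 == "Running" { found = 1 } END { exit !found }'; then
      echo "missing or not running: $component"
      status=1
    fi
  done
  return "$status"
}

# Usage inside the breeze k8s shell:
# kubectl get pods -n airflow --no-headers | check_pods_running otel-collector jaeger prometheus grafana
```

The function returns a non-zero status and names each component that has no `Running` pod, so it can be used as a gate in a script.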
   
   I triggered the `example_simplest_dag`:
   
   <img width="2690" height="1370" alt="image" src="https://github.com/user-attachments/assets/e7c0ec56-a3c5-4f43-9353-b6b7030b3552" />
   
   Jaeger traces
   
   <img width="2686" height="898" alt="image" src="https://github.com/user-attachments/assets/bcc816d6-1ea6-4931-9fd7-6899c50fbb61" />
   
   Prometheus (the `instance` attribute is `airflow-otel-collector`)
   
   <img width="2692" height="912" alt="image" src="https://github.com/user-attachments/assets/b1d54cbc-a67a-42e5-93ed-e4de1f8675c5" />
   
   Grafana dashboard
   
   <img width="2692" height="904" alt="image" src="https://github.com/user-attachments/assets/ccaf0ee3-8397-4795-ab61-5572807ffffd" />
   
   For Grafana, I reused the dashboard that already exists under breeze 
docker. It is the exact same JSON file.
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [X] Yes (please specify the tool below)
   Claude Sonnet 4.6 Extended
   

