[DISCUSS] AIP-101 - Airflow AI Assistant - Phase 1 (Read-only assistance)

Shahar Epstein Fri, 03 Apr 2026 06:03:10 -0700

Hello everyone,

Since I first opened the thread discussing the MCP server, I've been thinking
for a long while about how we can practically bring AI-assisted debugging and
operational insights directly into Airflow. As orchestration environments grow
more complex, the cost of troubleshooting (navigating across Dag code, task
instances, scheduler logs, and configurations) translates directly into lost
on-call time and delayed pipelines.

Today, organizations that want AI-assisted debugging are forced to build
custom, ad-hoc integrations or rely on external paid solutions. This leads to
fragmented user experiences, duplicated effort, and most critically,
inconsistent security controls that risk exposing sensitive metadata or
bypassing Airflow's native Role-Based Access Control (RBAC).

I think we can do better, so I am proposing AIP-101: Airflow AI Assistant -
Phase 1 (Read-only assistance). This AIP introduces an official, opt-in plugin
that provides a conversational UI directly within Airflow to answer users'
questions about their instances, explain errors, and help troubleshoot failures.

To ensure this is done safely and securely, Phase 1 is strictly read-only. The
assistant does not modify Airflow state, nor does it operate autonomously.
Instead, it relies on the newly proposed Airflow MCP Server (AIP-91) as its
data-retrieval engine. By leveraging the MCP standard, the assistant guarantees
that its answers are grounded in live system state while strictly enforcing the
authenticated user's RBAC permissions so the AI never accesses data the user
cannot see.

tl;dr of the proposed implementation:

* Packaging: Delivered as an opt-in, standalone plugin package within the
  apache/airflow monorepo (with an independent release cycle).
* Frontend: A conversational UI embedded directly in the Airflow web interface.
* Backend: A FastAPI-based plugin backend utilizing pydantic-ai to safely
  orchestrate external LLM calls.
* Data Access: Relies entirely on the Airflow MCP Server (AIP-91) to fetch
  read-only state.
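
To make the read-only, permission-scoped data access described above more
concrete, here is a minimal sketch in plain Python. It is not code from the AIP
or the POC, and it uses neither the real MCP server nor pydantic-ai. All names
in it (User, ReadOnlyTool, ToolGateway, PermissionDenied, the tool names and
the permission strings) are hypothetical and only illustrate the idea that
every tool call is checked against the authenticated user's permissions before
any data is returned.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


class PermissionDenied(Exception):
    """Raised when a user asks for data they are not allowed to read."""


@dataclass(frozen=True)
class User:
    # Hypothetical stand-in for an authenticated Airflow user.
    name: str
    permissions: frozenset[str]


@dataclass(frozen=True)
class ReadOnlyTool:
    # Hypothetical stand-in for one read-only tool exposed to the assistant.
    name: str
    required_permission: str
    fetch: Callable[[], dict]


class ToolGateway:
    """Exposes only read-only tools and checks permissions on every call."""

    def __init__(self, tools: list[ReadOnlyTool]) -> None:
        self._tools = {tool.name: tool for tool in tools}

    def visible_tools(self, user: User) -> list[str]:
        # The model is only offered tools the user is allowed to use.
        return sorted(
            name
            for name, tool in self._tools.items()
            if tool.required_permission in user.permissions
        )

    def call(self, user: User, tool_name: str) -> dict:
        tool = self._tools.get(tool_name)
        # Unknown and forbidden tools are rejected in the same way.
        if tool is None or tool.required_permission not in user.permissions:
            raise PermissionDenied(f"{user.name} may not call {tool_name}")
        return tool.fetch()


gateway = ToolGateway([
    ReadOnlyTool("list_dag_runs", "can_read:dag_runs", lambda: {"runs": ["run_1"]}),
    ReadOnlyTool("get_config", "can_read:config", lambda: {"executor": "example"}),
])
viewer = User("viewer", frozenset({"can_read:dag_runs"}))
print(gateway.visible_tools(viewer))          # ['list_dag_runs']
print(gateway.call(viewer, "list_dag_runs"))  # {'runs': ['run_1']}
```

In this sketch the permission check lives in the gateway rather than in the
model prompt, so a request for a tool the user cannot see fails before any
data is fetched.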

---

The assistant is tightly coupled with the secure tool-calling execution
provided by the MCP server, which is covered in a separate AIP (AIP-91).
Please also see the ongoing discussion thread and AIP-91 itself:
https://lists.apache.org/thread/xgd66v6s7zf0xkvy3c7ysqvn4csgmw06
https://cwiki.apache.org/confluence/x/G4q3FQ

---

AIP-101 is available here:
https://cwiki.apache.org/confluence/x/8Ic8G

A quick warning before you read: the AIP is quite long! (sorry Jarek)
Because integrating AI into an orchestrator opens up a lot of potential
pitfalls, [ChatGPT and] I tried to be extremely thorough in covering all the
possible stuff that could go wrong :)
If you find a specific section to be overly detailed or repetitive, please
comment in the AIP and I'll try to handle it.

I've managed to build a very initial POC; screenshots are available in this
section:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406620144#AIP101AirflowAIAssistantPhase1(Readonlyassistance)-BehavioralModel

I would love to hear your thoughts. Please comment on the AIP and/or reply to
this thread.

Thank you,

Shahar
