potiuk commented on code in PR #64760:
URL: https://github.com/apache/airflow/pull/64760#discussion_r3045027425
##########
airflow-core/docs/security/security_model.rst:
##########
@@ -282,6 +312,309 @@ Access to all Dags
All Dag authors have access to all Dags in the Airflow deployment. This means
that they can view, modify,
and update any Dag without restrictions at any time.
+.. _jwt-authentication-and-workload-isolation:
+
+JWT authentication and workload isolation
+-----------------------------------------
+
+Airflow uses JWT (JSON Web Token) authentication for both its public REST API
and its internal
+Execution API. For a detailed description of the JWT authentication flows,
token structure, and
+configuration, see :doc:`/security/jwt_token_authentication`.
+
+Current isolation limitations
+.............................
+
+While Airflow 3 significantly improved the security model by preventing worker
task code from
+directly accessing the metadata database (workers now communicate exclusively
through the
+Execution API), **perfect isolation between Dag authors is not yet achieved**.
Dag author code
+potentially still executes with direct database access in the Dag File
Processor and Triggerer.
+
+**Software guards vs. intentional access**
+ Airflow implements software-level guards that prevent **accidental and
unintentional** direct database
+ access from Dag author code. The Dag File Processor removes the database
session and connection
+ information before forking child processes that parse Dag files, and worker
tasks use the Execution
+ API exclusively.
+
+ However, these software guards **do not protect against intentional,
malicious access**. The child
+ processes that parse Dag files and execute trigger code run as the **same
Unix user** as their parent
+ processes (the Dag File Processor manager and the Triggerer respectively).
Because of how POSIX
+ process isolation works, a child process running as the same user can
retrieve the parent's
+ credentials through several mechanisms:
+
+ * **Environment variables**: On Linux, any process can read
``/proc/<PID>/environ`` of another
+ process running as the same user — so database credentials passed via
environment variables
+ (e.g., ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN``) can be read from the
parent process.
+ * **Configuration files**: If configuration is stored in files, those files
must be readable by the
+ parent process and are therefore also readable by the child process
running as the same user.
+ * **Command-based secrets** (``_CMD`` suffix options): The child process
can execute the same
+ commands to retrieve secrets.
+ * **Secrets manager access**: If the parent uses a secrets backend, the
child can access the same
+ secrets manager using credentials available in the process environment or
filesystem.
+
+ This means that a deliberately malicious Dag author can retrieve database
credentials and gain
+ **full read/write access to the metadata database** — including the ability
to modify any Dag,
+ task instance, connection, or variable. The software guards address
accidental access (e.g., a Dag
+ author importing ``airflow.settings.Session`` out of habit from Airflow 2)
but do not prevent a
+ determined actor from circumventing them.
+
+ On workers, the isolation is stronger: worker processes do not receive
database credentials at all
+ (neither via environment variables nor configuration). Workers communicate
exclusively through the
+ Execution API using short-lived JWT tokens. A task running on a worker
genuinely cannot access the
+ metadata database directly — there are no credentials to retrieve.
+
+**Dag File Processor and Triggerer potentially bypass JWT authentication**
Review Comment:
Updated it this way:
> **Dag File Processor and Triggerer run user code with only soft protection
against bypassing JWT authentication**
The Dag File Processor and Triggerer processes that run user code
use an in-process transport to access the Execution API, which bypasses
JWT authentication.
Since these components execute user-submitted code (Dag files and trigger
code respectively),
a Dag author whose code runs in these components
has unrestricted access to all Execution API operations if they bypass
the soft protections,
including the ability to read any connection, variable, or XCom,
without needing a valid JWT token.
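
To illustrate why these protections are only "soft", here is a minimal Python sketch of the environment-variable mechanism listed in the diff above. It is not code from the PR or from Airflow: the helper names `parse_environ` and `read_parent_environ` are hypothetical. It assumes Linux with `/proc` mounted and a parent process running as the same user. Note that `/proc/<PID>/environ` reflects the environment the parent was started with, and the kernel may still refuse the read under stricter process-access settings.

```python
import os
from pathlib import Path


def parse_environ(raw: bytes) -> dict[str, str]:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<PID>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        # Skip empty trailing entries and anything that is not KEY=VALUE.
        if not entry or b"=" not in entry:
            continue
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_parent_environ() -> dict[str, str]:
    """Return the parent's initial environment, or {} if it cannot be read."""
    path = Path(f"/proc/{os.getppid()}/environ")
    try:
        return parse_environ(path.read_bytes())
    except OSError:
        # No /proc (non-Linux) or the kernel denied access.
        return {}


if __name__ == "__main__":
    parent_env = read_parent_environ()
    key = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
    # Report only whether the key is visible; never print the secret itself.
    print(f"{key} visible in parent environment: {key in parent_env}")
```

Clearing the variable in the child before it runs user code does not change what this read returns, because the data comes from the parent process rather than from the child's own environment.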