This is an automated email from the ASF dual-hosted git repository. potiuk pushed a commit to branch 3.2.0-docs in repository https://gitbox.apache.org/repos/asf/airflow.git
commit a644aad1a7ca08d1f2b3db2e023191531b9e81e2 Author: Jarek Potiuk <[email protected]> AuthorDate: Tue Apr 7 15:47:19 2026 +0200 Docs: Improve security docs wording, extract workload isolation, recommend DagBundle - Reword DFP/Triggerer descriptions to clarify software guards vs intentional bypass - Extract workload isolation section from jwt_token_authentication into workload.rst - Recommend Dag Bundle mechanism (GitDagBundle) for DAG synchronization - Fix typo in public-airflow-interface.rst and broken backtick in jwt_token_authentication.rst - Update cross-references between security docs --- AGENTS.md | 44 +------ .../production-deployment.rst | 6 +- airflow-core/docs/public-airflow-interface.rst | 4 +- .../docs/security/jwt_token_authentication.rst | 133 ++++++--------------- airflow-core/docs/security/security_model.rst | 67 ++++------- airflow-core/docs/security/workload.rst | 83 +++++++++++++ 6 files changed, 156 insertions(+), 181 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index ac347fd2e91..ce8e8384ec9 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -66,11 +66,11 @@ UV workspace monorepo. Key paths: ## Architecture Boundaries 1. Users author Dags with the Task SDK (`airflow.sdk`). -2. Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. It potentially has **direct database access** and uses an in-process Execution API transport that **potentially bypasses JWT authentication**. +2. Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. Software guards prevent individual parsing processes from accessing the database directly and enforce use of the Execution API, but these guards do not protect against intentional bypassing by malicious or misconfigured code. 3. Scheduler reads serialized Dags — **never runs user code** — and creates Dag runs / task instances. 4. 
Workers execute tasks via Task SDK and communicate with the API server through the Execution API — **never access the metadata DB directly**. Each task receives a short-lived JWT token scoped to its task instance ID.
5. API Server serves the React UI and handles all client-database interactions.
-6. Triggerer evaluates deferred tasks/sensors in separate processes. Like the Dag File Processor, it potentially has **direct database access** and uses an in-process Execution API transport that **potentially bypasses JWT authentication**.
+6. Triggerer evaluates deferred tasks/sensors in separate processes. As with the Dag File Processor, software guards steer it through the Execution API rather than direct database access, but these guards do not protect against intentional bypassing by malicious or misconfigured code.
7. Shared libraries that are symbolically linked to different Python distributions are in `shared` folder.
8. Airflow uses `uv workspace` feature to keep all the distributions sharing dependencies and venv
9. Each of the distributions should declare other needed distributions: `uv --project <FOLDER> sync` command acts on the selected project in the monorepo with only dependencies that it has
@@ -82,44 +82,8 @@ mind the following aspects of Airflow's security model. The authoritative refere
[`airflow-core/docs/security/security_model.rst`](airflow-core/docs/security/security_model.rst) and
[`airflow-core/docs/security/jwt_token_authentication.rst`](airflow-core/docs/security/jwt_token_authentication.rst).
-**The following are intentional design choices, not security vulnerabilities:**
-
-- **Dag File Processor and Triggerer potentially bypass JWT authentication.** They use
-  `InProcessExecutionAPI` which overrides the JWT bearer dependency to always allow access. This
-  is by design — these components run within trusted infrastructure and potentially need direct
-  database access for their core operations (storing serialized Dags, managing trigger state).
-- **Dag File Processor and Triggerer potentially have direct metadata database access.** - User-submitted code (Dag files, trigger code) executes in these components and can potentially - access the database. This is a known limitation documented in the security model, not an - undiscovered vulnerability. -- **Worker Execution API tokens grant access to shared resources.** While `ti:self` scope prevents - cross-task state manipulation, connections, variables, and XComs are accessible to all tasks. - This is the current design — finer-grained scoping is planned for future versions. -- **The experimental multi-team feature (`[core] multi_team`) does not guarantee task-level - isolation.** It provides UI-level and REST API-level RBAC isolation only. At the Execution API - and database level, there is no enforcement of team boundaries. This is documented and expected. -- **Execution API tokens are not subject to revocation.** They are short-lived (default 10 min) - with automatic refresh, so revocation is intentionally not part of the Execution API security model. -- **A single Dag File Processor and Triggerer instance serves all teams by default.** Per-team - instances require deployment-level configuration by the Deployment Manager. - -**The following are NOT security vulnerabilities (per Airflow's security policy and trust model):** - -- Dag authors executing arbitrary code, accessing credentials, or reading environment variables — - Dag authors are trusted users with broad capabilities by design. -- Dag author code passing unsanitized input to operators/hooks — responsibility lies with the Dag - author, not Airflow. SQL injection or command injection is only a vulnerability if exploitable by - a non-Dag-author role without the Dag author deliberately writing unsafe code. -- Connection configuration users being able to trigger RCE/DoS/arbitrary reads via connection - parameters — these users are highly privileged by design. Test connection is disabled by default. 
-- DoS by authenticated users — Airflow is an internal application with known, authenticated users.
-  DoS by authenticated users is an operational concern, not a CVE-worthy vulnerability.
-- Self-XSS by authenticated users — only considered a vulnerability if it crosses privilege
-  boundaries (lower-privileged user's payload executes in higher-privileged session).
-- Simple Auth Manager security issues — it is for development/testing only, with a prominent warning.
-- Third-party dependency CVEs in Docker images — expected over time; users should build their own
-  images. Only report if you have a proof-of-concept exploiting the vulnerability in Airflow's context.
-- Automated scanner results without human verification against the security model.
+In particular, the intentional design choices that are not security vulnerabilities, and should not be
+reported as such, are described in the "What is NOT considered a security vulnerability" chapter of the security model.

**When flagging security concerns, distinguish between:**

diff --git a/airflow-core/docs/administration-and-deployment/production-deployment.rst b/airflow-core/docs/administration-and-deployment/production-deployment.rst
index 2f9537aad3c..e88b94d94ba 100644
--- a/airflow-core/docs/administration-and-deployment/production-deployment.rst
+++ b/airflow-core/docs/administration-and-deployment/production-deployment.rst
@@ -63,9 +63,9 @@ the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`.
Once you have configured the executor, it is necessary to make sure that every node in the cluster
contains the Dags and configuration appropriate for its role. Airflow sends simple instructions such as
-"execute task X of Dag Y", but does not send any Dag files or configuration. You can use a simple cronjob
-or any other mechanism to sync Dags across your nodes, e.g., checkout Dags from git repo every 5 minutes
-on all nodes. 
For security-sensitive deployments, restrict sensitive configuration (JWT signing keys,
+"execute task X of Dag Y", but does not send any Dag files or configuration. For synchronization of Dags
+we recommend the Dag Bundle mechanism (including ``GitDagBundle``), which allows you to make use of
+Dag versioning. For security-sensitive deployments, restrict sensitive configuration (JWT signing keys,
database credentials, Fernet keys) to only the components that need them rather than sharing all
configuration across all nodes — see :doc:`/security/security_model` for guidance.

diff --git a/airflow-core/docs/public-airflow-interface.rst b/airflow-core/docs/public-airflow-interface.rst
index 2c271a580b3..4f4c09d66d1 100644
--- a/airflow-core/docs/public-airflow-interface.rst
+++ b/airflow-core/docs/public-airflow-interface.rst
@@ -548,8 +548,8 @@ but in Airflow they are not parts of the Public Interface and might change any t
internal implementation detail and you should not assume they will be maintained in a
backwards-compatible way.

-**Direct metadata database access from worker task code is no longer allowed**.
-Worker task code cannot directly access the metadata database to query Dag state, task history,
+**Direct metadata database access from code authored by Dag authors is no longer allowed**.
+Code authored by Dag authors cannot directly access the metadata database to query Dag state, task history,
or Dag runs — workers communicate exclusively through the Execution API.
Instead, use one of the following alternatives: diff --git a/airflow-core/docs/security/jwt_token_authentication.rst b/airflow-core/docs/security/jwt_token_authentication.rst index 38528811f13..7aa85bba9a3 100644 --- a/airflow-core/docs/security/jwt_token_authentication.rst +++ b/airflow-core/docs/security/jwt_token_authentication.rst @@ -55,8 +55,8 @@ Airflow supports two mutually exclusive signing modes: **Asymmetric (public/private key pair)** Uses a PEM-encoded private key (``[api_auth] jwt_private_key_path``) for signing and - the corresponding public key for validation. Supported algorithms: **RS256** (RSA) and - **EdDSA** (Ed25519). The algorithm is auto-detected from the key type when + the corresponding public key for validation. Supported algorithms: **RS256** (``RSA``) and + **EdDSA** (``Ed25519``). The algorithm is auto-detected from the key type when ``[api_auth] jwt_algorithm`` is set to ``GUESS`` (the default). Validation can use either: @@ -66,11 +66,6 @@ Airflow supports two mutually exclusive signing modes: - The public key derived from the configured private key (automatic fallback when ``trusted_jwks_url`` is not set). -The asymmetric mode is recommended for production deployments where you want workers -and the API server to operate with different credentials (workers only need the private key for -token generation; the API server only needs the JWKS for validation). - - REST API Authentication Flow ----------------------------- @@ -101,9 +96,9 @@ Token structure (REST API) * - ``jti`` - Unique token identifier (UUID4 hex). Used for token revocation. * - ``iss`` - - Issuer (from ``[api_auth] jwt_issuer``). Optional but recommended. + - Issuer (from ``[api_auth] jwt_issuer``). * - ``aud`` - - Audience (from ``[api_auth] jwt_audience``). Optional but recommended. + - Audience (from ``[api_auth] jwt_audience``). * - ``sub`` - User identifier (serialized by the auth manager). 
* - ``iat``
@@ -136,11 +131,6 @@ Revoked tokens are tracked in the ``revoked_token`` database table by their ``jt
On logout or explicit revocation, the token's ``jti`` and ``exp`` are inserted into this table.
Expired entries are automatically cleaned up at a cadence of ``2× jwt_expiration_time``.

-Execution API tokens are not subject to revocation. They are short-lived (default 10 minutes)
-and automatically refreshed by the ``JWTReissueMiddleware``, so revocation is not part of the
-Execution API security model. Once an Execution API token is issued to a worker, it remains
-valid until it expires.
-
Token refresh (REST API)
^^^^^^^^^^^^^^^^^^^^^^^^

@@ -168,15 +158,16 @@ Default timings (REST API)
Execution API Authentication Flow
----------------------------------

-The Execution API is an internal API used by workers to report task state transitions,
-heartbeats, and to retrieve connections, variables, and XComs at task runtime.
+The Execution API is an API used by Airflow itself (not third-party callers)
+to report and set task state transitions, send heartbeats, and to retrieve connections,
+variables, and XComs at task runtime, during trigger execution, and during Dag parsing.

Token generation (Execution API)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-1. The **Scheduler** (via the executor) generates a JWT for each task instance before
-   dispatching it to a worker. The executor's ``jwt_generator`` property creates a
-   ``JWTGenerator`` configured with the ``[execution_api]`` settings.
+1. The **Scheduler** generates a JWT for each task instance before
+   dispatching it (via the executor) to a worker. The executor's
+   ``jwt_generator`` property creates a ``JWTGenerator`` configured with the ``[execution_api]`` settings.
2. The token's ``sub`` (subject) claim is set to the **task instance UUID**.
3. The token is embedded in the workload JSON payload (``BaseWorkloadSchema.token`` field)
   that is sent to the worker process.
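For illustration, a symmetric-mode token with the claim layout described above can be minted with nothing but the Python standard library. This is a hand-rolled sketch for clarity, not Airflow's actual ``JWTGenerator``; the secret, issuer handling, and UUID below are dummy illustrative values:

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


def b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_execution_token(secret: bytes, ti_uuid: str, validity: int = 600) -> str:
    """Sketch of an HS512 Execution API token with the claims described above."""
    now = int(time.time())
    header = {"alg": "HS512", "typ": "JWT"}
    claims = {
        "jti": uuid.uuid4().hex,               # unique token identifier
        "aud": "urn:airflow.apache.org:task",  # default [execution_api] jwt_audience
        "sub": ti_uuid,                        # task instance UUID = workload identity
        "scope": "execution",
        "iat": now,
        "nbf": now,
        "exp": now + validity,                 # short-lived: 10 minutes by default
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha512).digest()
    return signing_input + "." + b64url(signature)


token = make_execution_token(b"not-a-real-secret", "11111111-2222-3333-4444-555555555555")
```

Any component holding the same secret can verify the token by recomputing the HMAC over the first two segments, which is why the signing key must be distributed selectively.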
@@ -193,13 +184,13 @@ Token structure (Execution API)
   * - ``jti``
     - Unique token identifier (UUID4 hex).
   * - ``iss``
-    - Issuer (from ``[api_auth] jwt_issuer``). Optional.
+    - Issuer (from ``[api_auth] jwt_issuer``).
   * - ``aud``
     - Audience (from ``[execution_api] jwt_audience``, default: ``urn:airflow.apache.org:task``).
   * - ``sub``
     - Task instance UUID — the identity of the workload.
   * - ``scope``
-    - Token scope: ``"execution"`` (default) or ``"workload"`` (restricted).
+    - Token scope: ``"execution"`` or ``"workload"``.
   * - ``iat``
     - Issued-at timestamp.
   * - ``nbf``
@@ -212,15 +203,15 @@ Token scopes (Execution API)

The Execution API defines two token scopes:

-**execution** (default)
-  Accepted by all Execution API endpoints. This is the standard scope for worker
-  communication.
-
**workload**
  A restricted scope accepted only on endpoints that explicitly opt in via
  ``Security(require_auth, scopes=["token:workload"])``. Used for endpoints that manage
  task state transitions.

+**execution**
+  Accepted by all Execution API endpoints. This is the standard scope for worker
+  communication and allows access to all Execution API operations.
+
Tokens without a ``scope`` claim default to ``"execution"`` for backwards compatibility.

Token delivery to workers
@@ -228,7 +219,8 @@ Token delivery to workers

The token flows through the execution stack as follows:

-1. **Executor** generates the token and embeds it in the workload JSON payload.
+1. The **Scheduler** generates the token and embeds it in the workload JSON payload that it passes to
+   the **Executor**.
2. The workload JSON is passed to the worker process (via the executor-specific mechanism:
   Celery message, Kubernetes Pod spec, local subprocess arguments, etc.).
3. The worker's ``execute_workload()`` function reads the workload JSON and extracts the token.
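The scope rules above (``workload`` accepted only on endpoints that opt in, ``execution`` accepted everywhere, and a missing ``scope`` claim treated as ``execution``) boil down to a check along these lines. This is an illustrative sketch; ``is_scope_allowed`` is a made-up helper, not Airflow's actual ``require_auth`` dependency:

```python
def effective_scope(claims: dict) -> str:
    # Tokens without a "scope" claim default to "execution" for backwards compatibility.
    return claims.get("scope", "execution")


def is_scope_allowed(claims: dict, endpoint_opt_in: frozenset = frozenset()) -> bool:
    """endpoint_opt_in lists the restricted scopes an endpoint explicitly accepts
    (e.g. {"workload"}); endpoints that do not opt in accept only "execution"."""
    scope = effective_scope(claims)
    if scope == "execution":
        return True  # the standard scope, accepted by all Execution API endpoints
    return scope in endpoint_opt_in
```

A ``workload``-scoped token is thus rejected everywhere except on the endpoints that explicitly declare that scope.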
@@ -259,7 +251,7 @@ Route-level enforcement is handled by ``require_auth``:
Token refresh (Execution API)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The ``JWTReissueMiddleware`` automatically refreshes tokens that are approaching expiry:
+The ``JWTReissueMiddleware`` automatically refreshes valid tokens that are approaching expiry:

1. After each response, the middleware checks the token's remaining validity.
2. If less than **20%** of the total validity remains (minimum 30 seconds), the server
@@ -271,6 +263,16 @@ The ``JWTReissueMiddleware`` automatically refreshes tokens that are approaching
This mechanism ensures long-running tasks do not lose API access due to token expiry,
without requiring the worker to re-authenticate.

+No token revocation (Execution API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Execution API tokens are not subject to revocation. They are short-lived (default 10 minutes)
+and automatically refreshed by the ``JWTReissueMiddleware``, so revocation is not part of the
+Execution API security model. Once an Execution API token is issued to a worker, it remains
+valid until it expires.
+
+
+
Default timings (Execution API)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -312,8 +314,10 @@ retrieve the parent process's database credentials (via ``/proc/<PID>/environ``,
files, or secrets manager access) and gain full read/write access to the metadata database
and all Execution API operations — without needing a valid JWT token.

-This is in contrast to workers, where the isolation is genuine: worker processes do not receive
-database credentials at all and communicate exclusively through the Execution API.
+This is in contrast to workers/task execution, where the isolation is implemented at the deployment
+level: sensitive configuration such as database credentials is not available to worker
+processes because it is not set in their deployment configuration at all, and workers communicate
+exclusively through the Execution API.
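Numerically, the refresh rule above (reissue once less than 20% of the total validity remains, minimum 30 seconds) reduces to a single comparison. This is a sketch of the decision only, not the middleware's actual code, and reading "minimum 30 seconds" as a floor on the threshold is an assumption:

```python
def should_reissue(remaining: float, total_validity: float) -> bool:
    """Reissue when remaining validity drops below 20% of the total,
    never using a threshold smaller than 30 seconds (assumed floor)."""
    threshold = max(0.20 * total_validity, 30.0)
    return remaining < threshold


# For the default 10-minute (600 s) token, the threshold is 120 seconds,
# so a token is reissued once fewer than two minutes of validity remain.
```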
In the default deployment, a **single Dag File Processor instance** parses Dag files for all teams and a **single Triggerer instance** handles all triggers across all teams. This means @@ -334,67 +338,8 @@ guidance, and the planned strategic and tactical improvements. Workload Isolation and Current Limitations ------------------------------------------ -The current JWT authentication model operates under the following assumptions and limitations: - -**Worker process memory protection (Linux)** - On Linux, the supervisor process calls ``prctl(PR_SET_DUMPABLE, 0)`` at the start of - ``supervise()`` before forking the task process. This flag is inherited by the forked - child. Marking processes as non-dumpable prevents same-UID sibling processes from reading - ``/proc/<pid>/mem``, ``/proc/<pid>/environ``, or ``/proc/<pid>/maps``, and blocks - ``ptrace(PTRACE_ATTACH)``. This is critical because each supervisor holds a distinct JWT - token in memory — without this protection, a malicious task process running as the same - Unix user could steal tokens from sibling supervisor processes. - - This protection is one of the reasons that passing sensitive configuration via environment - variables is safer than via configuration files: environment variables are only readable - by the process itself (and root), whereas configuration files on disk are readable by any - process with filesystem access running as the same user. - - .. note:: - - This protection is Linux-specific. On non-Linux platforms, the - ``_make_process_nondumpable()`` call is a no-op. Deployment Managers running Airflow - on non-Linux platforms should implement alternative isolation measures. - -**No cross-workload isolation** - All worker workloads authenticate to the same Execution API with tokens that share the - same signing key, audience, and issuer. 
While the ``ti:self`` scope enforcement prevents - a worker from accessing *another task instance's* specific endpoints (e.g., heartbeat, - state transitions), the token grants access to shared resources such as connections, - variables, and XComs that are not scoped to individual tasks. - -**No team-level isolation in Execution API (experimental multi-team feature)** - The experimental multi-team feature (``[core] multi_team``) provides UI-level and REST - API-level RBAC isolation between teams, but **does not yet guarantee task-level isolation**. - At the Execution API level, there is no enforcement of team-based access boundaries. - A task from one team can access the same connections, variables, and XComs as a task from - another team. All workloads share the same JWT signing keys and audience regardless of team - assignment. - - In deployments where additional hardening measures are not implemented at the deployment - level, a task from one team can potentially access resources belonging to another team - (see :doc:`/security/security_model`). A deep understanding of configuration and deployment - security is required by Deployment Managers to configure it in a way that can guarantee - separation between teams. Task-level team isolation will be improved in future versions - of Airflow. - -**Dag File Processor and Triggerer potentially bypass JWT and access the database** - As described above, the default deployment runs a single Dag File Processor and a single - Triggerer for all teams. Both potentially bypass JWT authentication via in-process transport. - For multi-team isolation, Deployment Managers must run separate instances per team, but - even then, each instance potentially retains direct database access. 
A Dag author whose code - runs in these components can potentially access the database directly — including data - belonging to other teams or the JWT signing key configuration — unless the Deployment Manager - restricts the database credentials and configuration available to each instance. - -**Planned improvements** - Future versions of Airflow will address these limitations with: - - - Finer-grained token scopes tied to specific resources (connections, variables) and teams. - - Enforcement of team-based isolation in the Execution API. - - Built-in support for per-team Dag File Processor and Triggerer instances. - - Improved sandboxing of user-submitted code in the Dag File Processor and Triggerer. - - Full task-level isolation for the multi-team feature. +For a detailed discussion of workload isolation protections, current limitations, and planned +improvements, see :ref:`workload-isolation`. Configuration Reference @@ -410,16 +355,16 @@ All JWT-related configuration parameters: - Default - Description * - ``[api_auth] jwt_secret`` - - Auto-generated + - Auto-generated if missing - Symmetric secret key for signing tokens. Must be the same across all components. Mutually exclusive with ``jwt_private_key_path``. * - ``[api_auth] jwt_private_key_path`` - None - - Path to PEM-encoded private key (RSA or Ed25519). Mutually exclusive with ``jwt_secret``. + - Path to PEM-encoded private key (``RSA`` or ``Ed25519``). Mutually exclusive with ``jwt_secret``. * - ``[api_auth] jwt_algorithm`` - ``GUESS`` - - Signing algorithm. Auto-detected from key type: HS512 for symmetric, RS256 for RSA, EdDSA for Ed25519. + - Signing algorithm. Auto-detected from key type: ``HS512`` for symmetric, ``RS256`` for ``RSA``, ``EdDSA`` for ``Ed25519``. * - ``[api_auth] jwt_kid`` - - Auto (RFC 7638 thumbprint) + - Auto (``RFC 7638`` thumbprint) - Key ID placed in token header. Ignored for symmetric keys. 
* - ``[api_auth] jwt_issuer``
  - None
diff --git a/airflow-core/docs/security/security_model.rst b/airflow-core/docs/security/security_model.rst
index de960685058..96f6f66783b 100644
--- a/airflow-core/docs/security/security_model.rst
+++ b/airflow-core/docs/security/security_model.rst
@@ -319,7 +319,8 @@ JWT authentication and workload isolation
Airflow uses JWT (JSON Web Token) authentication for both its public REST API and its internal
Execution API. For a detailed description of the JWT authentication flows, token structure, and
-configuration, see :doc:`/security/jwt_token_authentication`.
+configuration, see :doc:`/security/jwt_token_authentication`. For the current state of workload
+isolation protections and their limitations, see :ref:`workload-isolation`.

Current isolation limitations
.............................

@@ -341,9 +342,10 @@ potentially still executes with direct database access in the Dag File Processor
   process isolation works, a child process running as the same user can retrieve the parent's
   credentials through several mechanisms:

-   * **Environment variables**: On Linux, any process can read ``/proc/<PID>/environ`` of another
+   * **Environment variables**: By default, on Linux, any process can read ``/proc/<PID>/environ`` of another
     process running as the same user — so database credentials passed via environment variables
-     (e.g., ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN``) can be read from the parent process.
+     (e.g., ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN``) can be read from the parent process. This can be
+     prevented by clearing the dumpable property of the process, as the task supervisor does.
   * **Configuration files**: If configuration is stored in files, those files must be readable by
     the parent process and are therefore also readable by the child process running as the same user.
* **Command-based secrets** (``_CMD`` suffix options): The child process can execute the same
@@ -357,17 +359,19 @@ potentially still executes with direct database access in the Dag File Processor
   author importing ``airflow.settings.Session`` out of habit from Airflow 2) but do not prevent
   a determined actor from circumventing them.

-   On workers, the isolation is stronger: worker processes do not receive database credentials at all
-   (neither via environment variables nor configuration). Workers communicate exclusively through the
-   Execution API using short-lived JWT tokens. A task running on a worker genuinely cannot access the
-   metadata database directly — there are no credentials to retrieve.
+   On workers, the isolation can be stronger when the Deployment Manager configures worker processes
+   not to receive database credentials at all (neither via environment variables nor configuration).
+   Workers then communicate exclusively through the Execution API using short-lived JWT tokens.
+   A task running on a worker genuinely cannot access the metadata database directly
+   when it is configured so that no credentials are accessible to it.

-**Dag File Processor and Triggerer potentially bypass JWT authentication**
-   The Dag File Processor and Triggerer use an in-process transport to access the Execution API,
-   which bypasses JWT authentication. Since these components execute user-submitted code
-   (Dag files and trigger code respectively), a Dag author whose code runs in these components
-   potentially has unrestricted access to all Execution API operations — including the ability to
-   read any connection, variable, or XCom — without needing a valid JWT token.
+**Dag File Processor and Triggerer run user code and have only soft protection against bypassing JWT authentication**
+   The Dag File Processor and Triggerer processes that run user code
+   use an in-process transport to access the Execution API, which bypasses JWT authentication.
+   Since these components execute user-submitted code (Dag files and trigger code respectively),
+   a Dag author whose code runs in these components
+   has unrestricted access to all Execution API operations if their code bypasses the soft protections
+   — including the ability to read any connection, variable, or XCom — without needing a valid JWT token.

   Furthermore, the Dag File Processor has direct access to the metadata database (it needs this
   to store serialized Dags). As described above, Dag author code executing in the Dag File Processor
@@ -398,10 +402,12 @@ potentially still executes with direct database access in the Dag File Processor
   variables, and XComs are accessible to all tasks. There is no isolation between tasks belonging
   to different teams or Dag authors at the Execution API level.

-**Token signing key is a shared secret**
+**Token signing key might be a shared secret**
   In symmetric key mode (``[api_auth] jwt_secret``), the same secret key is used to both generate and
   validate tokens. Any component that has access to this secret can forge tokens with arbitrary claims,
-   including tokens for other task instances or with elevated scopes.
+   including tokens for other task instances or with elevated scopes. This does not impact the security
+   of the system, however, if the secret is only made available to the API server and scheduler via the
+   deployment configuration.

**Sensitive configuration values can be leaked through logs**
   Dag authors can write code that prints environment variables or configuration values to task logs
@@ -435,11 +441,12 @@ model — Airflow does not enforce these natively.
   be available to components that need to generate tokens (Scheduler/Executor, API Server) and
   components that need to validate tokens (API Server). Workers should not have access to the
   signing key — they only need the tokens provided to them.
* Connection credentials for external systems (via Secrets Managers) should only be available to the API Server
     (which serves them to workers via the Execution API), not to the Scheduler, Dag File Processor,
-   or Triggerer processes directly.
+   or Triggerer processes directly. This, however, limits some Airflow features, such as Deadline
+   Alerts or triggers that need to authenticate with external systems.
   * Database connection strings should only be available to components that need direct database access
-     (API Server, Scheduler, Dag File Processor), not to workers.
+     (API Server, Scheduler, Dag File Processor, Triggerer), not to workers.

**Pass configuration via environment variables**
   For higher security, pass sensitive configuration values via environment variables rather than
@@ -565,25 +572,6 @@ model — Airflow does not enforce these natively.
   * Workers cannot forge tokens even if they could access the JWKS endpoint, since they would
     not have the private key.

-**Unix user-level isolation for Dag File Processor and Triggerer**
-   Since the child processes of the Dag File Processor and Triggerer run as the same Unix user as
-   their parent processes, a Dag author's code can read the parent's credentials. To prevent this,
-   Deployment Managers can configure the child processes to run as a **different Unix user** that has
-   no access to Airflow's configuration files or the parent process's ``/proc/<PID>/environ``.
-
-   This requires:
-
-   * Creating a dedicated low-privilege Unix user for Dag parsing / trigger execution.
-   * Configuring ``sudo`` access so the Airflow user can impersonate this low-privilege user.
-   * Ensuring that Airflow configuration files and directories are not readable by the low-privilege
-     user (e.g., using Unix group permissions).
-   * Ensuring that the low-privilege user has no network access to the metadata database.
- - This approach is analogous to the existing ``run_as_user`` impersonation support for tasks (see - :doc:`/security/workload`). It is a deployment-level measure — Airflow does not currently - automate this separation for the Dag File Processor or Triggerer, but future versions plan to - support it natively. - **Network-level isolation** Use network policies, VPCs, or similar mechanisms to restrict which components can communicate with each other. For example, workers should only be able to reach the Execution API endpoint, @@ -591,11 +579,6 @@ model — Airflow does not enforce these natively. child processes should ideally not have network access to the metadata database either, if Unix user-level isolation is implemented. -**Restrict access to task logs** - Task logs may contain sensitive information if Dag authors (accidentally or intentionally) print - environment variables or configuration values. Deployment Managers should restrict who can view - task logs via RBAC and ensure that log storage backends are properly secured. - **Other measures and future improvements** Deployment Managers may need to implement additional measures depending on their security requirements. These may include monitoring and auditing of Execution API access patterns, diff --git a/airflow-core/docs/security/workload.rst b/airflow-core/docs/security/workload.rst index 31714aa21fb..0496cddc7f5 100644 --- a/airflow-core/docs/security/workload.rst +++ b/airflow-core/docs/security/workload.rst @@ -50,3 +50,86 @@ not set. [core] default_impersonation = airflow + +.. _workload-isolation: + +Workload Isolation and Current Limitations +------------------------------------------ + +This section describes the current state of workload isolation in Apache Airflow, +including the protections that are in place, the known limitations, and planned improvements. + +For the full security model and deployment hardening guidance, see :doc:`/security/security_model`. 
+For details on the JWT authentication flows used by workers and internal components, see
+:doc:`/security/jwt_token_authentication`.
+
+Worker process memory protection (Linux)
+''''''''''''''''''''''''''''''''''''''''
+
+On Linux, the supervisor process calls ``prctl(PR_SET_DUMPABLE, 0)`` at the start of
+``supervise()`` before forking the task process. This flag is inherited by the forked
+child. Marking processes as non-dumpable prevents same-UID sibling processes from reading
+``/proc/<pid>/mem``, ``/proc/<pid>/environ``, or ``/proc/<pid>/maps``, and blocks
+``ptrace(PTRACE_ATTACH)``. This is critical because each supervisor holds a distinct JWT
+token in memory — without this protection, a malicious task process running as the same
+Unix user could steal tokens from sibling supervisor processes.
+
+This protection is one of the reasons that passing sensitive configuration via environment
+variables is safer than via configuration files: environment variables are only readable
+by the process itself (and root), whereas configuration files on disk are readable by any
+process with filesystem access running as the same user.
+
+.. note::
+
+   This protection is Linux-specific. On non-Linux platforms, the
+   ``_make_process_nondumpable()`` call is a no-op. Deployment Managers running Airflow
+   on non-Linux platforms should implement alternative isolation measures.
+
+No cross-workload isolation
+'''''''''''''''''''''''''''
+
+All worker workloads authenticate to the same Execution API with tokens that share the
+same signing key, audience, and issuer. While the ``ti:self`` scope enforcement prevents
+a worker from accessing *another task instance's* specific endpoints (e.g., heartbeat,
+state transitions), the token grants access to shared resources such as connections,
+variables, and XComs that are not scoped to individual tasks.
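The non-dumpable call described in the memory-protection subsection above can be sketched with ``ctypes``. The constants mirror ``<sys/prctl.h>``; the function name is an illustrative stand-in for Airflow's internal ``_make_process_nondumpable()``, not its actual implementation:

```python
import ctypes
import ctypes.util
import sys

# Constants from <sys/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4


def make_process_nondumpable() -> None:
    """Mark this process non-dumpable so same-UID siblings cannot read
    /proc/<pid>/mem or /proc/<pid>/environ, or attach with ptrace.
    Linux only; a no-op elsewhere, matching Airflow's behavior."""
    if sys.platform != "linux":
        return
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_DUMPABLE, 0) failed")
```

Because the flag is inherited across ``fork()``, calling this once in the supervisor before forking covers the task process as well.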
+
+No team-level isolation in Execution API (experimental multi-team feature)
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+The experimental multi-team feature (``[core] multi_team``) provides UI-level and REST
+API-level RBAC isolation between teams, but **does not yet guarantee task-level isolation**.
+At the Execution API level, there is no enforcement of team-based access boundaries.
+A task from one team can access the same connections, variables, and XComs as a task from
+another team. All workloads share the same JWT signing keys and audience regardless of team
+assignment.
+
+Where additional hardening measures are not implemented at the deployment level, a task
+from one team can potentially access resources belonging to another team
+(see :doc:`/security/security_model`). Deployment Managers need a deep understanding of
+configuration and deployment security to configure Airflow in a way that guarantees
+separation between teams. Task-level team isolation will be improved in future versions
+of Airflow.
+
+Dag File Processor and Triggerer potentially bypass JWT and access the database
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+As described in :doc:`/security/jwt_token_authentication`, the default deployment runs a
+single Dag File Processor and a single Triggerer for all teams. Both use an in-process
+transport guarded by software checks rather than JWT authentication, and those guards do
+not protect against intentional bypassing by malicious or misconfigured code. For
+multi-team isolation, Deployment Managers must run separate instances per team, but even
+then each instance retains direct database access. A Dag author whose code runs in these
+components can therefore access the database directly — including data belonging to other
+teams or the JWT signing key configuration — unless the Deployment Manager restricts the
+database credentials and configuration available to each instance.
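To make concrete what exposure of the signing key means, here is a minimal HMAC-based token sketch using only the standard library. Airflow's real token service, algorithm choices, and claim names differ; every key, claim, and function below is illustrative. The point is that signature verification alone proves possession of the shared key, not which team or resource a caller may touch: any team or resource boundary must be an explicit claim check in the API, as ``ti:self`` is for task-instance endpoints.

```python
import base64
import hashlib
import hmac
import json

SHARED_KEY = b"same-key-for-every-workload"  # illustrative shared signing key


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def sign(claims: dict) -> str:
    """Minimal HS256-style JWT: header.payload.signature, base64url-encoded."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = _b64url(hmac.new(SHARED_KEY, f"{header}.{payload}".encode(),
                           hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"


def verify(token: str) -> dict:
    """Signature check only: proves the key, not which resources may be read."""
    header, payload, sig = token.split(".")
    expected = _b64url(hmac.new(SHARED_KEY, f"{header}.{payload}".encode(),
                                hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


# Tokens for tasks of two different teams: both verify, because the signature
# carries no team boundary by itself.
token_a = sign({"sub": "ti-team-a", "aud": "execution-api"})
token_b = sign({"sub": "ti-team-b", "aud": "execution-api"})
```

Anyone holding ``SHARED_KEY`` can mint a token that verifies for any subject, which is why the signing key configuration must be kept away from components that run Dag author code.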
+
+Planned improvements
+''''''''''''''''''''
+
+Future versions of Airflow will address these limitations with:
+
+- Finer-grained token scopes tied to specific resources (connections, variables) and teams.
+- Enforcement of team-based isolation in the Execution API.
+- Built-in support for per-team Dag File Processor and Triggerer instances.
+- Improved sandboxing of user-submitted code in the Dag File Processor and Triggerer.
+- Full task-level isolation for the multi-team feature.
