anishgirianish commented on code in PR #64760:
URL: https://github.com/apache/airflow/pull/64760#discussion_r3049813142


##########
airflow-core/docs/security/jwt_token_authentication.rst:
##########
@@ -0,0 +1,453 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+JWT Token Authentication
+========================
+
+This document describes how JWT (JSON Web Token) authentication works in 
Apache Airflow
+for both the public REST API (Core API) and the internal Execution API used by 
workers.
+
+.. contents::
+   :local:
+   :depth: 2
+
+Overview
+--------
+
+Airflow uses JWTs as the primary authentication mechanism for its APIs. There are two
+distinct JWT authentication flows:
+
+1. **REST API (Core API)** — used by UI users, CLI tools, and external clients 
to interact
+   with the Airflow public API.
+2. **Execution API** — used internally by workers, the Dag File Processor, and 
the Triggerer
+   to communicate task state and retrieve runtime data (connections, 
variables, XComs).
+
+Both flows share the same underlying JWT infrastructure (``JWTGenerator`` and 
``JWTValidator``
+classes in ``airflow.api_fastapi.auth.tokens``) but differ in audience, token 
lifetime, subject
+claims, and scope semantics.
+
+
+Signing and Cryptography
+------------------------
+
+Airflow supports two mutually exclusive signing modes:
+
+**Symmetric (shared secret)**
+   Uses a pre-shared secret key (``[api_auth] jwt_secret``) with the **HS512** 
algorithm.
+   All components that generate or validate tokens must share the same secret. 
If no secret
+   is configured, Airflow auto-generates a random 16-byte key at startup — but 
this key is
+   ephemeral and different across processes, which will cause authentication 
failures in
+   multi-component deployments. Deployment Managers must explicitly configure 
this value.
+
+**Asymmetric (public/private key pair)**
+   Uses a PEM-encoded private key (``[api_auth] jwt_private_key_path``) for 
signing and
+   the corresponding public key for validation. Supported algorithms: 
**RS256** (RSA) and
+   **EdDSA** (Ed25519). The algorithm is auto-detected from the key type when
+   ``[api_auth] jwt_algorithm`` is set to ``GUESS`` (the default).
+
+   Validation can use either:
+
+   - A JWKS (JSON Web Key Set) endpoint configured via ``[api_auth] trusted_jwks_url``
+     (local file or remote HTTP/HTTPS URL, polled periodically for updates).
+   - The public key derived from the configured private key (automatic fallback when
+     ``trusted_jwks_url`` is not set).
+
+The asymmetric mode is recommended for production deployments where you want 
workers
+and the API server to operate with different credentials (workers only need 
the private key for
+token generation; the API server only needs the JWKS for validation).
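The symmetric mode described above can be illustrated with a minimal, standard-library-only sketch of HS512 signing and verification. This is not Airflow's ``JWTGenerator`` or ``JWTValidator``; the helper names ``sign_hs512`` and ``verify_hs512`` are hypothetical and exist only to show why every component must share the same secret.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the compact JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs512(claims: dict, secret: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA512 (illustrative helper)."""
    header = {"alg": "HS512", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha512).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_hs512(token: str, secret: bytes) -> bool:
    """Return True only if the signature was produced with the same secret."""
    if token.count(".") != 2:
        return False
    signing_input, signature = token.rsplit(".", 1)
    expected = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha512).digest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(b64url(expected).encode("utf-8"), signature.encode("utf-8"))
```

A token signed by one process verifies only in processes that hold the same secret, which is why a per-process auto-generated key fails in multi-component deployments: ``verify_hs512(sign_hs512({"sub": "u"}, b"key-a"), b"key-b")`` returns ``False``.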
+
+
+REST API Authentication Flow
+-----------------------------
+
+Token acquisition
+^^^^^^^^^^^^^^^^^
+
+1. A client sends a ``POST`` request to ``/auth/token`` with credentials 
(e.g., username
+   and password in JSON body).
+2. The auth manager validates the credentials and creates a user object.
+3. The auth manager serializes the user into JWT claims and calls 
``JWTGenerator.generate()``.
+4. The generated token is returned in the response as ``access_token``.
+
+For UI-based authentication, the token is stored in a secure, HTTP-only cookie 
(``_token``)
+with ``SameSite=Lax``.
+
+The CLI uses a separate endpoint (``/auth/token/cli``) with a different 
(shorter) expiration
+time.
+
+Token structure (REST API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 15 85
+
+   * - Claim
+     - Description
+   * - ``jti``
+     - Unique token identifier (UUID4 hex). Used for token revocation.
+   * - ``iss``
+     - Issuer (from ``[api_auth] jwt_issuer``). Optional but recommended.
+   * - ``aud``
+     - Audience (from ``[api_auth] jwt_audience``). Optional but recommended.
+   * - ``sub``
+     - User identifier (serialized by the auth manager).
+   * - ``iat``
+     - Issued-at timestamp (Unix epoch seconds).
+   * - ``nbf``
+     - Not-before timestamp (same as ``iat``).
+   * - ``exp``
+     - Expiration timestamp (``iat + jwt_expiration_time``).
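As a rough illustration of the claims table above, the following sketch assembles a claims dictionary with the same fields. The function name ``build_claims`` and its parameters are hypothetical; how Airflow actually populates these claims may differ in detail.

```python
import time
import uuid
from typing import Optional


def build_claims(
    sub: str,
    expiration_time: int,
    issuer: Optional[str] = None,
    audience: Optional[str] = None,
    now: Optional[float] = None,
) -> dict:
    """Assemble REST API style claims (illustrative helper, not Airflow code)."""
    iat = int(time.time()) if now is None else int(now)
    claims = {
        "jti": uuid.uuid4().hex,        # unique identifier, usable for revocation
        "sub": sub,                     # user identifier from the auth manager
        "iat": iat,                     # issued-at, Unix epoch seconds
        "nbf": iat,                     # not-before, same as iat
        "exp": iat + expiration_time,   # expiry = iat + jwt_expiration_time
    }
    # iss and aud are optional, so they are only added when configured.
    if issuer is not None:
        claims["iss"] = issuer
    if audience is not None:
        claims["aud"] = audience
    return claims
```

With the default ``jwt_expiration_time`` of 86400 seconds, ``exp`` lands exactly 24 hours after ``iat``.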
+
+Token validation (REST API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+On each API request, the token is extracted in this order of precedence:
+
+1. ``Authorization: Bearer <token>`` header.
+2. OAuth2 query parameter.
+3. ``_token`` cookie.
+
+The ``JWTValidator`` verifies the signature, expiry (``exp``), not-before 
(``nbf``),
+issued-at (``iat``), audience, and issuer claims. A configurable leeway
+(``[api_auth] jwt_leeway``, default 10 seconds) accounts for clock skew.
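The precedence order and the leeway handling can be sketched as follows. Both helpers are hypothetical, and the exact boundary comparisons are an assumption made for illustration; the real checks are performed by ``JWTValidator``.

```python
from typing import Optional


def extract_token(
    authorization: Optional[str],
    query_token: Optional[str],
    cookie_token: Optional[str],
) -> Optional[str]:
    """Pick the token using header > query parameter > cookie precedence."""
    if authorization and authorization.startswith("Bearer "):
        bearer = authorization[len("Bearer "):].strip()
        if bearer:
            return bearer
    # Fall back to the query parameter, then to the ``_token`` cookie value.
    return query_token or cookie_token or None


def time_claims_valid(claims: dict, now: float, leeway: float = 10) -> bool:
    """Check exp, nbf and iat while tolerating ``leeway`` seconds of clock skew."""
    if now > claims["exp"] + leeway:   # expired, even allowing for skew
        return False
    if now < claims["nbf"] - leeway:   # not yet valid
        return False
    if claims["iat"] > now + leeway:   # issued in the future
        return False
    return True
```

In this sketch, a token that expired 5 seconds ago is still accepted with the default 10-second leeway, while one that expired 11 seconds ago is rejected.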
+
+Token revocation (REST API only)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Token revocation applies only to REST API and UI tokens — it is **not** used 
for Execution API
+tokens issued to workers.
+
+Revoked tokens are tracked in the ``revoked_token`` database table by their ``jti`` claim.
+On logout or explicit revocation, the token's ``jti`` and ``exp`` are inserted into this
+table. Expired entries are automatically cleaned up at a cadence of ``2× jwt_expiration_time``.
+
+Execution API tokens are short-lived (default 10 minutes) and automatically refreshed by
+the ``JWTReissueMiddleware``, so revocation is not part of the Execution API security model.
+Once issued to a worker, an Execution API token remains valid until it expires.
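The revocation bookkeeping for REST API tokens can be pictured with an in-memory stand-in for the ``revoked_token`` table. The class ``RevokedTokens`` and its methods are hypothetical; Airflow stores these entries in the database, not in a dictionary.

```python
class RevokedTokens:
    """In-memory stand-in for a table keyed by ``jti`` (illustrative only)."""

    def __init__(self) -> None:
        self._exp_by_jti: dict = {}

    def revoke(self, jti: str, exp: int) -> None:
        """Record the token's jti together with its expiry timestamp."""
        self._exp_by_jti[jti] = exp

    def is_revoked(self, jti: str) -> bool:
        """A token is rejected if its jti has been recorded."""
        return jti in self._exp_by_jti

    def purge_expired(self, now: int) -> int:
        """Drop entries whose tokens have expired anyway; return how many."""
        expired = [jti for jti, exp in self._exp_by_jti.items() if exp < now]
        for jti in expired:
            del self._exp_by_jti[jti]
        return len(expired)
```

Storing ``exp`` next to ``jti`` is what makes cleanup safe: once a token has expired, validation rejects it regardless, so its revocation entry no longer needs to be kept.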
+
+Token refresh (REST API)
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``JWTRefreshMiddleware`` runs on UI requests. When the middleware detects that the
+token in the ``_token`` cookie is approaching expiry, it calls
+``auth_manager.refresh_user()`` to generate a new token and sets it as the updated cookie.
+
+Default timings (REST API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 50 50
+
+   * - Setting
+     - Default
+   * - ``[api_auth] jwt_expiration_time``
+     - 86400 seconds (24 hours)
+   * - ``[api_auth] jwt_cli_expiration_time``
+     - 3600 seconds (1 hour)
+   * - ``[api_auth] jwt_leeway``
+     - 10 seconds
+
+
+Execution API Authentication Flow
+----------------------------------
+
+The Execution API is an internal API used by workers to report task state 
transitions,
+heartbeats, and to retrieve connections, variables, and XComs at task runtime.
+
+Token generation (Execution API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+1. The **Scheduler** (via the executor) generates a JWT for each task instance 
before
+   dispatching it to a worker. The executor's ``jwt_generator`` property 
creates a
+   ``JWTGenerator`` configured with the ``[execution_api]`` settings.
+2. The token's ``sub`` (subject) claim is set to the **task instance UUID**.
+3. The token is embedded in the workload JSON payload 
(``BaseWorkloadSchema.token`` field)
+   that is sent to the worker process.
+
+Token structure (Execution API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :header-rows: 1
+   :widths: 15 85
+
+   * - Claim
+     - Description
+   * - ``jti``
+     - Unique token identifier (UUID4 hex).
+   * - ``iss``
+     - Issuer (from ``[api_auth] jwt_issuer``). Optional.
+   * - ``aud``
+     - Audience (from ``[execution_api] jwt_audience``, default: ``urn:airflow.apache.org:task``).
+   * - ``sub``
+     - Task instance UUID — the identity of the workload.
+   * - ``scope``
+     - Token scope: ``"execution"`` (default) or ``"workload"`` (restricted).
+   * - ``iat``
+     - Issued-at timestamp.
+   * - ``nbf``
+     - Not-before timestamp.
+   * - ``exp``
+     - Expiration timestamp (``iat + [execution_api] jwt_expiration_time``).
+
+Token scopes (Execution API)
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The Execution API defines two token scopes:
+
+**execution** (default)
+   Accepted by all Execution API endpoints. This is the standard scope for 
worker
+   communication.
+
+**workload**
+   A restricted scope accepted only on endpoints that explicitly opt in via
+   ``Security(require_auth, scopes=["token:workload"])``. Used for endpoints 
that
+   manage task state transitions.
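The two-scope rule can be sketched as a small predicate. The function ``scope_allowed`` is hypothetical and only models the behaviour described above; the real enforcement goes through ``require_auth`` and may work differently.

```python
from typing import Iterable


def scope_allowed(token_scope: str, endpoint_scopes: Iterable[str] = ()) -> bool:
    """Decide whether a token scope may call an endpoint (illustrative only).

    ``endpoint_scopes`` mirrors the ``scopes=[...]`` list an endpoint declares;
    an empty list means the endpoint did not opt in to workload tokens.
    """
    if token_scope == "execution":
        # The default scope is accepted by all Execution API endpoints.
        return True
    if token_scope == "workload":
        # Restricted scope: only endpoints that explicitly opt in accept it.
        return "token:workload" in set(endpoint_scopes)
    return False
```

Under this model, a ``workload`` token is rejected everywhere except on endpoints that list ``token:workload``, while an ``execution`` token is accepted on both kinds of endpoint.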

Review Comment:
   Thanks @ashb and @potiuk! Since this PR is already merged, I've updated the 
JWT docs in #60108 to cover the workload-scoped token lifetime, the /run 
handoff to execution tokens, and the middleware skip behavior. Should be in 
sync now. Would appreciate a review on #60108 when you get a chance! 
   
   thank you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
