Copilot commented on code in PR #64133:
URL: https://github.com/apache/airflow/pull/64133#discussion_r3066474692
##########
airflow-core/docs/extra-packages-ref.rst:
##########
@@ -92,6 +92,8 @@ python dependencies for the provided package. The same extras
are available as `
+---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
| kerberos            | ``pip install 'apache-airflow[kerberos]'``          | Kerberos integration for Kerberized services (Hadoop, Presto, Trino)       |
+---------------------+-----------------------------------------------------+----------------------------------------------------------------------------+
+| memray              | ``pip install 'apache-airflow[memray]'``            | Required for memory profiling with memray                                  |
Review Comment:
The PR title/description claims a CLI behavior change (“map_index bounds
validation”), but the provided diffs show only documentation/CI/repo-meta
changes and no CLI/task SDK code changes. Please either (a) include the actual
CLI fix + tests in this PR, or (b) update the PR title/description to reflect
the documentation/CI scope to avoid misleading reviewers and changelog tooling.
##########
airflow-core/docs/conf.py:
##########
@@ -244,13 +245,20 @@ def add_airflow_core_exclude_patterns_to_sphinx(exclude_patterns: list[str]):
config_descriptions = retrieve_configuration_description(include_providers=False)
configs, deprecated_options = get_configs_and_deprecations(airflow_version, config_descriptions)
+# TODO: remove it when we start releasing task-sdk separately from airflow-core
+airflow_version_split = PACKAGE_VERSION.split(".")
+TASK_SDK_VERSION = f"1.{airflow_version_split[1]}.{airflow_version_split[2]}"
Review Comment:
Deriving `TASK_SDK_VERSION` via `PACKAGE_VERSION.split('.')` will produce
invalid versions for pre-releases/local versions (e.g. `3.2.0.dev0`,
`3.2.0+local`). Since `packaging.version` is already imported, prefer parsing
`PACKAGE_VERSION` with `parse_version(...)` (which returns a `Version`) and using
its `.minor`/`.micro` attributes to build a stable `TASK_SDK_VERSION`.
```suggestion
package_version: Version = parse_version(PACKAGE_VERSION)
TASK_SDK_VERSION = f"1.{package_version.minor}.{package_version.micro}"
```
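For illustration, a standalone sketch of the suggested approach (the `task_sdk_version` helper is hypothetical, not Airflow code; it only demonstrates why parsing beats naive string splitting):

```python
# Standalone sketch: `packaging.version.Version` normalizes pre-release and
# local-version suffixes, so minor/micro stay clean integers.
from packaging.version import Version


def task_sdk_version(package_version: str) -> str:
    """Derive a stable 1.x.y task-sdk version from an airflow-core version."""
    v = Version(package_version)
    return f"1.{v.minor}.{v.micro}"


# Pre-release and local-version suffixes are handled uniformly:
assert task_sdk_version("3.2.0") == "1.2.0"
assert task_sdk_version("3.2.0.dev0") == "1.2.0"
assert task_sdk_version("3.2.0+local") == "1.2.0"
# Naive splitting leaks the suffix instead: "3.2.0+local".split(".")
# gives ["3", "2", "0+local"], so the f-string yields "1.2.0+local".
assert "3.2.0+local".split(".")[2] == "0+local"
```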
##########
INSTALL:
##########
@@ -231,15 +265,15 @@ that are used in main CI tests and by other contributors.
There are different constraint files for different Python versions. For
example, this command will install
all basic devel requirements and requirements of Google provider as last
successfully tested for Python 3.10:
- uv pip install -e ".[devel,google]"" \
+ pip install -e ".[devel,google]"" \
Review Comment:
The first command has an extra double-quote (`".[devel,google]""`) which
makes it invalid to copy/paste. Please remove the stray quote so the
installation instructions work as written.
```suggestion
pip install -e ".[devel,google]" \
```
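A quick way to see the breakage, approximating POSIX tokenization with Python's `shlex` as a stand-in for a real shell (in an actual shell the trailing `\` continuation makes the failure mode messier, but the root cause is the same odd number of quotes):

```python
import shlex

# Three double-quotes: the last one opens a string that is never closed.
broken = 'pip install -e ".[devel,google]""'
fixed = 'pip install -e ".[devel,google]"'

try:
    shlex.split(broken)
except ValueError as exc:
    print("broken command:", exc)  # No closing quotation

print("fixed command:", shlex.split(fixed))
# ['pip', 'install', '-e', '.[devel,google]']
```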
##########
airflow-core/docs/core-concepts/operators.rst:
##########
@@ -330,4 +343,38 @@ If you need to include a Jinja template expression (e.g., ``{{ ds }}``) literall
dag=dag,
)
-This ensures the f-string processing results in a string containing the literal double braces required by Jinja, which Airflow can then template correctly before execution. Failure to do this is a common issue for beginners and can lead to errors during Dag parsing or unexpected behavior at runtime when the templating does not occur as expected.
+This ensures the f-string processing results in a string containing the literal double braces required by Jinja, which Airflow can then template correctly before execution. Failure to do this is a common issue for beginners and can lead to errors during DAG parsing or unexpected behavior at runtime when the templating does not occur as expected.
+
+Pre- and post-execute methods
+-----------------------------
+
+The ``pre_execute`` and ``post_execute`` methods are called before and after the operator is executed, respectively.
+
+For example, you can use the ``pre_execute`` method to elegantly determine if a task should be executed or not:
+
+.. code-block:: python
+
+    def _check_skipped(context: Any) -> None:
+        """
+        Check if a given task instance should be skipped if the `tasks_to_skip` Airflow Variable is a list that contains the task id.
+        """
+        tasks_to_skip = Variable.get("tasks_to_skip", deserialize_json=True)
+        if context["task"].task_id in on_ice:
Review Comment:
The example uses `on_ice` in the membership check, but that variable is
never defined in the snippet. This will raise a NameError if copied. Replace
`on_ice` with `tasks_to_skip` (or define `on_ice` explicitly) so the example is
runnable.
```suggestion
if context["task"].task_id in tasks_to_skip:
```
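For reference, a self-contained sketch of the corrected pattern (a plain dict stands in for the Airflow `context`, a local list stands in for the `tasks_to_skip` Variable, and `SkipTask` stands in for `airflow.exceptions.AirflowSkipException`, so it runs without Airflow installed):

```python
class SkipTask(Exception):
    """Stand-in for airflow.exceptions.AirflowSkipException."""


def _check_skipped(context: dict) -> None:
    # Real code would use: Variable.get("tasks_to_skip", deserialize_json=True)
    tasks_to_skip = ["extract_legacy", "load_staging"]
    if context["task"].task_id in tasks_to_skip:
        raise SkipTask(f"{context['task'].task_id} is in tasks_to_skip")


class FakeTask:
    """Minimal stand-in for a task object exposing task_id."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id


_check_skipped({"task": FakeTask("transform")})  # not listed: no error
try:
    _check_skipped({"task": FakeTask("extract_legacy")})
except SkipTask as exc:
    print("skipped:", exc)
```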
##########
airflow-core/docs/core-concepts/auth-manager/index.rst:
##########
@@ -161,35 +160,59 @@ cookie named ``_token`` before redirecting to the Airflow UI. The Airflow UI wil
.. code-block:: python
+    from airflow.api_fastapi.app import get_cookie_path
     from airflow.api_fastapi.auth.managers.base_auth_manager import COOKIE_NAME_JWT_TOKEN
     response = RedirectResponse(url="/")
     secure = request.base_url.scheme == "https" or bool(conf.get("api", "ssl_cert", fallback=""))
-    response.set_cookie(COOKIE_NAME_JWT_TOKEN, token, secure=secure)
+    response.set_cookie(COOKIE_NAME_JWT_TOKEN, token, path=get_cookie_path(), secure=secure, httponly=True)
     return response
.. note::
-    Do not set the cookie parameter ``httponly`` to ``True``. Airflow UI needs to access the JWT token from the cookie.
+    Ensure that the cookie parameter ``httponly`` is set to ``True``. The UI does not manage the token.
+Refreshing JWT Token
+''''''''''''''''''''
+Refreshing the token is an optional feature, and its availability depends on the specific implementation of the auth manager.
+The auth manager is responsible for refreshing the JWT token when it expires.
+The Airflow API uses middleware that intercepts every request and checks the validity of the JWT token.
+Token communication is handled through ``httponly`` cookies to improve security.
+When the token expires, the `JWTRefreshMiddleware <https://github.com/apache/airflow/blob/3.1.5/airflow-core/src/airflow/api_fastapi/auth/middlewares/refresh_token.py>`_ middleware calls the auth manager's ``refresh_user`` method to obtain a new token.
+
+
+To support token refresh operations, the auth manager must implement the ``refresh_user`` method.
+This method receives an expired token and must return a new valid token.
+User information is extracted from the expired token and used to generate a fresh token.
+
+An example implementation of ``refresh_user`` could be: `KeycloakAuthManager::refresh_user <https://github.com/apache/airflow/blob/3.1.5/providers/keycloak/src/airflow/providers/keycloak/auth_manager/keycloak_auth_manager.py#L113-L121>`_
+User information is derived from the ``BaseUser`` instance.
+It is important that the user object contains all the fields required to refresh the token. An example user class could be: `KeycloakAuthManagerUser(BaseUser) <https://github.com/apache/airflow/blob/3.1.5/providers/keycloak/src/airflow/providers/keycloak/auth_manager/user.pys>`_.
Review Comment:
The linked filename ends with `user.pys`, which appears to be a typo and
results in a broken link. It should likely be `user.py`.
```suggestion
`KeycloakAuthManagerUser(BaseUser) <https://github.com/apache/airflow/blob/3.1.5/providers/keycloak/src/airflow/providers/keycloak/auth_manager/user.py>`_.
```
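To make the documented refresh contract concrete, here is a toy, framework-free sketch (hand-rolled HMAC tokens for illustration only; all names here are hypothetical, and a real auth manager would delegate to its identity provider, as the Keycloak example does, rather than reimplement JWT):

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # illustration only; never hard-code real secrets


def _sign(payload_b64: bytes) -> bytes:
    # HMAC-SHA256 over the base64 payload, itself base64-encoded.
    return base64.urlsafe_b64encode(hmac.new(SECRET, payload_b64, hashlib.sha256).digest())


def issue_token(user: dict, ttl: int = 3600) -> str:
    """Encode user claims plus an expiry into a signed '<payload>.<sig>' token."""
    payload = base64.urlsafe_b64encode(json.dumps({**user, "exp": time.time() + ttl}).encode())
    return (payload + b"." + _sign(payload)).decode()


def refresh_user(expired_token: str) -> str:
    """Mimics the documented contract: take an expired token, verify its
    signature, extract the user claims, and return a fresh token."""
    payload_b64, sig = expired_token.rsplit(".", 1)
    payload_b64 = payload_b64.encode()
    if not hmac.compare_digest(_sign(payload_b64).decode(), sig):
        raise ValueError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    claims.pop("exp")  # user info survives; the expiry is reissued
    return issue_token(claims)


old = issue_token({"sub": "alice", "name": "Alice"}, ttl=-1)  # already expired
new = refresh_user(old)
claims = json.loads(base64.urlsafe_b64decode(new.split(".", 1)[0]))
print(claims["sub"])  # alice
```

The key point mirrored from the docs: the user object embedded in the token must carry every field needed to mint the replacement.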
##########
.github/actions/post_tests_failure/action.yml:
##########
@@ -22,21 +22,21 @@ runs:
using: "composite"
steps:
- name: "Upload airflow logs"
- uses: actions/upload-artifact@v4
+ uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02  # v4.6.2
with:
name: airflow-logs-${{env.JOB_ID}}
path: './files/airflow_logs*'
retention-days: 7
if-no-files-found: ignore
- name: "Upload container logs"
- uses: actions/upload-artifact@v4
+ uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02  # v4.6.2
with:
name: container-logs-${{env.JOB_ID}}
path: "./files/container_logs*"
retention-days: 7
if-no-files-found: ignore
- name: "Upload other logs"
- uses: actions/upload-artifact@v4
+ uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02  # v4.6.2
with:
name: container-logs-${{env.JOB_ID}}
Review Comment:
The 'Upload other logs' step sets the artifact name to
`container-logs-${{env.JOB_ID}}`, which duplicates the name used by the
container logs artifact. This can overwrite or collide depending on artifact
settings/runner behavior. Use a distinct name (e.g. `other-logs-${{ env.JOB_ID
}}`) to keep artifacts separate.
```suggestion
name: other-logs-${{env.JOB_ID}}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]