This is an automated email from the ASF dual-hosted git repository.
jscheffl pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/main by this push:
new 02107a49b0b Enhance Edge3 Provider docs (#49859)
02107a49b0b is described below
commit 02107a49b0b08494aa1d3c646a9225d3a319b2cd
Author: Jens Scheffler <[email protected]>
AuthorDate: Thu May 8 14:50:47 2025 +0200
Enhance Edge3 Provider docs (#49859)
* Enhance Edge3 Provider docs
* Complete Edge docs revision
* Adjust and extend documentation after PR 49915
* Review comments
* Extend docs after PR 50278
---
.pre-commit-config.yaml | 4 +-
providers/edge3/docs/architecture.rst | 189 ++++++++++++++++
providers/edge3/docs/deployment.rst | 173 ++++++++++++++
providers/edge3/docs/edge_executor.rst | 252 ++-------------------
.../edge3/docs/img/distributed_architecture.svg | 4 +
providers/edge3/docs/img/edge_package.svg | 4 +
providers/edge3/docs/index.rst | 13 +-
providers/edge3/docs/install_on_windows.rst | 24 +-
providers/edge3/docs/ui_plugin.rst | 66 ++++++
providers/edge3/docs/why_edge.rst | 53 +++++
providers/edge3/provider.yaml | 14 +-
.../airflow/providers/edge3/get_provider_info.py | 2 +-
12 files changed, 540 insertions(+), 258 deletions(-)
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 8e5c8cfdab0..3f3b10f1dcb 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -314,7 +314,7 @@ repos:
exclude:
material-icons\.css$|^images/.*$|^RELEASE_NOTES\.txt$|^.*package-lock\.json$|^.*/kinglear\.txt$|^.*pnpm-lock\.yaml$|.*/dist/.*
args:
- --ignore-words=docs/spelling_wordlist.txt
- -
--skip=providers/.*/src/airflow/providers/*/*.rst,providers/*/docs/changelog.rst,docs/*/commits.rst,providers/*/docs/commits.rst,providers/*/*/docs/commits.rst,docs/apache-airflow/tutorial/pipeline_example.csv,*.min.js,*.lock,INTHEWILD.md
+ -
--skip=providers/.*/src/airflow/providers/*/*.rst,providers/*/docs/changelog.rst,docs/*/commits.rst,providers/*/docs/commits.rst,providers/*/*/docs/commits.rst,docs/apache-airflow/tutorial/pipeline_example.csv,*.min.js,*.lock,INTHEWILD.md,*.svg
- --exclude-file=.codespellignorelines
- repo: https://github.com/woodruffw/zizmor-pre-commit
rev: v1.5.2
@@ -648,7 +648,7 @@ repos:
^.*commits\.(rst|txt)$|
^.*RELEASE_NOTES\.rst$|
^contributing-docs/03_contributors_quick_start\.rst$|
- ^.*\.(png|gif|jp[e]?g|tgz|lock)$|
+ ^.*\.(png|gif|jp[e]?g|svg|tgz|lock)$|
git|
^airflow-core/newsfragments/43349\.significant\.rst$|
^airflow-core/newsfragments/41368\.significant\.rst$|
diff --git a/providers/edge3/docs/architecture.rst
b/providers/edge3/docs/architecture.rst
new file mode 100644
index 00000000000..6eb12742469
--- /dev/null
+++ b/providers/edge3/docs/architecture.rst
@@ -0,0 +1,189 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+Edge Provider Architecture
+==========================
+
+Airflow consists of several components, which are connected as shown in the following diagram. The Edge Worker, which is
+deployed outside of the central Airflow cluster, is connected via HTTP(s) to the API server of the Airflow cluster:
+
+.. graphviz::
+
+ digraph A{
+ rankdir="TB"
+ node[shape="rectangle", style="rounded"]
+
+
+ subgraph cluster {
+ label="Cluster";
+ {rank = same; dag; database}
+ {rank = same; workers; scheduler; api}
+
+ workers[label="(Central) Workers"]
+ scheduler[label="Scheduler"]
+ api[label="API server"]
+ database[label="Database"]
+ dag[label="DAG files"]
+
+ api->workers
+ api->database
+
+ workers->dag
+ workers->database
+
+ scheduler->database
+ }
+
+ subgraph edge_worker_subgraph {
+ label="Edge site";
+ {rank = same; edge_worker; edge_dag}
+ edge_worker[label="Edge Worker"]
+ edge_dag[label="DAG files (Remote copy)"]
+
+ edge_worker->edge_dag
+ }
+
+ edge_worker->api[label="HTTP(s)"]
+ }
+
+* **Workers** - Execute the assigned tasks - most standard setups have local or centralized workers, e.g. via Celery
+* **Edge Workers** - Special workers which pull tasks via HTTP(s), a feature provided by this provider package
+* **Scheduler** - Responsible for adding the necessary tasks to the queue. The EdgeExecutor runs as a module inside the scheduler.
+* **API server** - The HTTP REST API server provides access to DAG/task status information. The required end-points are
+ provided by the Edge provider plugin. The Edge Worker uses this API to pull tasks and send back the results.
+* **Database** - Contains information about the status of tasks, Dags, Variables, connections, etc.
+
+In detail the parts of the Edge provider are deployed as follows:
+
+.. image:: img/edge_package.svg
+ :alt: Overview and communication of Edge Provider modules
+
+* **EdgeExecutor** - The EdgeExecutor is running inside the core Airflow
scheduler. It is responsible for
+ scheduling tasks and sending them to the Edge job queue in the database. The
EdgeExecutor is a subclass of the
+ ``airflow.executors.base_executor.BaseExecutor`` class. To activate the
EdgeExecutor, you need to set the
+ ``executor`` configuration option in the ``airflow.cfg`` file to
+ ``airflow.providers.edge3.executors.EdgeExecutor``. For more details see :doc:`edge_executor`. Note that
+ multiple executors can also be used in parallel together with the EdgeExecutor.
+* **API server** - The API server provides REST endpoints to the web UI and serves static files. The
+ Edge provider adds a plugin that provides an additional REST API for the Edge Worker as well as UI elements to
+ manage workers (currently Airflow 2.10 only).
+ The API server is responsible for handling requests from the Edge Worker and sending back the results. To
+ activate the API endpoints, you need to set the ``api_enabled`` configuration option in the ``edge`` section of the
+ ``airflow.cfg`` file to ``True``. The API endpoints for edge are not enabled by default.
+ For more details see :doc:`ui_plugin`.
+* **Database** - The Airflow meta database is used to store the status of tasks, Dags, Variables, connections,
+ etc. The Edge provider uses the database to store the status of the Edge Worker instances and the tasks that
+ are assigned to them. The database is also used to store the results of the tasks that are executed by the
+ Edge Worker. Setup of the needed tables and migration is done automatically when the provider package is deployed.
+* **Edge Worker** - The Edge Worker is a lightweight process that runs on the
edge device. It is responsible for
+ pulling tasks from the API server and executing them. The Edge Worker is a
standalone process that can be
+ deployed on any machine that has access to the API server. It is designed to
be lightweight and easy to
+ deploy. The Edge Worker is implemented as a command line tool that can be
started with the ``airflow edge worker``
+ command. For more details see :doc:`deployment`.
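+The executor activation described above amounts to a one-line configuration change; a minimal
+sketch of the relevant ``airflow.cfg`` fragment (illustrative only, see :doc:`edge_executor` for the full reference):

```ini
[core]
# Activate the EdgeExecutor; multiple executors may be listed comma-separated
executor = airflow.providers.edge3.executors.EdgeExecutor
```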
+
+Edge Worker State Model
+-----------------------
+
+Each Edge Worker is tracked by the API server such that it is known which workers are currently active. This
+serves monitoring as well as administrators, since otherwise distributed monitoring and tracking would be hard to
+achieve. It also allows central management for administrative maintenance.
+
+Workers send regular heartbeats to the API server to indicate that they are
still alive. The heartbeats are used to
+determine the state of the worker.
+
+The following states are used to track the worker:
+
+.. graphviz::
+
+ digraph edge_worker_state {
+ node [shape=circle];
+
+ STARTING[label="starting"];
+ IDLE[label="idle"];
+ RUNNING[label="running"];
+ TERMINATING[label="terminating"];
+ OFFLINE[label="offline"];
+ UNKNOWN[label="unknown"];
+ MAINTENANCE_REQUEST[label="maintenance request"];
+ MAINTENANCE_PENDING[label="maintenance pending"];
+ MAINTENANCE_MODE[label="maintenance mode"];
+ MAINTENANCE_EXIT[label="maintenance exit"];
+ OFFLINE_MAINTENANCE[label="offline maintenance"];
+
+ STARTING->IDLE[label="initialization"];
+ IDLE->RUNNING[label="new task"];
+ RUNNING->IDLE[label="all tasks completed"];
+ IDLE->MAINTENANCE_REQUEST[label="triggered by admin"];
+ RUNNING->MAINTENANCE_REQUEST[label="triggered by admin"];
+ MAINTENANCE_REQUEST->MAINTENANCE_PENDING[label="if running tasks > 0"];
+ MAINTENANCE_REQUEST->MAINTENANCE_MODE[label="if running tasks = 0"];
+ MAINTENANCE_PENDING->MAINTENANCE_MODE[label="running tasks = 0"];
+ MAINTENANCE_PENDING->MAINTENANCE_EXIT[label="triggered by admin"];
+ MAINTENANCE_MODE->MAINTENANCE_EXIT[label="triggered by admin"];
+ MAINTENANCE_EXIT->RUNNING[label="if running tasks > 0"];
+ MAINTENANCE_EXIT->IDLE[label="if running tasks = 0"];
+ IDLE->OFFLINE[label="on clean shutdown"];
+ RUNNING->TERMINATING[label="on clean shutdown if running tasks > 0"];
+ TERMINATING->OFFLINE[label="on clean shutdown if running tasks = 0"];
+ }
+
+See also
+https://github.com/apache/airflow/blob/main/providers/edge3/src/airflow/providers/edge3/models/edge_worker.py#L45
+for detailed documentation of all states of the Edge Worker.
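+As an illustration only, the transitions in the diagram above can be written down as a small
+transition table; this is a sketch mirroring the documented state diagram, not the provider's
+actual implementation:

```python
# Transition table mirroring the state diagram above (illustrative sketch,
# not the provider's actual implementation).
TRANSITIONS = {
    "starting": {"idle"},
    "idle": {"running", "maintenance request", "offline"},
    "running": {"idle", "maintenance request", "terminating"},
    "maintenance request": {"maintenance pending", "maintenance mode"},
    "maintenance pending": {"maintenance mode", "maintenance exit"},
    "maintenance mode": {"maintenance exit"},
    "maintenance exit": {"running", "idle"},
    "terminating": {"offline"},
}


def is_valid_transition(src: str, dst: str) -> bool:
    """Check whether the state diagram allows moving from src to dst."""
    return dst in TRANSITIONS.get(src, set())


print(is_valid_transition("idle", "running"))     # True
print(is_valid_transition("offline", "running"))  # False
```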
+
+Feature Backlog Edge Provider
+-----------------------------
+
+The current version of the EdgeExecutor is released with known limitations. It
will mature over time.
+
+The following features are known missing and will be implemented in increments:
+
+- API token per worker: Today only a global API token is available
+- Edge Worker Plugin
+
+ - Make the plugin work on Airflow 3.0, depending on AIP-68
+ - Overview about queues / jobs per queue
+ - Allow starting the Edge Worker REST API separately from the api-server
+ - Add hints on how to set up an additional worker
+
+- Edge Worker CLI
+
+ - Use WebSockets instead of HTTP calls for communication
+ - Send logs also to TaskFileHandler if external logging services are used
+ - Integration into telemetry to send metrics from remote site
+ - Publish system metrics with heartbeats (CPU, Disk space, RAM, Load)
+ - Be more liberal e.g. on patch versions. Currently an exact version match is required.
+ (In the current state, if versions do not match, the worker will gracefully shut
+ down once running jobs are completed; no new jobs will be started.)
+
+- Tests
+
+ - System tests in GitHub that test the deployment of the worker with a Dag execution
+ - Test/Support on Windows for Edge Worker
+
+- Scaling test - Check and define boundaries of workers/jobs. Today it is known to
+ scale to around 50 workers. This is not a hard limit but just reported experience.
+- Load tests - impact of scaled execution and code optimization
+- Incremental logs during task execution can be served without a shared log disk on the api-server
+- Reduce dependencies during execution: Today the worker depends on the airflow core with a lot
+ of transitive dependencies. The target is to reduce the dependencies to a minimum, e.g. the
+ Task SDK and providers only.
+
+- Documentation
+
+ - Provide scripts and guides to install edge components as service (systemd)
+ - Extend Helm-Chart for needed support
+ - Provide an example docker compose for worker setup
diff --git a/providers/edge3/docs/deployment.rst
b/providers/edge3/docs/deployment.rst
new file mode 100644
index 00000000000..f608cb7f101
--- /dev/null
+++ b/providers/edge3/docs/deployment.rst
@@ -0,0 +1,173 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+Edge Worker Deployment
+======================
+
+Edge Workers can be deployed outside of the central Airflow infrastructure. They
+are connected to the Airflow API server via HTTP(s). The Edge Worker is a
+lightweight component that can be deployed on any machine that has outbound
+HTTP(s) access to the Airflow API server. It is designed to be easy to deploy
+and allows you to run Airflow tasks on machines that are not part of your main
+data center, e.g. edge servers. This also allows deploying only a reduced set of
+dependencies on the edge worker.
+
+Here are a few imperative requirements for your workers:
+
+- ``airflow`` needs to be installed, and the Airflow CLI needs to be in the
path. This includes
+ the Task SDK as well as the edge3 provider package.
+- Airflow configuration settings should be homogeneous across the cluster and
on the edge site
+- Operators that are executed on the Edge Worker need to have their dependencies
+  met in that context. Please take a look at the respective provider package
+  documentation
+- The worker needs to have access to the ``DAGS_FOLDER``, and you need to
+  synchronize the filesystems by your own means. A common setup would be to
+  store your ``DAGS_FOLDER`` in a Git repository and sync it across machines using
+  Chef, Puppet, Ansible, or whatever you use to configure machines in your
+  environment. If all your boxes have a common mount point, having your
+  pipeline files shared there should work as well
+
+
+The minimum Airflow configuration settings to make the Edge Worker run are:
+
+- Section ``[core]``
+
+ - ``executor``: The executor must be set to, or include, ``airflow.providers.edge3.executors.EdgeExecutor``
+ - ``internal_api_secret_key``: An encryption key must be set on the api-server and the Edge Worker component as
+ a shared secret to authenticate traffic. It should be a random string like the fernet key
+ (but preferably not the same).
+
+- Section ``[edge]``
+
+ - ``api_enabled``: Must be set to true. It is intentionally disabled so as not to expose the
+ API endpoint by default. This is the endpoint the worker connects to.
+ In a future release a dedicated API server can be started.
+ - ``api_url``: Must be set to the URL that exposes the API endpoint as it is reachable from the
+ worker. Typically this looks like ``https://your-hostname-and-port/edge_worker/v1/rpcapi``.
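+Putting the options above together, a minimal sketch of the relevant ``airflow.cfg`` fragment
+could look like this (hostname and secret are placeholders, not real values):

```ini
[core]
executor = airflow.providers.edge3.executors.EdgeExecutor
# Shared secret between api-server and Edge Worker (placeholder value)
internal_api_secret_key = some-random-secret-string

[edge]
# Edge API endpoints are disabled by default and must be enabled explicitly
api_enabled = True
# URL of the Edge API as reachable from the worker (placeholder hostname)
api_url = https://your-hostname-and-port/edge_worker/v1/rpcapi
```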
+
+To kick off a worker, you need to set up Airflow and start the worker
+subcommand
+
+.. code-block:: bash
+
+ airflow edge worker
+
+Your worker should start picking up tasks as soon as they get fired in
+its direction. To stop a worker running on a machine you can use:
+
+.. code-block:: bash
+
+ airflow edge stop
+
+It will try to stop the worker gracefully by sending a ``SIGINT`` signal to the main
+process and waiting until all running tasks are completed. In a console you can also use
+``Ctrl-C`` to stop the worker.
+
+If you want to monitor the remote activity and workers, use the UI plugin that
+is included in the provider package: install it on the webserver and use the
+"Admin" - "Edge Worker Hosts" and "Edge Worker Jobs" pages.
+(Note: The plugin is not ported to the Airflow 3.0 web UI at the time of writing.)
+
+If you want to check the status of the worker via the CLI you can use the command
+
+.. code-block:: bash
+
+ airflow edge status
+
+Some caveats:
+
+- Tasks can consume resources. Make sure your worker has enough resources to
run ``worker_concurrency`` tasks
+- Make sure that the ``pool_slots`` of a task matches with the ``worker_concurrency`` of the worker.
+ See also :ref:`edge_executor:concurrency_slots`.
+- Queue names are limited to 256 characters
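+To illustrate the ``pool_slots`` / ``worker_concurrency`` caveat above, here is a small
+self-contained Python sketch of the slot accounting (illustrative only, not the provider's
+code): a task occupying more slots than the worker has free cannot be picked up, and a task
+whose ``pool_slots`` exceed ``worker_concurrency`` can never run on that worker:

```python
def can_accept(worker_concurrency: int, occupied_slots: int, task_pool_slots: int) -> bool:
    """Return True if a task needing task_pool_slots fits into the worker's free slots."""
    free_slots = worker_concurrency - occupied_slots
    return task_pool_slots <= free_slots


# A worker with concurrency 8 and 6 busy slots cannot take a 4-slot task ...
print(can_accept(8, 6, 4))   # False
# ... but an idle worker can.
print(can_accept(8, 0, 4))   # True
# A task with pool_slots > worker_concurrency never fits on this worker.
print(can_accept(8, 0, 12))  # False
```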
+
+See :doc:`apache-airflow:administration-and-deployment/modules_management` for
details on how Python and Airflow manage modules.
+
+.. _deployment:maintenance:
+
+Worker Maintenance Mode
+-----------------------
+
+Sometimes infrastructure needs to be maintained. The Edge Worker provides a
+maintenance mode to:
+
+- Stop accepting new tasks
+- Drain all ongoing work gracefully
+
+Please also note that if the worker detects that the Airflow or Edge provider package version
+differs from the one running on the API server, it will stop accepting new tasks and shut down gracefully.
+This prevents running tasks with different versions of the code.
+
+Worker status can be checked via the web UI in the "Admin" - "Edge Worker
Hosts" page.
+
+.. image:: img/worker_hosts.png
+
+.. note::
+
+ At the time of writing, the web UI to see edge jobs and manage workers has
+ not been ported to Airflow 3.0.
+ Until this is available you can use the CLI commands as described in
:ref:`deployment:maintenance-mgmt-cli`.
+
+
+Worker maintenance can also be triggered via the CLI command on the machine
that runs the worker.
+
+.. code-block:: bash
+
+ airflow edge maintenance --comments "Some comments for the maintenance" on
+
+This will stop the local worker instance from accepting new tasks and will
complete running tasks.
+If you add the command argument ``--wait``, the CLI will wait until all
+running tasks are completed before returning.
+
+If you want to know the status of your local worker while waiting on maintenance, you can
+use the command
+
+.. code-block:: bash
+
+ airflow edge status
+
+This will show the status of the local worker instance as JSON and the tasks
running on it.
+
+The status and maintenance comments will also be shown in the web UI
+in the "Admin" - "Edge Worker Hosts" page.
+
+.. image:: img/worker_maintenance.png
+
+The local worker instance can be started to fetch new tasks via the command
+
+.. code-block:: bash
+
+ airflow edge maintenance off
+
+This will resume the worker, and it will start accepting tasks again.
+
+.. _deployment:maintenance-mgmt-cli:
+
+Worker Maintenance Management CLI
+---------------------------------
+
+Besides the CLI command to trigger maintenance on the local worker instance,
there are also additional commands to
+manage the maintenance of all workers in the cluster. These commands can be
used to trigger maintenance
+on all workers in the cluster or to check the status of all workers in the
cluster.
+
+This set of commands needs database access and can only be called on the central Airflow
+instance. The commands are:
+
+- ``airflow edge list-workers``: List all workers in the cluster
+- ``airflow edge remote-edge-worker-request-maintenance``: Request a remote
edge worker to enter maintenance mode
+- ``airflow edge remote-edge-worker-update-maintenance-comment``: Update the maintenance comment for a remote edge worker
+- ``airflow edge remote-edge-worker-exit-maintenance``: Request a remote edge
worker to exit maintenance mode
+- ``airflow edge shutdown-remote-edge-worker``: Shuts down a remote edge
worker gracefully
+- ``airflow edge remove-remote-edge-worker``: Remove a worker instance from
the cluster
diff --git a/providers/edge3/docs/edge_executor.rst
b/providers/edge3/docs/edge_executor.rst
index 5b1df582ab0..e3af6aae107 100644
--- a/providers/edge3/docs/edge_executor.rst
+++ b/providers/edge3/docs/edge_executor.rst
@@ -20,159 +20,12 @@ Edge Executor
``EdgeExecutor`` is an option if you want to distribute tasks to workers
distributed in different locations.
You can use it also in parallel with other executors if needed. Change your
``airflow.cfg`` to point
-the executor parameter to ``EdgeExecutor`` and provide the related settings.
+the executor parameter to ``EdgeExecutor`` and provide the related settings. The ``EdgeExecutor`` is the component
+that schedules tasks to the edge workers. The edge workers need to be set up separately as described in :doc:`deployment`.
The configuration parameters of the Edge Executor can be found in the Edge
provider's :doc:`configurations-ref`.
-Here are a few imperative requirements for your workers:
-
-- ``airflow`` needs to be installed, and the Airflow CLI needs to be in the
path
-- Airflow configuration settings should be homogeneous across the cluster and
on the edge site
-- Operators that are executed on the Edge Worker need to have their
dependencies
- met in that context. Please take a look to the respective provider package
- documentations
-- The worker needs to have access to its ``DAGS_FOLDER``, and you need to
- synchronize the filesystems by your own means. A common setup would be to
- store your ``DAGS_FOLDER`` in a Git repository and sync it across machines
using
- Chef, Puppet, Ansible, or whatever you use to configure machines in your
- environment. If all your boxes have a common mount point, having your
- pipelines files shared there should work as well
-
-
-Minimum configuration for the Edge Worker to make it running is:
-
-- Section ``[core]``
-
- - ``executor``: Executor must be set or added to be
``airflow.providers.edge3.executors.EdgeExecutor``
- - ``internal_api_secret_key``: An encryption key must be set on webserver
and Edge Worker component as
- shared secret to authenticate traffic. It should be a random string like
the fernet key
- (but preferably not the same).
-
-- Section ``[edge]``
-
- - ``api_enabled``: Must be set to true. It is disabled intentionally to not
expose
- the endpoint by default. This is the endpoint the worker connects to.
- In a future release a dedicated API server can be started.
- - ``api_url``: Must be set to the URL which exposes the web endpoint
-
-To kick off a worker, you need to setup Airflow and kick off the worker
-subcommand
-
-.. code-block:: bash
-
- airflow edge worker
-
-Your worker should start picking up tasks as soon as they get fired in
-its direction. To stop a worker running on a machine you can use:
-
-.. code-block:: bash
-
- airflow edge stop
-
-It will try to stop the worker gracefully by sending ``SIGINT`` signal to main
-process as and wait until all running tasks are completed.
-
-If you want to monitor the remote activity and worker, use the UI plugin which
-is included in the provider package and install it on the webserver and use the
-"Admin" - "Edge Worker Hosts" and "Edge Worker Jobs" pages.
-(Note: The plugin is not ported to Airflow 3.0 web UI at time of writing)
-
-If you want to check status of the worker via CLI you can use the command
-
-.. code-block:: bash
-
- airflow edge status
-
-Some caveats:
-
-- Tasks can consume resources. Make sure your worker has enough resources to
run ``worker_concurrency`` tasks
-- Make sure that the ``pool_slots`` of a Tasks matches with the
``worker_concurrency`` of the worker
-- Queue names are limited to 256 characters
-
-See :doc:`apache-airflow:administration-and-deployment/modules_management` for
details on how Python and Airflow manage modules.
-
-Current Limitations Edge Executor
----------------------------------
-
-If you plan to use the Edge Executor / Worker in the current stage you need to
ensure you test properly
-before use. The following features have been initially tested and are working:
-
-- Some core operators
-
- - ``BashOperator``
- - ``PythonOperator``
- - ``@task`` decorator
- - ``@task.branch`` decorator
- - ``@task.virtualenv`` decorator
- - ``@task.bash`` decorator
- - Dynamic Mapped Tasks
- - XCom read/write
- - Variable and Connection access
- - Setup and Teardown tasks
-
-- Some known limitations
-
- - Tasks that require DB access will fail - no DB connection from remote site
is possible
- (which is the default in Airflow 3.0)
- - This also means that some direct Airflow API via Python is not possible
(e.g. airflow.models.*)
- - Log upload will only work if you use a single web server instance or they
need to share one log file volume.
- Logs are uploaded in chunks and are transferred via API. If you use
multiple webservers w/o a shared log volume
- the logs will be scattered across the webserver instances.
- - Performance: No extensive performance assessment and scaling tests have
been made. The edge executor package is
- optimized for stability. This will be incrementally improved in future
releases. Setups have reported stable
- operation with ~50 workers until now. Note that executed tasks require
more webserver API capacity.
-
-
-Architecture
-------------
-
-.. graphviz::
-
- digraph A{
- rankdir="TB"
- node[shape="rectangle", style="rounded"]
-
-
- subgraph cluster {
- label="Cluster";
- {rank = same; dag; database}
- {rank = same; workers; scheduler; web}
-
- workers[label="(Central) Workers"]
- scheduler[label="Scheduler"]
- web[label="Web server"]
- database[label="Database"]
- dag[label="DAG files"]
-
- web->workers
- web->database
-
- workers->dag
- workers->database
-
- scheduler->dag
- scheduler->database
- }
-
- subgraph edge_worker_subgraph {
- label="Edge site";
- edge_worker[label="Edge Worker"]
- edge_dag[label="DAG files (Remote)"]
-
- edge_worker->edge_dag
- }
-
- edge_worker->web[label="HTTP(s)"]
- }
-
-Airflow consist of several components:
-
-* **Workers** - Execute the assigned tasks - most standard setup has local or
centralized workers, e.g. via Celery
-* **Edge Workers** - Special workers which pull tasks via HTTP as provided as
feature via this provider package
-* **Scheduler** - Responsible for adding the necessary tasks to the queue
-* **Web server** - HTTP Server provides access to DAG/task status information
-* **Database** - Contains information about the status of tasks, DAGs,
Variables, connections, etc.
-
+To understand the setup of the Edge Executor, please also take a look at :doc:`architecture`.
.. _edge_executor:queue:
@@ -197,6 +50,8 @@ could take thousands of tasks without a problem), or from an
environment
perspective (you want a worker running from a specific location where required
infrastructure is available).
+.. _edge_executor:concurrency_slots:
+
Concurrency slot handling
-------------------------
@@ -239,93 +94,14 @@ Here is an example setting pool_slots for a task:
task_with_template()
-Worker maintenance
-------------------
-
-Sometimes infrastructure needs to be maintained. The Edge Worker provides a
-maintenance mode to
-- Stop accepting new tasks
-- Drain all ongoing work gracefully
-
-Worker status can be checked via the web UI in the "Admin" - "Edge Worker
Hosts" page.
-
-.. image:: img/worker_hosts.png
-
-.. note::
-
- As of time of writing the web UI to see edge jobs and manage workers is
not ported to Airflow 3.0
-
-
-Worker maintenance can also be triggered via the CLI command
-
-.. code-block:: bash
-
- airflow edge maintenance --comments "Some comments for the maintenance" on
-
-This will stop the worker from accepting new tasks and will complete running
tasks.
-If you add the command argument ``--wait`` the CLI will wait until all
-running tasks are completed before return.
-
-If you want to know the status of a worker while waiting on maintenance you can
-use the command
-
-.. code-block:: bash
-
- airflow edge status
-
-This will show the status of the worker as JSON and the tasks running on it.
-
-The status and maintenance comments will also be shown in the web UI
-in the "Admin" - "Edge Worker Hosts" page.
-
-.. image:: img/worker_maintenance.png
-
-The worker can be started to fetch new tasks via the command
-
-.. code-block:: bash
-
- airflow edge maintenance off
-
-This will start the worker again and it will start accepting tasks again.
-
-
-Feature Backlog of MVP to Release Readiness
--------------------------------------------
-
-The current version of the EdgeExecutor is a MVP (Minimum Viable Product). It
will mature over time.
-
-The following features are known missing and will be implemented in increments:
-
-- API token per worker: Today there is a global API token available only
-- Edge Worker Plugin
-
- - Overview about queues / jobs per queue
- - Allow starting Edge Worker REST API separate to webserver
- - Add some hints how to setup an additional worker
-
-- Edge Worker CLI
-
- - Use WebSockets instead of HTTP calls for communication
- - Send logs also to TaskFileHandler if external logging services are used
- - Integration into telemetry to send metrics from remote site
- - Publish system metrics with heartbeats (CPU, Disk space, RAM, Load)
- - Be more liberal e.g. on patch version. Currently requires exact version
match
- (In current state if versions do not match, the worker will gracefully shut
- down when jobs are completed, no new jobs will be started)
-
-- Tests
-
- - Integration tests in Github
- - Test/Support on Windows for Edge Worker
-
-- Scaling test - Check and define boundaries of workers/jobs. Today it is
known to
- scale into a range of 50 workers. This is not a hard limit but just an
experience reported.
-- Load tests - impact of scaled execution and code optimization
-- Incremental logs during task execution can be served w/o shared log disk on
webserver
+Current Limitations Edge Executor
+---------------------------------
-- Documentation
+- Some known limitations
- - Describe more details on deployment options and tuning
- - Provide scripts and guides to install edge components as service (systemd)
- - Extend Helm-Chart for needed support
- - Provide an example docker compose for worker setup
+ - Log upload will only work if you use a single web server instance, or the instances need to share one log file volume.
+ Logs are uploaded in chunks and are transferred via API. If you use multiple webservers without a shared log volume,
+ the logs will be scattered across the webserver instances.
+ - Performance: No extensive performance assessment and scaling tests have been made. The edge executor package is
+ optimized for stability. This will be incrementally improved in future releases. Setups have reported stable
+ operation with ~50 workers until now. Note that executed tasks require more webserver API capacity.
diff --git a/providers/edge3/docs/img/distributed_architecture.svg
b/providers/edge3/docs/img/distributed_architecture.svg
new file mode 100644
index 00000000000..1bbb56c6fce
--- /dev/null
+++ b/providers/edge3/docs/img/distributed_architecture.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than draw.io -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" style="background-color: rgb(255, 255,
255);" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1313px"
height="650px" viewBox="-0.5 -0.5 1313 650" content="<mxfile
host="cwiki.apache.org" modified="2025-04-27T20:09:54.050Z"
agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:136.0) Gecko/20100101
Firefox/136.0" etag="FpFJgG-N56l8xpwPFAtq"
version="24.4.0" type="atlas" scale [...]
diff --git a/providers/edge3/docs/img/edge_package.svg
b/providers/edge3/docs/img/edge_package.svg
new file mode 100644
index 00000000000..24408170c9c
--- /dev/null
+++ b/providers/edge3/docs/img/edge_package.svg
@@ -0,0 +1,4 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!-- Do not edit this file with editors other than draw.io -->
+<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
+<svg xmlns="http://www.w3.org/2000/svg" style="background-color: rgb(255, 255,
255);" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="1091px"
height="561px" viewBox="-0.5 -0.5 1091 561" content="<mxfile
host="cwiki.apache.org" modified="2025-04-27T20:17:45.388Z"
agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:136.0) Gecko/20100101
Firefox/136.0" etag="gPkWIMjSOMdvw7VuzY7t"
version="24.4.0" type="atlas" scale [...]
diff --git a/providers/edge3/docs/index.rst b/providers/edge3/docs/index.rst
index 93904372e55..35328e9c98d 100644
--- a/providers/edge3/docs/index.rst
+++ b/providers/edge3/docs/index.rst
@@ -28,7 +28,11 @@
Home <self>
Changelog <changelog>
Security <security>
- Installation on Windows <install_on_windows>
+ Why use Edge Worker <why_edge>
+ Architecture <architecture>
+ Edge Worker Deployment <deployment>
+ Edge UI Plugin <ui_plugin>
+ Worker on Windows <install_on_windows>
.. toctree::
@@ -47,6 +51,13 @@
Configuration <configurations-ref>
CLI <cli-ref>
Python API <_api/airflow/providers/edge3/index>
+
+.. toctree::
+ :hidden:
+ :maxdepth: 1
+ :caption: Resources
+
+ Example DAGs
<https://github.com/apache/airflow/tree/providers-edge3/|version|/providers/edge3/src/airflow/providers/edge3/example_dags>
PyPI Repository <https://pypi.org/project/apache-airflow-providers-edge3/>
Installing from sources <installing-providers-from-sources>
diff --git a/providers/edge3/docs/install_on_windows.rst
b/providers/edge3/docs/install_on_windows.rst
index 04841c5486b..5ac0cf61784 100644
--- a/providers/edge3/docs/install_on_windows.rst
+++ b/providers/edge3/docs/install_on_windows.rst
@@ -26,31 +26,25 @@ Install Edge Worker on Windows
due to Python OS restrictions and is currently of Proof-of-Concept quality.
-The setup was tested on Windows 10 with Python 3.12.8, 64-bit.
+The setup was tested on Windows 10 with Python 3.12.8, 64-bit. The backend used for testing was Airflow 2.10.5.
To set up an instance of the Edge Worker on Windows, follow the steps below:
1. Install Python 3.9 or higher.
-2. Create an empty folder as base to start with. In our example it is
``C:\\Airflow``.
-3. Start Shell/Command Line in ``C:\\Airflow`` and create a new virtual
environment via: ``python -m venv venv``
-4. Activate the virtual environment via: ``venv\\Scripts\\activate.bat``
+2. Create an empty folder as a base to start with. In our example it is ``C:\Airflow``.
+3. Start Shell/Command Line in ``C:\Airflow`` and create a new virtual
environment via: ``python -m venv venv``
+4. Activate the virtual environment via: ``venv\Scripts\activate.bat``
5. Install Edge provider using the Airflow constraints as of your Airflow
version via
``pip install apache-airflow-providers-edge3 --constraint
https://raw.githubusercontent.com/apache/airflow/constraints-2.10.5/constraints-3.12.txt``.
- (or alternative build and copy the wheel of the edge provider to the folder
``C:\\Airflow``.
- This document used
``apache_airflow_providers_edge-0.9.7rc0-py3-none-any.whl``, install the wheel
file with the
- Airflow constraints matching your Airflow and Python version:
- ``pip install apache_airflow_providers_edge-0.9.7rc0-py3-none-any.whl
apache-airflow==2.10.5 virtualenv --constraint
https://raw.githubusercontent.com/apache/airflow/constraints-2.10.5/constraints-3.12.txt``)
-6. Create a new folder ``dags`` in ``C:\\Airflow`` and copy the relevant DAG
files in it.
- (At least the DAG files which should be executed on the edge alongside the
dependencies. For testing purposes
- the DAGs from the ``apache-airflow`` repository can be used located in
-
<https://github.com/apache/airflow/tree/main/providers/edge3/src/airflow/providers/edge3/example_dags>.)
+6. Create a new folder ``dags`` in ``C:\Airflow`` and copy the relevant DAG files into it.
+   (At least the DAG files that should be executed on the edge, alongside their dependencies.)
7. Collect needed parameters from your running Airflow backend, at least the
following:
- ``edge`` / ``api_url``: The HTTP(s) endpoint where the Edge Worker
connects to
- - ``core`` / ``internal_api_secret_key``: The shared secret key between the
webserver and the Edge Worker
+ - ``core`` / ``internal_api_secret_key``: The shared secret key between the
api-server and the Edge Worker
- Any proxy details if applicable for your environment.
8. Create a worker start script to prevent repeated typing. Create a new file
``start_worker.bat`` in
- ``C:\\Airflow`` with the following content - replace with your settings:
+ ``C:\Airflow`` with the following content - replace with your settings:
.. code-block:: bash
@@ -59,7 +53,7 @@ To setup a instance of Edge Worker on Windows, you need to
follow the steps belo
set AIRFLOW__LOGGING__BASE_LOG_FOLDER=edge_logs
set
AIRFLOW__EDGE__API_URL=https://your-hostname-and-port/edge_worker/v1/rpcapi
set
AIRFLOW__CORE__EXECUTOR=airflow.providers.edge3.executors.edge_executor.EdgeExecutor
- set AIRFLOW__CORE__INTERNAL_API_SECRET_KEY=<steal this from your
deployment...>
+ set AIRFLOW__CORE__INTERNAL_API_SECRET_KEY=<the secret key configured centrally in the api-server...>
set AIRFLOW__CORE__LOAD_EXAMPLES=False
set AIRFLOW_ENABLE_AIP_44=true
@REM Add if needed: set http_proxy=http://my-company-proxy.com:3128
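The start script above configures Airflow entirely through environment variables. As a minimal, hypothetical sketch (not part of the provider), a small Python pre-flight check can verify that the required settings are present before launching the worker; the setting names mirror the ``set`` lines in ``start_worker.bat``:

```python
# Hypothetical pre-flight check for the worker start script above.
# The variable names match the `set` lines in start_worker.bat.

REQUIRED_SETTINGS = [
    "AIRFLOW__EDGE__API_URL",
    "AIRFLOW__CORE__EXECUTOR",
    "AIRFLOW__CORE__INTERNAL_API_SECRET_KEY",
]


def missing_settings(env):
    """Return the names of required settings absent from the given mapping."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Example with an incomplete environment: the secret key is not set.
env = {
    "AIRFLOW__EDGE__API_URL": "https://your-hostname-and-port/edge_worker/v1/rpcapi",
    "AIRFLOW__CORE__EXECUTOR": "airflow.providers.edge3.executors.edge_executor.EdgeExecutor",
}
print(missing_settings(env))  # ['AIRFLOW__CORE__INTERNAL_API_SECRET_KEY']
```

Adjust the list of names to your deployment; such a check only guards against typos in the start script, not against wrong values.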
diff --git a/providers/edge3/docs/ui_plugin.rst
b/providers/edge3/docs/ui_plugin.rst
new file mode 100644
index 00000000000..78a3ee8e3a3
--- /dev/null
+++ b/providers/edge3/docs/ui_plugin.rst
@@ -0,0 +1,66 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+Edge UI Plugin and REST API
+===========================
+
+The Edge provider uses a Plugin to
+
+- Extend the REST API endpoints for connecting workers to the Airflow cluster
+- Provide a web UI for managing the workers and monitoring their status and tasks
+  (Note: The UI is currently only available in Airflow 2.10; an implementation for
+  Airflow 3.0 depends on completion of AIP-68)
+
+REST API endpoints
+------------------
+
+The Edge provider adds the following REST API endpoints to the Airflow API:
+
+- ``/api/v1/edge/health``: Check that the API endpoint is deployed and active
+- ``/api/v1/edge/jobs``: Endpoints to fetch jobs for workers and report state
+- ``/api/v1/edge/logs``: Endpoint to push log chunks from workers to the
Airflow cluster
+- ``/api/v1/edge/workers``: Endpoints to register and manage workers, report
heartbeat
+
+To see the full documentation of the API endpoints, open the Airflow web UI and navigate to
+the sub-path ``/edge_worker/v1/docs`` (Airflow 3.0) or ``/edge_worker/v1/ui`` (Airflow 2.10).
+
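The endpoints listed above all live under one base path. As an illustrative sketch (the hostname and the helper function are hypothetical, not provider code), resource URLs can be composed from the deployment's base URL:

```python
from urllib.request import Request


def edge_endpoint(base_url, resource):
    """Build a GET request for one of the Edge REST resources listed above."""
    return Request(f"{base_url.rstrip('/')}/{resource}", method="GET")


# Hypothetical deployment base URL; substitute your own Airflow host.
req = edge_endpoint("https://airflow.example.com/api/v1/edge", "health")
print(req.full_url)  # https://airflow.example.com/api/v1/edge/health
```

Real calls also require authentication based on the shared secret key used between the api-server and the workers; consult the generated API docs at the sub-paths above for the exact request schema.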
+Web UI Plugin (Airflow 2.10 only)
+---------------------------------
+
+.. note::
+
+   As of the time of writing, the web UI for viewing edge jobs and managing workers has
+   not been ported to Airflow 3.0. Until it is available, you can use the CLI commands
+   described in :ref:`deployment:maintenance-mgmt-cli`.
+
+The Edge provider adds a web UI plugin to the Airflow web UI. The plugin
+lets you see the job queue and the status of Edge Workers.
+
+Pending and processed tasks can be checked on the "Admin" - "Edge Worker Jobs" page.
+
+Worker status can be checked via the web UI in the "Admin" - "Edge Worker
Hosts" page.
+
+.. image:: img/worker_hosts.png
+
+Via the UI you can also set the status of the worker to "Maintenance" or
"Active".
+
+The status and maintenance comments will also be shown in the web UI
+in the "Admin" - "Edge Worker Hosts" page.
+
+.. image:: img/worker_maintenance.png
+
+Note that maintenance mode can also be adjusted via CLI.
+See :ref:`deployment:maintenance` for more details.
diff --git a/providers/edge3/docs/why_edge.rst
b/providers/edge3/docs/why_edge.rst
new file mode 100644
index 00000000000..c9b67e7c36c
--- /dev/null
+++ b/providers/edge3/docs/why_edge.rst
@@ -0,0 +1,53 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ .. http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+
+Why use the Edge Worker?
+========================
+
+Apache Airflow implements a distributed execution architecture. The Airflow
scheduler
+is responsible for scheduling tasks and sending them to the workers. The
workers are
+responsible for executing the tasks. The Airflow scheduler and workers are
typically
+deployed in the same data center.
+
+The most popular execution options for distributed setups are based on the CeleryExecutor or
+the KubernetesExecutor. The CeleryExecutor uses a distributed task queue that allows you to
+run tasks in parallel across multiple workers. These workers are connected via the task queue,
+typically backed by Redis or RabbitMQ.
+The KubernetesExecutor is a cloud-native execution option that runs tasks in Kubernetes Pods.
+It is a great option for organizations that already use Kubernetes for their infrastructure,
+as it lets you take advantage of the scalability and flexibility of Kubernetes to run your
+Airflow tasks. However, it requires a Kubernetes cluster and a Kubernetes service account
+with the necessary permissions to create and manage Pods.
+
+The Edge Worker is an execution option that allows you to run Airflow tasks on edge devices.
+It is designed to be lightweight and easy to deploy, allowing you to run Airflow tasks on
+machines that are not part of your main data center, e.g. edge servers. This is especially
+useful when deployments need to cross multiple data centers or security perimeters such as
+firewalls. Celery, for example, requires a stable TCP connection between the task queue
+(e.g. Redis) and the workers, which can be hard to operate across wide-area networks.
+To run Kubernetes Pods, the scheduler needs access to the API endpoints of the Kubernetes
+cluster, which is not always possible in edge deployments. Alternatively, it is sometimes
+possible to execute work on edge devices via the SSHOperator, but this also requires a
+direct and stable TCP connection to the edge devices.
+
+The goal of the Edge Worker is a lean setup that allows task execution on edge devices
+with only outbound HTTPS access. Edge Workers are able to connect, pull and execute tasks
+with a simple deployment.
+
+.. image:: img/distributed_architecture.svg
+ :alt: Distributed architecture
diff --git a/providers/edge3/provider.yaml b/providers/edge3/provider.yaml
index c75a9fa076d..867346ebda5 100644
--- a/providers/edge3/provider.yaml
+++ b/providers/edge3/provider.yaml
@@ -18,7 +18,19 @@
package-name: apache-airflow-providers-edge3
name: Edge Executor
description: |
- Handle edge workers on remote sites via HTTP(s) connection and orchestrates
work over distributed sites
+ Handle edge workers on remote sites via HTTP(s) connection and orchestrates
work over distributed sites.
+
+ When tasks need to be executed on remote sites where the connection needs to pass through
+ firewalls or other network restrictions, the Edge Worker can be deployed.
The Edge Worker
+ is a lightweight process with reduced dependencies. The worker only needs to
be able to
+ communicate with the central Airflow site via HTTPS.
+
+ In the central Airflow site the EdgeExecutor is used to orchestrate the work. It is a custom
+ executor that schedules tasks on the edge workers and can co-exist with other executors
+ (for example CeleryExecutor or KubernetesExecutor) in the same Airflow site.
+
+ Additional REST API endpoints are provided to distribute tasks and manage
the edge workers. The endpoints
+ are provided by the API server.
state: ready
source-date-epoch: 1741121867
diff --git a/providers/edge3/src/airflow/providers/edge3/get_provider_info.py
b/providers/edge3/src/airflow/providers/edge3/get_provider_info.py
index af5c8f9696e..63479e7e482 100644
--- a/providers/edge3/src/airflow/providers/edge3/get_provider_info.py
+++ b/providers/edge3/src/airflow/providers/edge3/get_provider_info.py
@@ -25,7 +25,7 @@ def get_provider_info():
return {
"package-name": "apache-airflow-providers-edge3",
"name": "Edge Executor",
- "description": "Handle edge workers on remote sites via HTTP(s)
connection and orchestrates work over distributed sites\n",
+ "description": "Handle edge workers on remote sites via HTTP(s)
connection and orchestrates work over distributed sites.\n\nWhen tasks need to
be executed on remote sites where the connection needs to pass
through\nfirewalls or other network restrictions, the Edge Worker can be
deployed. The Edge Worker\nis a lightweight process with reduced dependencies.
The worker only needs to be able to\ncommunicate with the central Airflow site
via HTTPS.\n\nIn the central Airflow site the Ed [...]
"plugins": [
{
"name": "edge_executor",