This is an automated email from the ASF dual-hosted git repository. jedcunningham pushed a commit to branch 270 in repository https://gitbox.apache.org/repos/asf/airflow-site.git
commit a33ce20e7ec6c352d326cd25c710393f842b195a Author: Jed Cunningham <[email protected]> AuthorDate: Thu Aug 17 22:42:19 2023 -0600 Airflow 2.7 blog posts and announcement --- .../site/content/en/announcements/_index.md | 9 +++ .../en/blog/airflow-2.7.0/cluster_activity.png | Bin 0 -> 89752 bytes .../en/blog/airflow-2.7.0/graph_in_grid.png | Bin 0 -> 42349 bytes .../site/content/en/blog/airflow-2.7.0/index.md | 84 +++++++++++++++++++++ .../en/blog/introducing_setup_teardown/index.md | 72 ++++++++++++++++++ .../en/blog/introducing_setup_teardown/simple.png | Bin 0 -> 16721 bytes .../task-group-arrow.png | Bin 0 -> 59902 bytes 7 files changed, 165 insertions(+) diff --git a/landing-pages/site/content/en/announcements/_index.md b/landing-pages/site/content/en/announcements/_index.md index 0eed878745..fe0bbfe33b 100644 --- a/landing-pages/site/content/en/announcements/_index.md +++ b/landing-pages/site/content/en/announcements/_index.md @@ -13,6 +13,15 @@ menu: **Note:** Follow [@ApacheAirflow](https://twitter.com/ApacheAirflow) on Twitter for the latest news and announcements! +# August 18, 2023 + +We’ve just released Apache **Airflow 2.7.0**. You can read all about it in our [Apache Airflow 2.7.0 is here]({{< ref "blog/airflow-2.7.0/index.md" >}}) blog post. 
+ +📦 PyPI: https://pypi.org/project/apache-airflow/2.7.0/ \ +📚 Docs: https://airflow.apache.org/docs/apache-airflow/2.7.0 \ +🛠️ Release Notes: https://airflow.apache.org/docs/apache-airflow/2.7.0/release_notes.html \ +🪶 Sources: https://airflow.apache.org/docs/apache-airflow/2.7.0/installation/installing-from-sources.html + # July 29, 2023 Airflow PMC welcomes new Airflow PMC Member: diff --git a/landing-pages/site/content/en/blog/airflow-2.7.0/cluster_activity.png b/landing-pages/site/content/en/blog/airflow-2.7.0/cluster_activity.png new file mode 100644 index 0000000000..6947fee8ed Binary files /dev/null and b/landing-pages/site/content/en/blog/airflow-2.7.0/cluster_activity.png differ diff --git a/landing-pages/site/content/en/blog/airflow-2.7.0/graph_in_grid.png b/landing-pages/site/content/en/blog/airflow-2.7.0/graph_in_grid.png new file mode 100644 index 0000000000..eb259d0b69 Binary files /dev/null and b/landing-pages/site/content/en/blog/airflow-2.7.0/graph_in_grid.png differ diff --git a/landing-pages/site/content/en/blog/airflow-2.7.0/index.md b/landing-pages/site/content/en/blog/airflow-2.7.0/index.md new file mode 100644 index 0000000000..eb03469fb1 --- /dev/null +++ b/landing-pages/site/content/en/blog/airflow-2.7.0/index.md @@ -0,0 +1,84 @@ +--- +title: "Apache Airflow 2.7.0 is here" +linkTitle: "Apache Airflow 2.7.0 is here" +author: "Jed Cunningham" +github: "jedcunningham" +linkedin: "jedidiah-cunningham" +description: "Apache Airflow 2.7.0 has been released!" +tags: [Release] +date: "2023-08-18" +--- + +I’m happy to announce that Apache Airflow 2.7.0 has been released! Some notable features have been added that we are excited for the community to use. + +Apache Airflow 2.7.0 contains over 500 commits, which include 40 new features, 49 improvements, 53 bug fixes, and 15 documentation changes. 
+ +**Details**: + +📦 PyPI: https://pypi.org/project/apache-airflow/2.7.0/ \ +📚 Docs: https://airflow.apache.org/docs/apache-airflow/2.7.0/ \ +🛠 Release Notes: https://airflow.apache.org/docs/apache-airflow/2.7.0/release_notes.html \ +🐳 Docker Image: "docker pull apache/airflow:2.7.0" \ +🚏 Constraints: https://github.com/apache/airflow/tree/constraints-2.7.0 + +## Setup and Teardown (AIP-52) + +Airflow now has first class support for the concept of setup and teardown tasks. These tasks have special behavior in that: + +* Teardown tasks will still run, no matter what state the upstream tasks end up in +* Teardown tasks failing won’t, by default, cause the DAG run to fail +* Setup/teardown tasks are automatically cleared when clearing a dependent task + +You can read more about setup and teardown in the [Introducing Setup and Teardown tasks blog post]({{< ref "blog/introducing_setup_teardown/index.md" >}}), or in the [setup and teardown docs](https://airflow.apache.org/docs/apache-airflow/2.7.0/howto/setup-and-teardown.html). + +## Cluster Activity UI + +There is a new top level page in Airflow, the Cluster Activity page. This gives an overview of the cluster, including component health, dag and task state counts, and more! + + + +## Graph and gantt views moved into the Grid view UI + +The graph and gantt views have been rewritten and moved into the now familiar grid view. This makes it easier to jump between task details, logs, graph, and gantt views without losing your place in a complicated DAG. + + + +## Enable deferrable mode for all deferrable tasks with 1 config setting + +Airflow 2.7.0 comes with a new config option, `default_deferrable`, which allows admins to enable deferrable mode for all deferrable tasks without requiring any DAG modifications. Simply set it in your config and enjoy async tasks! 
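Editor's sketch (not part of the commit): the snippet below shows what setting `default_deferrable` might look like in practice. It assumes the option lives in the `[operators]` section of `airflow.cfg`; the fragment and the helper `read_default_deferrable` are hypothetical and use only the Python standard library, not Airflow's own configuration loader.

```python
import configparser

# Hypothetical airflow.cfg fragment. Assumption: the option lives in the
# [operators] section; check the configuration reference for your version.
AIRFLOW_CFG_FRAGMENT = """
[operators]
default_deferrable = true
"""


def read_default_deferrable(cfg_text: str) -> bool:
    """Return the default_deferrable flag from config text (False if unset)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback=False covers both a missing section and a missing option.
    return parser.getboolean("operators", "default_deferrable", fallback=False)


enabled = read_default_deferrable(AIRFLOW_CFG_FRAGMENT)
print(f"default_deferrable = {enabled}")
```

Under Airflow's usual `AIRFLOW__{SECTION}__{KEY}` environment-variable convention, the same setting would presumably be `AIRFLOW__OPERATORS__DEFAULT_DEFERRABLE=true`. Whether a given operator honors the default depends on that operator supporting deferrable mode.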
+ +## OpenLineage provider + +[OpenLineage](https://openlineage.io/) provides a spec standardizing operational lineage collection and distribution across the data ecosystem that projects – open source or proprietary – implement. + +With 2.7.0, OpenLineage changes from a plugin implementation maintained in the OpenLineage project to a built-in feature of Airflow. As a plugin, OpenLineage depended on Airflow and operators’ internals, making it brittle. Built-in OpenLineage support in Airflow makes publishing operational lineage through the OpenLineage ecosystem easier and more reliable. It has been implemented by moving the [openlineage-airflow](https://github.com/OpenLineage/OpenLineage/tree/main/integration/airflow [...] + +## Some executors moved into providers + +Some of the executors that were shipped in core Airflow have moved into their respective providers for Airflow 2.7.0. + +The following executors have been moved and require certain minimum provider versions: + +* In order to use Celery executors, install the celery provider version 3.3.0+ +* In order to use the Kubernetes executor, install the kubernetes provider version 7.4.0+ +* In order to use the Dask executor, install any version of the daskexecutor provider + +If you use the official docker images, all of these providers come preinstalled. + +## Additional new features + +Here are just a few interesting new features, since there are too many to list in full: + +* Pools can now consider tasks in the deferred state as running (#32709) +* chain_linear, like chain but allowing sequential tasks (#31927) +* Grid view now supports keyboard shortcuts! (#30950) +* Mark task groups as success or failed (#30478) +* fail_stop, allowing all remaining and running tasks to be failed on the first failure in a DAG (#29406) + +## Contributors + +Thanks to everyone who contributed to this release, including Akash Sharma, Amogh Desai, Brent Bovenzi, D. 
Ferruzzi, Daniel Standish, Ephraim Anierobi, Hussein Awala, Jarek Potiuk, Jed Cunningham, Karthikeyan Singaravelan, Maciej Obuchowski, Niko Oliveira, Pankaj Koti, Pankaj Singh, Pierre Jeambrun, Tzu-ping Chung, Utkarsh Sharma, Vincent Beck, and over 74 others! + +I’d especially like to thank our release manager, Ephraim, for getting this release out the door. + +I hope you enjoy using Apache Airflow 2.7.0! diff --git a/landing-pages/site/content/en/blog/introducing_setup_teardown/index.md b/landing-pages/site/content/en/blog/introducing_setup_teardown/index.md new file mode 100644 index 0000000000..0ac842ca90 --- /dev/null +++ b/landing-pages/site/content/en/blog/introducing_setup_teardown/index.md @@ -0,0 +1,72 @@ +--- +title: "Introducing Setup and Teardown tasks" +linkTitle: "Introducing Setup and Teardown tasks" +author: "Daniel Standish" +github: "dstandish" +linkedin: "daniel-standish-12197714" +description: "An introduction to Setup and Teardown tasks" +tags: [Release] +date: "2023-08-18" +--- + +In data pipelines, commonly we need to create infrastructure resources, like a cluster or GPU nodes in an existing cluster, before doing the actual “work” and delete them after the work is done. Airflow 2.7 adds “setup” and “teardown” tasks to better support this type of pipeline. This blog post aims to highlight the key features so you know what’s possible. For full documentation on how to use setup and teardown tasks, see the [setup and teardown docs](https://airflow.apache.org/docs/ap [...] + +## Why setup and teardown? + +Before we dig into examples, let me state at a high level what setup and teardown bring to the table. + +### More expressive dependencies + +Before setup and teardown, upstream and downstream relationships could only mean one thing: “this comes before that”. With setup and teardown, in effect we can say “this requires that”. And what it means in practice is, if you clear your task, and it requires a setup, that setup will be cleared too. 
And if that setup has a teardown, that will run again as well. + +### Separating the work from the infra + +Sometimes the part of the dag you care about is not, say, the cleanup task. For example, suppose you have a dag that loads some data and then deletes temp files. As long as the data loads, you want your dag to be marked successful. By default, this is how teardown tasks work; that is, they are ignored when determining dag run state. + +## Simple case + +A simple example is one setup / teardown pair, and one normal or “work” task. + + + +Setups and teardowns are indicated by the up and down arrows, respectively. From that we can see that `create_cluster` is a setup task and `delete_cluster` is a teardown. The link between a setup and a teardown is always dotted to highlight the special relationship. + +Some things to observe: + +* If `create_cluster` fails, neither `run_query` nor `delete_cluster` will run. +* If `create_cluster` succeeds and `run_query` fails, then `delete_cluster` will still run. +* If `create_cluster` is skipped, `run_query` and `delete_cluster` will be skipped. +* By default, if `run_query` succeeds, and `delete_cluster` fails, then the dag run will still be marked successful. (This behavior can be overridden.) + +## Authoring with task groups + +When we set something downstream of a task group, any teardowns in the task group are ignored. This reflects the assumption that in general, we probably don’t want to stop dag execution just because a teardown fails. 
So, let’s wrap the above dag in a task group and see what happens: + + + +And here’s how we linked those groups in the code: + +```python +with TaskGroup("do_emr") as do_emr: + create_cluster_task = create_cluster() + run_query(create_cluster_task) >> delete_cluster(create_cluster_task) + +with TaskGroup("load") as load: + create_config_task = create_configuration() + load_data(create_config_task) >> delete_configuration(create_config_task) + +do_emr >> load +``` + +In this code, each group has a teardown, and we just arrow the first group to the second. As advertised, `delete_cluster`, a teardown task, is ignored. This has two important consequences: one, even if it fails, the `load` group will still run; and two, `delete_cluster` and `create_configuration` can run in parallel (generally speaking, we’d imagine you don’t want to wait for teardown operations to complete before continuing onto other tasks in the dag). Of course you can override this b [...] + +## Conclusion + +There’s a lot of detail we’re omitting here about exactly how to write dags with setup and teardown tasks, and for that please head over to the [setup and teardown docs](https://airflow.apache.org/docs/apache-airflow/2.7.0/howto/setup-and-teardown.html). But hopefully this post gives you enough of an idea of what is possible with setup and teardown tasks that you can begin to see where they can improve your data pipelines in Airflow. + +Curious to know what else is new in Airflow 2.7? Head over to the main [Airflow 2.7 blog post]({{< ref "blog/airflow-2.7.0/index.md" >}}) to find out! + +## Acknowledgements + +Setup and Teardown was the product of AIP-52. Thanks to everyone who contributed to it, including those that read and voted on the AIP. Special thanks to Ash Berlin-Taylor, Brent Bovenzi, Daniel Standish, Ephraim Anierobi, Jed Cunningham, Rahul Vats, and Vikram Koka. 
+ diff --git a/landing-pages/site/content/en/blog/introducing_setup_teardown/simple.png b/landing-pages/site/content/en/blog/introducing_setup_teardown/simple.png new file mode 100644 index 0000000000..4f2f10ef0d Binary files /dev/null and b/landing-pages/site/content/en/blog/introducing_setup_teardown/simple.png differ diff --git a/landing-pages/site/content/en/blog/introducing_setup_teardown/task-group-arrow.png b/landing-pages/site/content/en/blog/introducing_setup_teardown/task-group-arrow.png new file mode 100644 index 0000000000..4f47d8d61d Binary files /dev/null and b/landing-pages/site/content/en/blog/introducing_setup_teardown/task-group-arrow.png differ
