aritra24 commented on code in PR #37039: URL: https://github.com/apache/airflow/pull/37039#discussion_r1468361184
########## dev/breeze/doc/ci/01_ci_environment.md: ########## @@ -0,0 +1,132 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + --> + +<!-- START doctoc generated TOC please keep comment here to allow auto update --> +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [CI Environment](#ci-environment) + - [GitHub Actions runs](#github-actions-runs) + - [Container Registry used as cache](#container-registry-used-as-cache) + - [Authentication in GitHub Registry](#authentication-in-github-registry) + +<!-- END doctoc generated TOC please keep comment here to allow auto update --> + +# CI Environment + +Continuous Integration is important component of making Apache Airflow +robust and stable. We are running a lot of tests for every pull request, Review Comment: We run* a lot of tests ########## dev/breeze/doc/ci/01_ci_environment.md: ########## @@ -0,0 +1,132 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. 
The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + --> + +<!-- START doctoc generated TOC please keep comment here to allow auto update --> +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [CI Environment](#ci-environment) + - [GitHub Actions runs](#github-actions-runs) + - [Container Registry used as cache](#container-registry-used-as-cache) + - [Authentication in GitHub Registry](#authentication-in-github-registry) + +<!-- END doctoc generated TOC please keep comment here to allow auto update --> + +# CI Environment + +Continuous Integration is important component of making Apache Airflow +robust and stable. We are running a lot of tests for every pull request, +for main and v2-\*-test branches and regularly as scheduled jobs. + +Our execution environment for CI is [GitHub +Actions](https://github.com/features/actions). GitHub Actions (GA) are +very well integrated with GitHub code and Workflow and it has evolved +fast in 2019/202 to become a fully-fledged CI environment, easy to use +and develop for, so we decided to switch to it. Our previous CI system +was Travis CI. + +However part of the philosophy we have is that we are not tightly +coupled with any of the CI environments we use. Most of our CI jobs are +written as bash scripts which are executed as steps in the CI jobs. And +we have a number of variables determine build behaviour. 
+ + +## GitHub Actions runs + +Our CI builds are highly optimized, leveraging the latest features +provided by the GitHub Actions environment to reuse parts of the build +process across different jobs. + +A significant portion of our CI runs utilize container images. Given +that Airflow has numerous dependencies, we use Docker containers to +ensure tests run in a well-configured and consistent environment. This +approach is used for most tests, documentation building, and some +advanced static checks. The environment comprises two types of images: +CI images and PROD images. CI images are used for most tests and checks, +while PROD images are used for Kubernetes tests. + +To run the tests, we need to ensure that the images are built using the +latest sources and that the build process is efficient. A full rebuild +of such an image from scratch might take approximately 15 minutes. +Therefore, we've implemented optimization techniques that efficiently +use the cache from the GitHub Docker registry. In most cases, this +reduces the time needed to rebuild the image to about 4 minutes. +However, when dependencies change, it can take around 6-7 minutes, and +if the base image of Python releases a new patch-level, it can take +approximately 12 minutes. + +## Container Registry used as cache + +We are using GitHub Container Registry to store the results of the +`Build Images` workflow which is used in the `Tests` workflow. + +Currently in main version of Airflow we run tests in all versions of +Python supported, which means that we have to build multiple images (one +CI and one PROD for each Python version). Yet we run many jobs (\>15) - +for each of the CI images. That is a lot of time to just build the +environment to run. 
Therefore we are utilising `pull_request_target` Review Comment: utilising the* `pull_request_target` ########## dev/breeze/doc/ci/01_ci_environment.md: ########## @@ -0,0 +1,132 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + --> + +<!-- START doctoc generated TOC please keep comment here to allow auto update --> +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [CI Environment](#ci-environment) + - [GitHub Actions runs](#github-actions-runs) + - [Container Registry used as cache](#container-registry-used-as-cache) + - [Authentication in GitHub Registry](#authentication-in-github-registry) + +<!-- END doctoc generated TOC please keep comment here to allow auto update --> + +# CI Environment + +Continuous Integration is important component of making Apache Airflow +robust and stable. We are running a lot of tests for every pull request, +for main and v2-\*-test branches and regularly as scheduled jobs. + +Our execution environment for CI is [GitHub +Actions](https://github.com/features/actions). 
GitHub Actions (GA) are +very well integrated with GitHub code and Workflow and it has evolved +fast in 2019/202 to become a fully-fledged CI environment, easy to use Review Comment: 2019/2020* to become ########## dev/breeze/doc/ci/01_ci_environment.md: ########## @@ -0,0 +1,132 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + --> + +<!-- START doctoc generated TOC please keep comment here to allow auto update --> +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [CI Environment](#ci-environment) + - [GitHub Actions runs](#github-actions-runs) + - [Container Registry used as cache](#container-registry-used-as-cache) + - [Authentication in GitHub Registry](#authentication-in-github-registry) + +<!-- END doctoc generated TOC please keep comment here to allow auto update --> + +# CI Environment + +Continuous Integration is important component of making Apache Airflow Review Comment: Nit: is an* important component ########## dev/breeze/doc/ci/01_ci_environment.md: ########## @@ -0,0 +1,132 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. 
See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + --> + +<!-- START doctoc generated TOC please keep comment here to allow auto update --> +<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE --> +**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* + +- [CI Environment](#ci-environment) + - [GitHub Actions runs](#github-actions-runs) + - [Container Registry used as cache](#container-registry-used-as-cache) + - [Authentication in GitHub Registry](#authentication-in-github-registry) + +<!-- END doctoc generated TOC please keep comment here to allow auto update --> + +# CI Environment + +Continuous Integration is important component of making Apache Airflow +robust and stable. We are running a lot of tests for every pull request, +for main and v2-\*-test branches and regularly as scheduled jobs. + +Our execution environment for CI is [GitHub +Actions](https://github.com/features/actions). GitHub Actions (GA) are +very well integrated with GitHub code and Workflow and it has evolved +fast in 2019/202 to become a fully-fledged CI environment, easy to use +and develop for, so we decided to switch to it. Our previous CI system +was Travis CI. + +However part of the philosophy we have is that we are not tightly +coupled with any of the CI environments we use. 
Most of our CI jobs are +written as bash scripts which are executed as steps in the CI jobs. And +we have a number of variables determine build behaviour. + + +## GitHub Actions runs + +Our CI builds are highly optimized, leveraging the latest features +provided by the GitHub Actions environment to reuse parts of the build +process across different jobs. + +A significant portion of our CI runs utilize container images. Given +that Airflow has numerous dependencies, we use Docker containers to +ensure tests run in a well-configured and consistent environment. This +approach is used for most tests, documentation building, and some +advanced static checks. The environment comprises two types of images: +CI images and PROD images. CI images are used for most tests and checks, +while PROD images are used for Kubernetes tests. + +To run the tests, we need to ensure that the images are built using the +latest sources and that the build process is efficient. A full rebuild +of such an image from scratch might take approximately 15 minutes. +Therefore, we've implemented optimization techniques that efficiently +use the cache from the GitHub Docker registry. In most cases, this +reduces the time needed to rebuild the image to about 4 minutes. +However, when dependencies change, it can take around 6-7 minutes, and +if the base image of Python releases a new patch-level, it can take +approximately 12 minutes. + +## Container Registry used as cache + +We are using GitHub Container Registry to store the results of the +`Build Images` workflow which is used in the `Tests` workflow. + +Currently in main version of Airflow we run tests in all versions of +Python supported, which means that we have to build multiple images (one +CI and one PROD for each Python version). Yet we run many jobs (\>15) - +for each of the CI images. That is a lot of time to just build the +environment to run. Therefore we are utilising `pull_request_target` +feature of GitHub Actions. 
+ +This feature allows to run a separate, independent workflow, when the Review Comment: allows us* to run
