kaxil opened a new pull request, #64558:
URL: https://github.com/apache/airflow/pull/64558

   Addresses review feedback on #64181, fixes the teardown scope evaluation to 
avoid unnecessary work in the scheduler hot path.
   
   - **Lazy iteration instead of `list()` materialization**: 
`_evaluate_direct_relatives()` has side effects (`ti.set_state()` on [line 
460](https://github.com/apache/airflow/blob/456e4478bf69cc68b3855ef710f33092bef07220/airflow-core/src/airflow/ti_deps/deps/trigger_rule_dep.py#L460)).
 The original `list()` call forced all those DB writes before yielding 
anything. With lazy iteration, we yield as we go and only evaluate teardown 
scope if no direct-relative status was produced.
   
   - **Early return for empty `in_scope_ids`**: Skips the `ensure_finished_tis` 
DB query when there are no in-scope tasks to check.
   
   - **Parallel branch regression test**: The existing tests used linear 
chains, but the [reported bug](https://github.com/apache/airflow/issues/29332) 
involves parallel branches (`setup >> [t_fail, t_slow] >> downstream >> 
teardown`). Added a test matching that DAG shape.
   
   - **FAILED/UPSTREAM_FAILED state test**: Teardowns must run to clean up 
resources regardless of upstream failure state. Added a test verifying this 
works with FAILED and UPSTREAM_FAILED in-scope tasks.
   
   - **Removed fragile string assertion**: `assert "2 in-scope" in reason` 
couples tests to exact message format.
   
   ## Benchmark: scheduler impact of `list()` vs lazy
   
   The scheduler evaluates trigger rules for every TI in every loop. For 
teardowns, `list()` forces all `set_state()` DB writes upfront even when the 
first upstream already blocks:
   
   | Deployment | DAGs | Tasks/DAG | `list()` writes | lazy writes | wasted 
writes |
   |---|---|---|---|---|---|
   | Small | 50 | 10 | 500 | 50 | 450 |
   | Medium | 200 | 20 | 8,000 | 400 | 7,600 |
   | Large | 1,000 | 50 | 250,000 | 5,000 | 245,000 |
   | Enterprise | 5,000 | 100 | 5,000,000 | 50,000 | 4,950,000 |
   
   | Deployment | Teardowns | `list()` time | lazy time | speedup |
   |---|---|---|---|---|
   | Small | 50 | 0.05 ms | 0.02 ms | 2.4x |
   | Medium | 400 | 0.66 ms | 0.16 ms | 4.2x |
   | Large | 5,000 | 17.54 ms | 2.00 ms | 8.8x |
   | Enterprise | 50,000 | 334.51 ms | 19.82 ms | 16.9x |


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to