Copilot commented on code in PR #64162:
URL: https://github.com/apache/airflow/pull/64162#discussion_r3025333325
##########
airflow-core/docs/authoring-and-scheduling/timetable.rst:
##########
@@ -151,6 +151,63 @@ must be a :class:`datetime.timedelta` or
``dateutil.relativedelta.relativedelta`
def example_dag():
pass
+.. versionadded:: 3.0.0
+ The ``run_immediately`` argument was introduced in Airflow 3.
+
+The optional ``run_immediately`` argument controls which cron point is
scheduled when a Dag is first
+enabled or re-enabled after a pause. It has no effect when ``catchup=True`` or
when prior Dag runs
+already exist (in those cases the scheduler always continues from where it
left off).
+
Review Comment:
The docs say `run_immediately` affects a DAG when it is “first enabled or
re-enabled after a pause”, but the implementation (and the tests added in this
PR) state `run_immediately` only affects the very first run; after pause/resume
with prior runs, the most recent past boundary is always scheduled. Please
reword this paragraph to avoid implying `run_immediately` changes pause/resume
behavior.
##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -399,7 +395,7 @@ def __init__(
*,
timezone: str | Timezone | FixedTimezone,
run_offset: int | datetime.timedelta | relativedelta | None = None,
- run_immediately: bool | datetime.timedelta = False,
+ run_immediately: bool | datetime.timedelta = True,
# todo: AIP-76 we can't infer partition date from this, so we need to
store it separately.
key_format: str = r"%Y-%m-%dT%H:%M:%S",
Review Comment:
`CronPartitionTimetable`’s docstring description of `run_immediately` still
documents the old semantics (including `None` and the auto-buffer window).
Since `CronPartitionTimetable` inherits the updated `_calc_first_run()`
behavior and now defaults `run_immediately=True`, the docstring should be
updated to match the new three-case semantics and default.
##########
airflow-core/docs/authoring-and-scheduling/timetable.rst:
##########
@@ -151,6 +151,63 @@ must be a :class:`datetime.timedelta` or
``dateutil.relativedelta.relativedelta`
def example_dag():
pass
+.. versionadded:: 3.0.0
+ The ``run_immediately`` argument was introduced in Airflow 3.
+
+The optional ``run_immediately`` argument controls which cron point is
scheduled when a Dag is first
+enabled or re-enabled after a pause. It has no effect when ``catchup=True`` or
when prior Dag runs
+already exist (in those cases the scheduler always continues from where it
left off).
+
+* ``run_immediately=True`` *(default)* — schedule the **most recent past**
cron point immediately.
+* ``run_immediately=False`` — skip the past cron point and wait for the **next
future** cron point.
+* ``run_immediately=timedelta(...)`` — schedule the most recent past cron
point only if it fired
+ within the given window; otherwise wait for the **next future** cron point.
+
+.. code-block:: python
+
+ from datetime import timedelta
+
+ from airflow.timetables.trigger import CronTriggerTimetable
+
+
+ @dag(
+ # Runs every 10 minutes.
+ # run_immediately=False: on first enable, skip any past slot that is
more than
+ # 5 minutes old (the minimum buffer) and wait for the next 10-minute
boundary.
+ schedule=CronTriggerTimetable(
+ "*/10 * * * *",
+ timezone="UTC",
+ run_immediately=False,
+ ),
+ start_date=datetime(2024, 1, 1),
+ catchup=False,
Review Comment:
This example comment still refers to the removed “minimum buffer” behavior
(“skip any past slot that is more than 5 minutes old”). With the simplified
semantics, `run_immediately=False` always skips the past cron point regardless
of how recently it fired. Also, the snippet uses `datetime(...)` in
`start_date` but doesn’t import `datetime` (only `timedelta`).
##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -186,7 +187,7 @@ def __init__(
*,
timezone: str | Timezone | FixedTimezone,
interval: datetime.timedelta | relativedelta = datetime.timedelta(),
- run_immediately: bool | datetime.timedelta = False,
+ run_immediately: bool | datetime.timedelta = True,
) -> None:
super().__init__(cron, timezone)
self._interval = interval
Review Comment:
Constructor default for `run_immediately` is now `True`, but
`CronTriggerTimetable.deserialize()` still uses `data.get("run_immediately",
False)` as the fallback when the key is missing. If older serialized payloads
omit the field, deserialization will set `run_immediately=False` and (with the
bug fixed) change scheduling behavior unexpectedly. Consider switching the
fallback to `True` (or handling legacy payload versions explicitly) and adding
a regression test for deserializing without the field.
##########
airflow-core/src/airflow/timetables/trigger.py:
##########
@@ -399,7 +395,7 @@ def __init__(
*,
timezone: str | Timezone | FixedTimezone,
run_offset: int | datetime.timedelta | relativedelta | None = None,
- run_immediately: bool | datetime.timedelta = False,
+ run_immediately: bool | datetime.timedelta = True,
# todo: AIP-76 we can't infer partition date from this, so we need to
store it separately.
key_format: str = r"%Y-%m-%dT%H:%M:%S",
) -> None:
Review Comment:
`CronPartitionTimetable` now defaults `run_immediately=True`, but its
`deserialize()` implementation still falls back to `False` when
`run_immediately` is missing (`data.get("run_immediately", False)`). For legacy
serialized DAGs without the field, this can change behavior after upgrade.
Align the deserialize fallback with the new default (or version-gate it).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]