Copilot commented on code in PR #64766:
URL: https://github.com/apache/airflow/pull/64766#discussion_r3066486590
##########
providers/databricks/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -410,6 +410,51 @@ def execute(self, context: Context) -> int:
return job_id
+class DatabricksDeleteJobsOperator(BaseOperator):
+ """
+ Deletes a Databricks job.
+
+ :param job_id: The ID of the Databricks job to delete. (templated)
+ :param databricks_conn_id: Reference to the Databricks connection.
+ """
+
+ template_fields: Sequence[str] = ("job_id",)
+
+ def __init__(
+ self,
+ *,
+ job_id: int,
+ databricks_conn_id: str = "databricks_default",
Review Comment:
`job_id` is marked as a templated field, but the constructor type is `int`.
In Airflow, Jinja rendering typically produces `str` values, so this signature
is inconsistent with templating and with other Databricks operators (e.g.
`DatabricksRunNowOperator` uses `job_id: str | None`). Consider widening the
type (e.g. `int | str`) and/or coercing to `int` in `execute` before calling
the hook.
##########
providers/databricks/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -410,6 +410,51 @@ def execute(self, context: Context) -> int:
return job_id
+class DatabricksDeleteJobsOperator(BaseOperator):
+ """
+ Deletes a Databricks job.
+
+ :param job_id: The ID of the Databricks job to delete. (templated)
+ :param databricks_conn_id: Reference to the Databricks connection.
Review Comment:
The class docstring documents only `job_id` and `databricks_conn_id`, but
`__init__` also exposes retry-related parameters (and
`polling_period_seconds`). Please document these parameters here for
consistency with other operators in this module (e.g.
`DatabricksCreateJobsOperator`).
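One possible shape for the expanded docstring (parameter descriptions are approximations of the wording used elsewhere in this module, not verbatim):

```python
class DatabricksDeleteJobsOperator:  # sketch only; the real class extends BaseOperator
    """
    Deletes a Databricks job.

    :param job_id: The ID of the Databricks job to delete. (templated)
    :param databricks_conn_id: Reference to the Databricks connection.
    :param polling_period_seconds: Seconds between polling attempts (only if
        polling behavior is actually kept; see the separate comment below).
    :param databricks_retry_limit: Amount of times to retry if the Databricks
        backend is unreachable.
    :param databricks_retry_delay: Number of seconds to wait between retries.
    :param databricks_retry_args: An optional dictionary with arguments passed
        to the retry mechanism.
    """
```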
##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks.py:
##########
@@ -302,6 +303,14 @@ def create_job(self, json: dict) -> int:
response = self._do_api_call(CREATE_ENDPOINT, json)
return response["job_id"]
+ def delete_job(self, job_id: int) -> None:
+ """
+ Call the ``api/2.2/jobs/delete`` endpoint.
+
+ :param job_id: The unique identifier of the job to be deleted.
+ """
+ self._do_api_call(DELETE_ENDPOINT, {"job_id": job_id})
Review Comment:
`delete_job` takes `job_id: int`, but operator/template usage commonly
yields a `str` at runtime. Consider accepting `str | int` (consistent with
`reset_job`/`update_job` signatures) and/or coercing/validating to an `int`
before making the API call to ensure the payload type matches the Databricks
API expectation.
##########
providers/databricks/tests/unit/databricks/operators/test_databricks.py:
##########
@@ -603,6 +604,44 @@ def test_exec_update_job_permission_with_empty_acl(self, db_mock_class):
db_mock.update_job_permission.assert_not_called()
+class TestDatabricksDeleteJobsOperator:
+
+ @mock.patch("airflow.providers.databricks.operators.databricks.DatabricksHook")
+ def test_execute(self, mock_hook_class):
+ # Set up the mock hook instance
+ mock_hook = mock_hook_class.return_value
+
+ # Instantiate the DatabricksDeleteJobsOperator
+ op = DatabricksDeleteJobsOperator(
+ task_id="delete_job_task",
+ job_id=12345,
+ databricks_conn_id="databricks_default",
+ )
+
+ # Execute the operator
+ op.execute({})
+
+ # Verify that the DatabricksHook was initialized with the correct connection ID and default retry arguments
+ mock_hook_class.assert_called_once_with(
+ "databricks_default",
+ retry_limit=3,
+ retry_delay=1,
+ retry_args=None,
Review Comment:
This assertion hard-codes default retry values (`retry_limit=3`,
`retry_delay=1`). For consistency with the other operator tests in this file
and to avoid brittleness if defaults change (or if the operator is instantiated
with non-default values), assert against `op.databricks_retry_limit` /
`op.databricks_retry_delay` instead of literals.
##########
providers/databricks/src/airflow/providers/databricks/operators/databricks.py:
##########
@@ -410,6 +410,51 @@ def execute(self, context: Context) -> int:
return job_id
+class DatabricksDeleteJobsOperator(BaseOperator):
+ """
+ Deletes a Databricks job.
+
+ :param job_id: The ID of the Databricks job to delete. (templated)
+ :param databricks_conn_id: Reference to the Databricks connection.
+ """
+
+ template_fields: Sequence[str] = ("job_id",)
+
+ def __init__(
+ self,
+ *,
+ job_id: int,
+ databricks_conn_id: str = "databricks_default",
+ polling_period_seconds: int = 30,
+ databricks_retry_limit: int = 3,
+ databricks_retry_delay: int = 1,
+ databricks_retry_args: dict[Any, Any] | None = None,
+ **kwargs,
+ ) -> None:
+ super().__init__(**kwargs)
+ self.job_id = job_id
+ self.databricks_conn_id = databricks_conn_id
+ self.polling_period_seconds = polling_period_seconds
+ self.databricks_retry_limit = databricks_retry_limit
+ self.databricks_retry_delay = databricks_retry_delay
+ self.databricks_retry_args = databricks_retry_args
Review Comment:
`polling_period_seconds` is accepted and stored on the operator, but it is
never used by `DatabricksDeleteJobsOperator`. This makes the public API
misleading; consider removing the parameter (and attribute) or using it if
there is intended polling behavior.
##########
providers/databricks/src/airflow/providers/databricks/hooks/databricks.py:
##########
@@ -54,6 +54,7 @@
REPAIR_RUN_ENDPOINT = ("POST", "2.2/jobs/runs/repair")
OUTPUT_RUNS_JOB_ENDPOINT = ("GET", "2.2/jobs/runs/get-output")
CANCEL_ALL_RUNS_ENDPOINT = ("POST", "2.2/jobs/runs/cancel-all")
+DELETE_ENDPOINT = ("POST", "2.2/jobs/delete")
Review Comment:
Endpoint constant name `DELETE_ENDPOINT` is overly generic in this module,
which already defines other delete-related endpoints like `DELETE_RUN_ENDPOINT`
and `DELETE_REPO_ENDPOINT`. Renaming this to something specific like
`DELETE_JOB_ENDPOINT` would better match existing naming patterns and avoid
confusion.
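The suggested rename, as a one-line sketch:

```python
# Job-specific name, matching the existing DELETE_RUN_ENDPOINT /
# DELETE_REPO_ENDPOINT pattern in this module.
DELETE_JOB_ENDPOINT = ("POST", "2.2/jobs/delete")
```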
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]