CalvinKirs opened a new pull request, #45446: URL: https://github.com/apache/doris/pull/45446
### What problem does this PR solve?

When the Master node restarts or mastership switches to another node, the new Master must take over task scheduling. Tasks that were running on the previous Master can be left in an uncertain state (e.g., suspended or incomplete). To keep the system consistent and task states accurate, the new Master should, once its initialization completes, force these tasks into a failed state so that no residual task state remains.

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)

```
mysql> create job restart_test_one_time ON SCHEDULE at current_timestamp do INSERT INTO orders.image values ('2023-03-18','1','12213');

mysql> select * from tasks("type"="insert") where jobName='restart_test_one_time';
+---------------+---------------+-----------------------+-----------------------------+---------+----------+---------------------+---------------------+------------+-------------+---------------+------+
| TaskId        | JobId         | JobName               | Label                       | Status  | ErrorMsg | CreateTime          | StartTime           | FinishTime | TrackingUrl | LoadStatistic | User |
+---------------+---------------+-----------------------+-----------------------------+---------+----------+---------------------+---------------------+------------+-------------+---------------+------+
| 3519142268960 | 3518933899123 | restart_test_one_time | 3518933899123_3519142268960 | RUNNING |          | 2024-12-16 11:15:22 | 2024-12-16 11:15:22 |            |             |               | root |
+---------------+---------------+-----------------------+-----------------------------+---------+----------+---------------------+---------------------+------------+-------------+---------------+------+
1 row in set (1.14 sec)

mysql> select * from tasks("type"="insert") where jobName='restart_test_one_time';
No connection. Trying to reconnect...
Connection id:    0
Current database: orders

+---------------+---------------+-----------------------+-----------------------------+--------+--------------------------------+---------------------+-----------+---------------------+-------------+---------------+------+
| TaskId        | JobId         | JobName               | Label                       | Status | ErrorMsg                       | CreateTime          | StartTime | FinishTime          | TrackingUrl | LoadStatistic | User |
+---------------+---------------+-----------------------+-----------------------------+--------+--------------------------------+---------------------+-----------+---------------------+-------------+---------------+------+
| 3519142268960 | 3518933899123 | restart_test_one_time | 3518933899123_3519142268960 | FAILED | task failed because of restart | 2024-12-16 11:15:22 |           | 2024-12-16 11:15:48 |             |               | root |
+---------------+---------------+-----------------------+-----------------------------+--------+--------------------------------+---------------------+-----------+---------------------+-------------+---------------+------+
1 row in set (1.11 sec)

mysql> select * from jobs("type"="insert") where Name='restart_test_one_time';
+---------------+-----------------------+---------+-------------+------------------------+----------+------------------------------------------------------------+---------------------+------------------+-----------------+-------------------+---------+
| Id            | Name                  | Definer | ExecuteType | RecurringStrategy      | Status   | ExecuteSql                                                 | CreateTime          | SucceedTaskCount | FailedTaskCount | CanceledTaskCount | Comment |
+---------------+-----------------------+---------+-------------+------------------------+----------+------------------------------------------------------------+---------------------+------------------+-----------------+-------------------+---------+
| 3518933899123 | restart_test_one_time | root    | ONE_TIME    | AT 2024-12-16 11:15:22 | FINISHED | INSERT INTO orders.image values ('2023-03-18','1','12213') | 2024-12-16 11:15:22 | 0                | 1               | 0                 |         |
+---------------+-----------------------+---------+-------------+------------------------+----------+------------------------------------------------------------+---------------------+------------------+-----------------+-------------------+---------+
1 row in set (0.07 sec)
```

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: commits-h...@doris.apache.org
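
The failure handling described under "What problem does this PR solve?" can be sketched roughly as follows. This is an illustrative Python model only, not the code changed by the PR: the `Task` class, the `TaskStatus` enum, and the `fail_residual_tasks` function are hypothetical names invented for this sketch. Only the `RUNNING` and `FAILED` statuses and the error message are taken from the manual test output above. The sketch assumes that only tasks still marked `RUNNING` need cleanup; whether other states are handled is not stated in the description.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterable, List, Optional


class TaskStatus(Enum):
    # Hypothetical status set; RUNNING and FAILED appear in the test output above.
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"


# Error message shown in the ErrorMsg column of the manual test output.
RESTART_ERROR_MSG = "task failed because of restart"


@dataclass
class Task:
    # Hypothetical in-memory task record, loosely mirroring the tasks() columns.
    task_id: int
    job_id: int
    status: TaskStatus
    error_msg: str = ""
    finish_time: Optional[datetime] = None


def fail_residual_tasks(tasks: Iterable[Task], now: datetime) -> List[Task]:
    """Mark every task still RUNNING as FAILED and return the changed tasks.

    Intended to be called once, after the new Master finishes initialization.
    """
    changed = []
    for task in tasks:
        if task.status is TaskStatus.RUNNING:
            task.status = TaskStatus.FAILED
            task.error_msg = RESTART_ERROR_MSG
            task.finish_time = now
            changed.append(task)
    return changed
```

Because the function only touches tasks that are still `RUNNING`, calling it a second time changes nothing, which is a convenient property if Master initialization can be retried.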