github-actions[bot] commented on issue #16454: URL: https://github.com/apache/dolphinscheduler/issues/16454#issuecomment-2287564482
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

The current deployment mode is 1 master and 3 worker-servers.

1. Configure the st task through ds
2. Stop all three worker-servers
3. Start the 3 worker-servers in sequence

The following problem occurs:

When the three worker-servers went down, the st task was not killed. The main reason is that ds is only responsible for submitting tasks to st; the actual task execution is done by the st server. However, when a worker-server is started again, it finds the previous task. If the task is considered to have stopped unexpectedly, a new task is started. At this point the original st task is duplicated, and CPU and memory then fill up. There will be two identical tasks in ds, one is running and the other is in status. This scenario requires fault tolerance.

### What you expected to happen

ds's monitoring of st tasks is not complete yet. When launching a new st task, it does not go to the st server to check the actual running status of the original task.

### How to reproduce

1. Configure the st task through ds
2. Stop all three worker-servers
3. Start the 3 worker-servers in sequence

### Anything else

_No response_

### Version

3.2.x

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
