chris-fast commented on issue #17786: URL: https://github.com/apache/dolphinscheduler/issues/17786#issuecomment-3810912178
I've been digging into this issue and found the root cause. Here's what's going on:

The `SeatunnelTask` isn't properly tracking streaming jobs because a few key methods are missing or incomplete:

- `getApplicationIds()` just returns an empty list, so DS never gets the actual Yarn/K8s application ID
- `trackApplicationStatus()` does nothing - there's no status polling happening
- When you hit "Stop", it only kills the local shell process, but the SeaTunnel job keeps running on the cluster

For the fix, I see two options:

**Quick fix** - implement the missing methods in `SeatunnelTask`:

- Parse the application ID from log output (like FlinkTask/SparkTask already do)
- Use that ID to properly cancel the job when stopping
- Add status polling so the UI shows the correct state

**Bigger refactor** - make `SeatunnelTask` extend `AbstractYarnTask` instead of `AbstractRemoteTask`. This would give us the tracking behavior automatically but it's a more significant change.

Before I start coding, would be good to know:

- Quick fix vs refactor - any preference?
- Anything special about the Flink/Spark/Zeta modes I should watch out for?
- Is Yarn/K8s polling sufficient or should we also consider SeaTunnel's REST API?

I'm happy to put together a PR once we settle on an approach. Thoughts?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
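
For reference, the log-parsing step of the quick fix could look roughly like the sketch below. This is only an illustration under some assumptions: `AppIdLogParser` and `extractAppIds` are hypothetical names, not existing DolphinScheduler classes, and the regex assumes the usual Yarn ID shape `application_<clusterTimestamp>_<sequence>`. K8s and Zeta mode would need their own patterns, and the real FlinkTask/SparkTask code may do this differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of DolphinScheduler. */
final class AppIdLogParser {

    // Assumption: Yarn application IDs look like application_<clusterTimestamp>_<sequence>.
    private static final Pattern YARN_APP_ID = Pattern.compile("application_\\d+_\\d+");

    private AppIdLogParser() {
    }

    /**
     * Returns the distinct Yarn application IDs found in the given log text,
     * in order of first appearance. Returns an empty list for null input.
     */
    static List<String> extractAppIds(String logText) {
        Set<String> ids = new LinkedHashSet<>();
        if (logText != null) {
            Matcher matcher = YARN_APP_ID.matcher(logText);
            while (matcher.find()) {
                ids.add(matcher.group());
            }
        }
        return new ArrayList<>(ids);
    }
}
```

Something like `AppIdLogParser.extractAppIds(taskLog)` could then back `getApplicationIds()`, with the returned IDs used for cancelling and status polling. De-duplicating matters because the same ID usually shows up on several log lines.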
