chris-fast commented on issue #17786:
URL: https://github.com/apache/dolphinscheduler/issues/17786#issuecomment-3810912178

   I've been digging into this issue and found the root cause. Here's what's 
going on:
   
   The `SeatunnelTask` isn't properly tracking streaming jobs because a few key 
methods are missing or incomplete:
   
   - `getApplicationIds()` just returns an empty list, so DolphinScheduler never 
gets the actual Yarn/K8s application ID
   - `trackApplicationStatus()` does nothing, so no status polling ever happens
   - When you hit "Stop", it only kills the local shell process, but the 
SeaTunnel job keeps running on the cluster
   
   For the fix, I see two options:
   
   **Quick fix** - implement the missing methods in `SeatunnelTask`:
   - Parse the application ID from log output (like FlinkTask/SparkTask already 
do)
   - Use that ID to properly cancel the job when stopping
   - Add status polling so the UI shows the correct state
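
   A minimal sketch of the parse-and-cancel steps from the quick fix, assuming 
the job runs on Yarn and its ID appears in the task log in the usual 
`application_<timestamp>_<sequence>` form. The class and method names are 
hypothetical and are not existing DolphinScheduler code; a Zeta-engine job would 
not print a Yarn ID and would need a different lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of DolphinScheduler).
class YarnAppIdSketch {

    // Yarn application IDs look like application_<clusterTimestamp>_<sequence>.
    private static final Pattern YARN_APP_ID = Pattern.compile("application_\\d+_\\d+");

    // Collect every distinct application ID found in the log lines,
    // preserving the order of first appearance.
    static List<String> extractAppIds(List<String> logLines) {
        Set<String> ids = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher matcher = YARN_APP_ID.matcher(line);
            while (matcher.find()) {
                ids.add(matcher.group());
            }
        }
        return new ArrayList<>(ids);
    }

    // Build the argument list for the standard Yarn CLI kill command.
    // Assumes the `yarn` binary is on the worker's PATH.
    static List<String> buildKillCommand(String appId) {
        return Arrays.asList("yarn", "application", "-kill", appId);
    }
}
```

   On "Stop", each ID returned by `extractAppIds` could be passed to 
`buildKillCommand` and executed, so the cluster-side job is cancelled along with 
the local shell process.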
   
   **Bigger refactor** - make `SeatunnelTask` extend `AbstractYarnTask` instead 
of `AbstractRemoteTask`. This would give us the tracking behavior automatically 
but it's a more significant change.
   
   Before I start coding, it would be good to know:
   - Quick fix vs refactor - any preference?
   - Anything special about the Flink/Spark/Zeta modes I should watch out for?
   - Is Yarn/K8s polling sufficient or should we also consider SeaTunnel's REST 
API?
   
   I'm happy to put together a PR once we settle on an approach. Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
