reele opened a new issue, #17109: URL: https://github.com/apache/dolphinscheduler/issues/17109
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

When recovering or failing over a workflow from the running/failed/stopped/paused state in a multi-master cluster, the workflow instance's host is not updated to the new master's address, so the operation may fail.

- If the old master no longer exists, the server reports `Connection refused`.
- If the old master still exists, the server reports `Cannot find the WorkflowExecuteRunnable`.

```
2025-03-22 18:51:19.676 ERROR [qtp742969054-35] o.a.d.a.e.w.StopWorkflowInstanceExecutorDelegate:[98] - WorkflowInstance: sleep-20250321085059987 stop failed
org.apache.dolphinscheduler.extract.base.exception.RemoteException: Call method to Host(ip=10.0.6.23, port=15678) failed
	at org.apache.dolphinscheduler.extract.base.client.NettyRemotingClient.sendSync(NettyRemotingClient.java:147)
	at org.apache.dolphinscheduler.extract.base.client.SyncClientMethodInvoker.invoke(SyncClientMethodInvoker.java:51)
	at org.apache.dolphinscheduler.extract.base.client.ClientInvocationHandler.invoke(ClientInvocationHandler.java:56)
	at com.sun.proxy.$Proxy830.stopWorkflowInstance(Unknown Source)
	at org.apache.dolphinscheduler.api.executor.workflow.StopWorkflowInstanceExecutorDelegate.stopInMaster(StopWorkflowInstanceExecutorDelegate.java:87)
	at org.apache.dolphinscheduler.api.executor.workflow.StopWorkflowInstanceExecutorDelegate.execute(StopWorkflowInstanceExecutorDelegate.java:52)
	at org.apache.dolphinscheduler.api.executor.workflow.StopWorkflowInstanceExecutorDelegate$StopWorkflowInstanceOperation.execute(StopWorkflowInstanceExecutorDelegate.java:127)
	...
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: /10.0.6.23:15678
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.lang.Thread.run(Thread.java:750)
```

### What you expected to happen

.

### How to reproduce

In a multi-master cluster, run a workflow, stop (and start) the master that is running the workflow, then stop the workflow from the web UI.

### Anything else

_No response_

### Version

dev

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
