zhangconan opened a new issue, #16581: URL: https://github.com/apache/dolphinscheduler/issues/16581
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and found no similar issues.

### What happened

I wrote the following command in the remoteshell script:

```
touch zkn.txt;
```

The error message is as follows:

[LOG-PATH]: /home/dolphinscheduler/tmp/dolphinscheduler/worker-server/logs/20240904/117989902358848/4/59/73.log, [HOST]: 172.18.0.1:1234
[INFO] 2024-09-04 14:21:11.228 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.232 +0800 - ********************************* Initialize task context ***********************************
[INFO] 2024-09-04 14:21:11.233 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.233 +0800 - Begin to initialize task
[INFO] 2024-09-04 14:21:11.233 +0800 - Set task startTime: 1725430871233
[INFO] 2024-09-04 14:21:11.233 +0800 - Set task appId: 59_73
[INFO] 2024-09-04 14:21:11.234 +0800 - End initialize task {
  "taskInstanceId" : 73,
  "taskName" : "远程shell",
  "firstSubmitTime" : 1725430871208,
  "startTime" : 1725430871233,
  "taskType" : "REMOTESHELL",
  "workflowInstanceHost" : "172.18.0.1:5678",
  "host" : "172.18.0.1:1234",
  "logPath" : "/home/dolphinscheduler/tmp/dolphinscheduler/worker-server/logs/20240904/117989902358848/4/59/73.log",
  "processId" : 0,
  "processDefineCode" : 117989902358848,
  "processDefineVersion" : 4,
  "processInstanceId" : 59,
  "scheduleTime" : 0,
  "executorId" : 1,
  "cmdTypeIfComplement" : 0,
  "tenantCode" : "default",
  "processDefineId" : 0,
  "projectId" : 0,
  "projectCode" : 117107289483392,
  "taskParams" : "{\"localParams\":[],\"rawScript\":\"touch zkn.txt\",\"resourceList\":[],\"type\":\"SSH\",\"datasource\":4}",
  "prepareParamsMap" : {
    "system.task.definition.name" : {
      "prop" : "system.task.definition.name",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "远程shell"
    },
    "system.project.name" : {
      "prop" : "system.project.name",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : null
    },
    "system.project.code" : {
      "prop" : "system.project.code",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "117107289483392"
    },
    "system.workflow.instance.id" : {
      "prop" : "system.workflow.instance.id",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "59"
    },
    "system.biz.curdate" : {
      "prop" : "system.biz.curdate",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "20240904"
    },
    "system.biz.date" : {
      "prop" : "system.biz.date",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "20240903"
    },
    "system.task.instance.id" : {
      "prop" : "system.task.instance.id",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "73"
    },
    "system.workflow.definition.name" : {
      "prop" : "system.workflow.definition.name",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "张可南测试"
    },
    "system.task.definition.code" : {
      "prop" : "system.task.definition.code",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "118774317966656"
    },
    "system.workflow.definition.code" : {
      "prop" : "system.workflow.definition.code",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "117989902358848"
    },
    "system.datetime" : {
      "prop" : "system.datetime",
      "direct" : "IN",
      "type" : "VARCHAR",
      "value" : "20240904142111"
    }
  },
  "taskAppId" : "59_73",
  "taskTimeout" : 2147483647,
  "workerGroup" : "default",
  "delayTime" : 0,
  "currentExecutionStatus" : "SUBMITTED_SUCCESS",
  "resourceParametersHelper" : {
    "resourceMap" : {
      "DATASOURCE" : {
        "4" : {
          "resourceType" : "DATASOURCE",
          "type" : "SSH",
          "connectionParams" : "{\"user\":\"root\",\"password\":\"***********\",\"host\":\"192.168.200.127\",\"port\":22}",
          "DATASOURCE" : null
        }
      }
    }
  },
  "endTime" : 0,
  "dryRun" : 0,
  "paramsMap" : { },
  "cpuQuota" : -1,
  "memoryMax" : -1,
  "testFlag" : 0,
  "logBufferEnable" : false,
  "dispatchFailTimes" : 0
}
[INFO] 2024-09-04 14:21:11.236 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.237 +0800 - ********************************* Load task instance plugin *********************************
[INFO] 2024-09-04 14:21:11.237 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.240 +0800 - Send task status RUNNING_EXECUTION master: 172.18.0.1:1234
[INFO] 2024-09-04 14:21:11.241 +0800 - TenantCode: default check successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - WorkflowInstanceExecDir: /tmp/dolphinscheduler/exec/process/default/117107289483392/117989902358848_4/59/73 check successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - Create TaskChannel: org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTaskChannel successfully
[INFO] 2024-09-04 14:21:11.244 +0800 - Download resources successfully: ResourceContext(resourceItemMap={})
[INFO] 2024-09-04 14:21:11.245 +0800 - Download upstream files: [] successfully
[INFO] 2024-09-04 14:21:11.245 +0800 - Task plugin instance: REMOTESHELL create successfully
[INFO] 2024-09-04 14:21:11.245 +0800 - shell task params {"localParams":[],"rawScript":"touch zkn.txt","resourceList":[],"type":"SSH","datasource":4}
[INFO] 2024-09-04 14:21:11.251 +0800 - Success initialized task plugin instance successfully
[INFO] 2024-09-04 14:21:11.252 +0800 - Set taskVarPool: null successfully
[INFO] 2024-09-04 14:21:11.253 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.253 +0800 - ********************************* Execute task instance *************************************
[INFO] 2024-09-04 14:21:11.253 +0800 - ***********************************************************************************************
[INFO] 2024-09-04 14:21:11.255 +0800 - raw script : #!/bin/bash
touch zkn.txt
echo DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-$?
[INFO] 2024-09-04 14:21:11.716 +0800 - upload script from local:/tmp/dolphinscheduler/exec/process/default/117107289483392/117989902358848_4/59/73/59_73_node.sh to remote: /tmp/dolphinscheduler-remote-shell-root/dolphinscheduler-remoteshell-73.sh
[INFO] 2024-09-04 14:21:12.122 +0800 - The final script is: #!/bin/bash
touch zkn.txt
echo DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-$?
[INFO] 2024-09-04 14:21:12.197 +0800 - Remote shell task log:
[INFO] 2024-09-04 14:21:12.391 +0800 - DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-0
[INFO] 2024-09-04 14:21:12.465 +0800 - Remote shell task run status: DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-0
[ERROR] 2024-09-04 14:21:12.465 +0800 - Remote shell task failed
[ERROR] 2024-09-04 14:21:12.468 +0800 - shell task error
org.apache.dolphinscheduler.plugin.task.api.TaskException: Remote shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:100)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:104)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.executeTask(DefaultWorkerTaskExecutor.java:51)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.NumberFormatException: For input string: "0 "
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskExitCode(RemoteExecutor.java:140)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:98)
    ... 6 common frames omitted
[ERROR] 2024-09-04 14:21:12.469 +0800 - Task execute failed, due to meet an exception
org.apache.dolphinscheduler.plugin.task.api.TaskException: Execute shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:110)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.executeTask(DefaultWorkerTaskExecutor.java:51)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:172)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.dolphinscheduler.plugin.task.api.TaskException: Remote shell task error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:100)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.handle(RemoteShellTask.java:104)
    ... 5 common frames omitted
Caused by: java.lang.NumberFormatException: For input string: "0 "
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:580)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskExitCode(RemoteExecutor.java:140)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.run(RemoteExecutor.java:98)
    ... 6 common frames omitted
[INFO] 2024-09-04 14:21:12.469 +0800 - kill remote task dolphinscheduler-remoteshell-73
[ERROR] 2024-09-04 14:21:12.470 +0800 - Cancel task failed, this will not affect the taskInstance status, but you need to check manual
org.apache.dolphinscheduler.plugin.task.api.TaskException: cancel application error
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.cancel(RemoteShellTask.java:121)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.cancelTask(WorkerTaskExecutor.java:133)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.afterThrowing(WorkerTaskExecutor.java:114)
    at org.apache.dolphinscheduler.server.worker.runner.DefaultWorkerTaskExecutor.afterThrowing(DefaultWorkerTaskExecutor.java:61)
    at org.apache.dolphinscheduler.server.worker.runner.WorkerTaskExecutor.run(WorkerTaskExecutor.java:179)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.dolphinscheduler.plugin.task.api.TaskException: SSH connection failed
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getSession(RemoteExecutor.java:82)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.runRemote(RemoteExecutor.java:224)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getTaskPid(RemoteExecutor.java:200)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.kill(RemoteExecutor.java:158)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteShellTask.cancel(RemoteShellTask.java:119)
    ... 7 common frames omitted
Caused by: java.lang.IllegalStateException: SshClient not started. Please call start() method before connecting to a server
    at org.apache.sshd.client.SshClient.doConnect(SshClient.java:627)
    at org.apache.sshd.client.SshClient.doConnect(SshClient.java:616)
    at org.apache.sshd.client.SshClient.connect(SshClient.java:547)
    at org.apache.sshd.client.SshClient.connect(SshClient.java:539)
    at org.apache.sshd.client.session.ClientSessionCreator.connect(ClientSessionCreator.java:74)
    at org.apache.sshd.client.session.ClientSessionCreator.connect(ClientSessionCreator.java:57)
    at org.apache.dolphinscheduler.plugin.datasource.ssh.SSHUtils.getSession(SSHUtils.java:41)
    at org.apache.dolphinscheduler.plugin.task.remoteshell.RemoteExecutor.getSession(RemoteExecutor.java:77)
    ... 11 common frames omitted
[INFO] 2024-09-04 14:21:12.472 +0800 - Get a exception when execute the task, will send the task status: FAILURE to master: 172.18.0.1:1234
[INFO] 2024-09-04 14:21:12.472 +0800 - FINALIZE_SESSION

### What you expected to happen

I hope the command can be executed normally and the task status returns success.

### How to reproduce

You only need to find a Linux server and configure it in the remoteshell task node to reproduce it.

### Anything else

nothing

### Version

3.2.x

### Are you willing to submit PR?

- [X] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
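
The stack trace above points at `Integer.parseInt` being handed the string `"0 "`, that is, the exit code followed by trailing whitespace. A minimal sketch of that failure and one possible workaround follows, assuming the status line looks like the one in the log. The class `ExitCodeParsing` and the helper `parseExitCode` are hypothetical names used for illustration; this is not the actual `RemoteExecutor.getTaskExitCode` code.

```java
public class ExitCodeParsing {

    // Marker string copied from the log output in this report.
    public static final String STATUS_TAG = "DOLPHINSCHEDULER-REMOTE-SHELL-TASK-STATUS-";

    // Hypothetical helper (not the project's real method): takes whatever
    // follows the last status marker and trims whitespace such as ' ',
    // '\r' or '\n' before parsing it as an integer.
    public static int parseExitCode(String output) {
        int idx = output.lastIndexOf(STATUS_TAG);
        if (idx < 0) {
            throw new IllegalArgumentException("status marker not found");
        }
        String raw = output.substring(idx + STATUS_TAG.length());
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Same shape as the failing input in the report: "0" plus a trailing space.
        String line = STATUS_TAG + "0 ";

        try {
            // Integer.parseInt does not accept surrounding whitespace.
            Integer.parseInt(line.substring(STATUS_TAG.length()));
        } catch (NumberFormatException e) {
            System.out.println("untrimmed parse fails: " + e.getMessage());
        }

        System.out.println("trimmed exit code: " + parseExitCode(line));
    }
}
```

Whether the trailing character on the real remote host is a space, a carriage return, or a newline is not visible in the log; `String.trim()` removes all three, which is why it is used in this sketch.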
