Hi,
Our users are not experienced with cluster environments or Slurm, and cleaning up after their mistakes takes up a lot of my time.
It seems that when a Fluent job crashes for some reason, or a user closes the Fluent window without terminating the job, or closes the terminal abruptly, and so on, the Fluent processes remain on the node even though the job is no longer listed in the output of the squeue command. Since users cannot log in to the nodes, I have to kill these processes manually. Is there a better way to manage this?
Regards,
Mahmood
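
P.S. To make the manual step concrete, here is a minimal sketch of the kind of check involved, not a definitive tool. It assumes the leftover processes can be recognised by the string "fluent" in their command name, and that a process is stale when its owner has no running job on that node according to squeue. The function names, the "fluent" pattern, and the list of system users are hypothetical choices for illustration. The sketch only reports candidates; it does not kill anything.

```python
import subprocess


def find_orphans(ps_output, active_users, pattern="fluent", system_users=("root",)):
    """Return (pid, user, command) for matching processes whose owner
    has no running job on this node.

    ps_output    -- text with one "PID USER COMMAND" entry per line
    active_users -- set of user names that squeue reports for this node
    pattern      -- assumed substring identifying Fluent processes
    """
    orphans = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid, user, comm = parts
        if user in system_users or user in active_users:
            continue  # owner is a system account or still has a job here
        if pattern in comm.lower():
            orphans.append((int(pid), user, comm))
    return orphans


def users_with_jobs(node):
    """Ask squeue which users have running jobs on the given node."""
    out = subprocess.run(
        ["squeue", "-h", "-w", node, "-t", "RUNNING", "-o", "%u"],
        capture_output=True, text=True, check=True,
    ).stdout
    return set(out.split())


def list_processes():
    """List all processes on the local node as "PID USER COMMAND" lines.

    Note: ps may shorten long user names, which would break the comparison.
    """
    return subprocess.run(
        ["ps", "-eo", "pid=,user=,comm="],
        capture_output=True, text=True, check=True,
    ).stdout


def report(node):
    """Print candidate stale processes on this node (run it on the node itself)."""
    for pid, user, comm in find_orphans(list_processes(), users_with_jobs(node)):
        print(f"{node}: pid {pid} ({comm}) owned by {user} has no running job")
```

Calling report() with the node's own name on each compute node would list the candidates. One known gap in this approach: if a user has another job running on the same node, stale processes from an earlier job are not detected.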