On Thursday, 20 September 2018 5:57:56 PM AEST Mahmood Naderan wrote:

> It seems that when their fluent job crashes for some reasons, or they
> decide to close the fluent window without terminating the job or
> closing the terminal suddenly or ... the fluent processes remain in
> the node while the job is not listed in the output of squeue command.

If you use cgroups to contain jobs, along with pam_slurm_adopt to put any
SSH sessions into the job's "extern" cgroup, then Slurm should be able to
track and clean up pretty much anything your users can throw at it.

https://slurm.schedmd.com/cgroups.html
https://slurm.schedmd.com/pam_slurm_adopt.html

Best of luck!
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC
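
P.S. As a rough starting point, the setup described above might look
something like the excerpts below. This is only a sketch: the file
locations and the choice of which resources to constrain are assumptions
that depend on your site and Slurm version, so please check the two pages
linked above before copying anything.

```
# slurm.conf (excerpt)
# Track and contain job processes with cgroups.
ProctrackType=proctrack/cgroup
TaskPlugin=task/cgroup
# Create the "extern" step on each allocated node, which is what
# pam_slurm_adopt adopts incoming SSH sessions into.
PrologFlags=contain

# cgroup.conf (excerpt; which resources to constrain is a site choice)
ConstrainCores=yes
ConstrainRAMSpace=yes

# /etc/pam.d/sshd (excerpt; path assumed, it varies by distribution)
account    required    pam_slurm_adopt.so
```

With something like this in place, processes started over SSH on a
compute node belong to one of the user's jobs there, so they should be
cleaned up when that job ends rather than lingering after it disappears
from squeue.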