Hi Jason,
IMHO, the job_container/tmpfs plugin is not working well in Slurm 22.05,
but there may be some significant improvements included in 23.02
(announced yesterday). I've documented our experiences on the Wiki page
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#temporary-job-directories
This page contains links to bug reports against the job_container/tmpfs
plugin.
We're using the auto_tmpdir SPANK plugin with great success in Slurm 22.05.
Best regards,
Ole
On 01-03-2023 03:27, Jason Ellul wrote:
We have recently moved to Slurm 22.05.8 and have configured
job_container/tmpfs to allow private tmp folders.
job_container.conf contains:
AutoBasePath=true
BasePath=/slurm
And in slurm.conf we have set
JobContainerType=job_container/tmpfs
I can see the folders being created, and they are being used, but when a
job completes the top-level /slurm/<jobid> folder is not being cleaned up.
Example of running job:
[root@papr-res-compute204 ~]# ls -al /slurm/14292874
total 32
drwx------ 3 root root 34 Mar 1 13:16 .
drwxr-xr-x 518 root root 16384 Mar 1 13:16 ..
drwx------ 2 mzethoven root 6 Mar 1 13:16 .14292874
-r--r--r-- 1 root root 0 Mar 1 13:16 .ns
Example once the job completes; the empty /slurm/<jobid> folder remains:
[root@papr-res-compute204 ~]# ls -al /slurm/14292794
total 32
drwx------ 2 root root 6 Mar 1 09:33 .
drwxr-xr-x 518 root root 16384 Mar 1 13:16 ..
Is this to be expected or should the folder /slurm/<jobid> also be removed?
Do I need to create an epilog script to remove the directory that is left?
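
For illustration only, such an epilog could look roughly like the sketch
below. It is untested against the job_container/tmpfs plugin and rests on
several assumptions: the function name cleanup_job_dir and the BASE_PATH
variable are hypothetical, /slurm is taken from the BasePath setting above,
and the script is assumed to be registered via Epilog= in slurm.conf so that
it runs as root with SLURM_JOB_ID set. Whether the epilog runs after the
plugin has torn down the job's namespace is not confirmed here; if it runs
earlier, the directory is not yet empty and the sketch leaves it in place.

```shell
#!/bin/bash
# Hypothetical epilog sketch: remove the empty /slurm/<jobid> directory
# that is left behind after a job completes.

# Assumption: same value as BasePath in job_container.conf.
BASE_PATH="${BASE_PATH:-/slurm}"

cleanup_job_dir() {
    local base="$1" jobid="$2"

    # Only accept a purely numeric job ID, so a malformed value can never
    # point the removal at an unrelated path.
    [[ "$jobid" =~ ^[0-9]+$ ]] || return 0

    local dir="$base/$jobid"

    # rmdir removes empty directories only; a directory that still holds
    # .ns or the job's private tmp data is left untouched.
    if [[ -d "$dir" ]]; then
        rmdir -- "$dir" 2>/dev/null || true
    fi

    # Always report success: a non-zero epilog exit drains the node.
    return 0
}

cleanup_job_dir "$BASE_PATH" "${SLURM_JOB_ID:-}"
```

Using rmdir rather than rm -rf is deliberate: it can only ever delete a
directory that is already empty, which matches the leftover state shown in
the second listing above.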