We are in the middle of implementing an extensive range of container support on 
our new HPC platform and have decided to offer our users a wide suite of 
technologies to better support their workloads:


  * Apptainer
  * Podman (rootless)
  * Docker (rootless)

We've already got a solution for automated entries in /etc/subuid and 
/etc/subgid on the head nodes (available here under GPL: 
https://github.com/megatron-uk/pam_subid). The head nodes are where we intend 
users to build their container images, and building and running containers 
using Apptainer and Podman in those environments works really well. We're happy 
that this should take care of 95% of our users' needs (Docker is the last few 
percent) without giving them any special permissions.

If I ssh directly to a compute node, then Podman also works there to run an 
existing image (podman container run ...).

What I'm struggling with now is running Podman under Slurm itself on our 
compute nodes.

It appears as though Podman (in rootless mode) wants to put the majority of its 
runtime / state information under /run/user/$UID. This is fine on the head 
nodes, where interactive logins hit the PAM modules that instantiate the 
/run/user/$UID directories, but not under sbatch/srun, which don't create them 
by default.
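
For anyone wanting to confirm the same behaviour on their own nodes, a minimal 
Python sketch along these lines could be run once from an interactive login and 
once inside an sbatch/srun job to compare. The function name 
runtime_dir_status and its parameters are made up for illustration, and the 
assumption that XDG_RUNTIME_DIR is the variable worth checking is mine rather 
than something verified against every Podman code path.

```python
import os
from pathlib import Path


def runtime_dir_status(uid=None, base="/run/user", env=None):
    """Report whether the per-user runtime directory exists.

    'base' defaults to /run/user; it is a parameter only so the check
    can be pointed at a scratch location when testing.
    """
    uid = os.getuid() if uid is None else uid
    env = os.environ if env is None else env
    path = Path(base) / str(uid)
    return {
        "path": str(path),
        "exists": path.is_dir(),
        # Assumption: rootless tools locate the directory via this variable.
        "xdg_runtime_dir": env.get("XDG_RUNTIME_DIR"),
    }


if __name__ == "__main__":
    for key, value in runtime_dir_status().items():
        print(f"{key}: {value}")
```

If "exists" is False (or XDG_RUNTIME_DIR is unset) inside the job but not in the 
interactive login, that matches the situation described above.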

I've not been able to find a single, magical setting which will move all of the 
Podman state information out from /run/user to another location - there are 3 
or 4 settings involved, and even then I still find various bits of Podman want 
to create stuff under there.

Rather than hacking away at getting Podman changed to move all settings and 
state information elsewhere, it seems like the cleanest solution would just be 
to put the regular /run/user/$UID directory in place at the point Slurm starts 
the job instead.

What's the best way to get Slurm to create this and clean up afterwards? Should 
this be in a prolog/epilog wrapper (e.g. directly calling loginctl) or is it 
cleaner to get Slurm to trigger the usual PAM session machinery in some manner?
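
To make the prolog/epilog option concrete, here is a rough Python sketch of the 
sort of wrapper I have in mind. It is only a sketch under several assumptions: 
that the script runs as root on the compute node, that Slurm exposes the job 
owner as SLURM_JOB_UID in the prolog/epilog environment, and that a plain 
directory is an acceptable stand-in for the tmpfs that systemd would normally 
mount. The function names and the "prolog"/"epilog" argument are made up for 
illustration, and it does not handle a user having several jobs on one node, 
where the epilog of the first job to finish would remove a directory that is 
still in use.

```python
import os
import pwd
import shutil
import sys
from pathlib import Path


def create_runtime_dir(uid, gid, base="/run/user", chown=True):
    """Create base/<uid> with mode 0700, owned by the job's user."""
    path = Path(base) / str(uid)
    path.mkdir(mode=0o700, exist_ok=True)
    os.chmod(path, 0o700)  # mkdir's mode is filtered by umask, so set it explicitly
    if chown:
        os.chown(path, uid, gid)  # needs root when uid is not our own
    return path


def remove_runtime_dir(uid, base="/run/user"):
    """Remove base/<uid> and everything under it; a missing directory is not an error."""
    shutil.rmtree(Path(base) / str(uid), ignore_errors=True)


if __name__ == "__main__":
    # Hypothetical usage: Prolog calls "<script> prolog", Epilog calls "<script> epilog".
    # Assumption: SLURM_JOB_UID is set in the prolog/epilog environment.
    job_uid = int(os.environ["SLURM_JOB_UID"])
    if sys.argv[1:] == ["prolog"]:
        # Look up the user's primary group from the passwd database.
        create_runtime_dir(job_uid, pwd.getpwuid(job_uid).pw_gid)
    elif sys.argv[1:] == ["epilog"]:
        remove_runtime_dir(job_uid)
    else:
        sys.exit("usage: prolog|epilog")
```

The job would presumably still need XDG_RUNTIME_DIR pointing at the new 
directory, which this sketch does not set.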

John Snowdon
Senior Research Infrastructure Engineer (HPC)

Research Software Engineering
Catalyst Building, Room 2.01
Newcastle University
3 Science Square
Newcastle Helix
Newcastle upon Tyne
NE4 5TG
https://hpc.researchcomputing.ncl.ac.uk