John, we ran into the same issues that you did. One thing we discovered was 
that Podman relies heavily on the $TMPDIR variable if it is set. Despite our 
changes to storage.conf, Podman still tried to use $TMPDIR for some of its 
state information, and since $TMPDIR on our cluster pointed at an NFS mount, 
that created all sorts of issues.

We implemented solutions similar to those discussed in this thread. However, 
jobs that run Podman had to be configured to unset $TMPDIR; as long as it was 
set, it kept interfering with the rest of our Podman configuration. This was 
easy to fix, since our compute jobs are created by an automated build process.
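To make that concrete, the fragment injected at the top of a Podman job 
amounts to little more than the following (the NFS path below is illustrative, 
not our actual mount point):

```shell
#!/bin/bash
# Simulate the cluster default: TMPDIR pointing at an NFS mount.
export TMPDIR=/nfs/scratch/example-user

# Drop it so Podman falls back to its configured locations
# (storage.conf et al.) instead of following the NFS path.
unset TMPDIR

echo "TMPDIR is now: ${TMPDIR:-<unset>}"
```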

Roger Moye
HPC Architect
713.898.0021 Mobile

QUANTLAB Financial, LLC
3 Greenway Plaza
Suite 200
Houston, Texas 77046
https://www.quantlab.com/


From: John Snowdon via slurm-users <[email protected]>
Sent: Friday, September 5, 2025 2:55 AM
To: [email protected]
Subject: [slurm-users] Creating /run/user/$UID - for Podman runtime




We are in the middle of implementing an extensive range of container support on 
our new HPC platform and have decided to offer our users a wide suite of 
technologies to better support their workloads:


  *   Apptainer

  *   Podman (rootless)

  *   Docker (rootless)

We've already got a solution for automated entries in /etc/subuid and 
/etc/subgid on the head nodes (available here under the GPL: 
https://github.com/megatron-uk/pam_subid), which is where we intend users to 
build their container images. Building and running containers with Apptainer 
and Podman in those environments works really well - we're happy that it 
should take care of 95% of our users' needs (Docker is the last few 
percent...) without giving them any special permissions.
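For anyone following along, the entries being managed are just the standard 
subordinate ID ranges; the usernames and ranges below are made up for 
illustration:

```
# /etc/subuid and /etc/subgid -- one line per user:
# <username>:<first subordinate ID>:<range size>
alice:100000:65536
bob:165536:65536
```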

If I ssh directly to a compute node, then Podman also works there to run an 
existing image (podman container run ...).

What I'm struggling with now is running Podman under Slurm itself on our 
compute nodes.

It appears as though Podman (in rootless mode) wants to put the majority of 
its runtime/state information under /run/user/$UID. This is fine on the head 
nodes, where interactive logins hit the PAM modules that instantiate the 
/run/user/$UID directories, but sbatch/srun does not create them by default.

I've not been able to find a single, magical setting that moves all of the 
Podman state information out of /run/user to another location - there are 
three or four settings involved, and even then I still find various bits of 
Podman wanting to create things under there.
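For reference, the settings in question are spread across several files. This 
is a sketch of the knobs involved, not a tested configuration - the paths are 
illustrative, and note that variable expansion inside these files is limited, 
so per-user paths may need to be generated per user:

```
# ~/.config/containers/storage.conf
[storage]
runroot   = "/local/scratch/alice/containers/run"
graphroot = "/local/scratch/alice/containers/storage"

# ~/.config/containers/containers.conf
[engine]
tmp_dir = "/local/scratch/alice/containers/tmp"

# ...plus the environment, which Podman consults as well:
#   export XDG_RUNTIME_DIR=/local/scratch/alice/run
```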

Rather than hacking away at Podman's configuration to move all settings and 
state information elsewhere, the cleanest solution seems to be simply putting 
the regular /run/user/$UID directory in place at the point Slurm starts the 
job.

What's the best way to get Slurm to create this and clean-up afterwards? Should 
this be in a prolog/epilog wrapper (e.g. directly calling loginctl) or is it 
cleaner to get Slurm to trigger the usual PAM session machinery in some manner?

John Snowdon
Senior Research Infrastructure Engineer (HPC)

Research Software Engineering
Catalyst Building, Room 2.01
Newcastle University
3 Science Square
Newcastle Helix
Newcastle upon Tyne
NE4 5TG
https://hpc.researchcomputing.ncl.ac.uk
-- 
slurm-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
