Hi,

I can't find the reference right now, but if I recall correctly, the preferred user for slurmd is actually root. It is the default.
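
For what it's worth, here is a minimal sketch of the two relevant slurm.conf lines, assuming you keep "slurm" as the controller user as in your current setup. The comments are my own reading of the thread, not quoted from the documentation.

```
# slurm.conf (fragment, illustrative only)
SlurmUser=slurm    # slurmctld keeps running as the unprivileged "slurm" user
SlurmdUser=root    # slurmd runs as root so it can chown the per-job spool directory
```

I would expect slurmd to need a restart on the compute nodes after this change.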

> I assume this can be fixed by modifying the configuration so
> "SlurmdUser=root", but does this imply that anything run with `srun` will
> be actually executed by root? This seems dangerous.

As far as safety, I think you're OK. The sbatch/srun/salloc processes set the user ID appropriately for the phase of the job being run.

Source: been running it this way for a while...

My apologies if running as a non-root user is a requirement for your environment.

Michael

On Mon, Jul 8, 2019 at 9:39 AM Daniel Torregrosa <daniel.torregr...@insight-centre.org> wrote:

> You are right. The critical part I was missing is that chown does not work
> without sudo.
>
> I assume this can be fixed by modifying the configuration so
> "SlurmdUser=root", but does this imply that anything run with `srun` will
> be actually executed by root? This seems dangerous.
>
> Thanks a lot.
>
> On Mon, 8 Jul 2019 at 17:28, Jeffrey Frey <f...@udel.edu> wrote:
>
>> Does user "slurm" have the capability of reowning files/directories to an
>> arbitrary uid/gid? Probably not -- that's something "root" can do, though.
>>
>> > On Jul 8, 2019, at 12:01 PM, Daniel Torregrosa
>> > <daniel.torregr...@insight-centre.org> wrote:
>> >
>> > Hi all,
>> >
>> > I am currently testing slurm (slurm-wlm 17.11.2 from a newly installed
>> > and updated Ubuntu server LTS). I managed to make it work on a very
>> > simple 1 master node and 2 compute nodes configuration. All three nodes
>> > have the same users (namely root, slurm and test), with slurm running
>> > both slurmctld and slurmd on the corresponding node (i.e.
>> > SlurmUser=slurm and SlurmdUser=slurm), and test as the only loggable
>> > user.
>> >
>> > Commands such as `salloc` and `srun` work perfectly, but `sbatch`
>> > fails. In `squeue`, I get "(launch failed requeued help)". When I check
>> > the corresponding compute node log, I get "error:
>> > chown(/var/spool/slurmd/d/jobxxxxx): Operation not permitted". The
>> > previous line has "Launching batch job xx for UID 1000" (test) or 0
>> > (root) if running `sudo sbatch`.
>> >
>> > Batch file looks like
>> >
>> > #! /bin/bash
>> > #SBATCH -J myjob
>> >
>> > hostname
>> >
>> > I suspect that the problem is that `srun` and `salloc` are being run by
>> > SlurmdUser (slurm, i.e. `srun whoami` returns slurm), who owns
>> > /var/spool/slurmd, but sbatch tasks are being run by the user issuing
>> > the command (test).
>> >
>> > Should I chmod /var/spool/slurmd so any user can write there, or do I
>> > have a configuration problem? I feel like I am missing something
>> > critical here.
>> >
>> > Thanks a lot.
>> > Daniel
>>
>> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
>> Jeffrey T. Frey, Ph.D.
>> Systems Programmer V / HPC Management
>> Network & Systems Services / College of Engineering
>> University of Delaware, Newark DE 19716
>> Office: (302) 831-6034 Mobile: (302) 419-4976
>> ::::::::::::::::::::::::::::::::::::::::::::::::::::::