It looks like your slurm.conf specifies /var/spool as the state save directory (the StateSaveLocation setting), and `fatal: Incorrect permissions on state save loc: /var/spool` indicates that SlurmUser (another setting in slurm.conf) does not have permission to write to it. It might be good to make a directory dedicated to this purpose, e.g. /var/spool/slurm/<clustername>_state, and then make sure that the SlurmUser (usually either "slurm" or root, depending on your needs) can access that directory.
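In case it helps, here is a rough Python sketch of that step: create a dedicated directory, optionally hand it to the SlurmUser, and check that it is usable. The function names are made up for illustration, and the check assumes that read/write/traverse access for the user running it is what matters, which may not match exactly what slurmctld tests. Changing ownership normally requires root, and the "slurm" account is only the usual default, not something I know about your system.

```python
import os
import shutil


def prepare_state_dir(path, user=None, group=None):
    """Create `path` (and any missing parents) and optionally chown it.

    Hypothetical helper: `user` would be whatever SlurmUser is set to
    in slurm.conf, e.g. "slurm". Passing user=None skips the chown.
    """
    os.makedirs(path, mode=0o755, exist_ok=True)
    if user is not None:
        # Changing ownership normally requires root privileges.
        shutil.chown(path, user=user, group=group)
    return path


def can_use_state_dir(path):
    """Return True if the *current* user can read, write and traverse `path`.

    Assumption: run this as the SlurmUser to get a meaningful answer.
    Note that root passes this check regardless of the mode bits.
    """
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)
```

As root you would call something like `prepare_state_dir("/var/spool/slurm/<clustername>_state", user="slurm")` with your real cluster name filled in, point the state save location in slurm.conf at that path, and then run `can_use_state_dir(...)` as the SlurmUser before restarting slurmctld.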
----
Doug Jacobsen, Ph.D.
NERSC Computer Systems Engineer
National Energy Research Scientific Computing Center <http://www.nersc.gov>
dmjacob...@lbl.gov

On Wed, Apr 11, 2018 at 5:44 AM, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk> wrote:

> Hi Matt,
>
> You might want to take a look at my Slurm Wiki, which focuses on
> CentOS/RHEL 7: https://wiki.fysik.dtu.dk/niflheim/SLURM. Complete
> instructions for Slurm installation, configuration, etc. is in the Wiki.
>
> /Ole
>
> On 04/11/2018 02:26 PM, Matt Hohmeister wrote:
>
>> I’m brand-new to Slurm, and setting it up on a single RHEL 7.4 VM as a
>> proof of concept before I deploy it. After following the instructions on
>> https://www.slothparadise.com/how-to-install-slurm-on-centos-7-cluster/
>> (sorry, site not working now), I can get slurmd to start perfectly, but
>> slurmctld fails to start with the following journalctl -xe; I was wondering
>> if anyone has run into this or could shed some light on this…thanks in
>> advance!
>>
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Registered Authentication Agent
>> for unix-process:1779:31362 (system bus name :1.26 [/usr/bin/pkttyagent
>> --notify-fd 5 --fallbac
>> Apr 11 08:18:30 psy-slurm systemd[1]: Starting Slurm controller daemon...
>> -- Subject: Unit slurmctld.service has begun start-up
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit slurmctld.service has begun starting up.
>> Apr 11 08:18:30 psy-slurm systemd[1]: PID file /var/run/slurmctld.pid not
>> readable (yet?) after start.
>> Apr 11 08:18:30 psy-slurm systemd[1]: Started Slurm controller daemon.
>> -- Subject: Unit slurmctld.service has finished start-up
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
>> --
>> -- Unit slurmctld.service has finished starting up.
>> --
>> -- The start-up result is done.
>> Apr 11 08:18:30 psy-slurm polkitd[680]: Unregistered Authentication Agent
>> for unix-process:1779:31362 (system bus name :1.26, object path
>> /org/freedesktop/PolicyKit1/A
>> Apr 11 08:18:30 psy-slurm slurmctld[1787]: fatal: Incorrect permissions
>> on state save loc: /var/spool
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service: main process
>> exited, code=exited, status=1/FAILURE
>> Apr 11 08:18:30 psy-slurm systemd[1]: Unit slurmctld.service entered
>> failed state.
>> Apr 11 08:18:30 psy-slurm systemd[1]: slurmctld.service failed.
>>
>> Matt Hohmeister
>> Systems and Network Administrator
>> Department of Psychology
>> Florida State University
>> PO Box 3064301
>> Tallahassee, FL 32306-4301
>> Phone: +1 850 645 1902
>> Fax: +1 850 644 7739