Lachlan, I note that you have dropped the slurmdbd and started again with an empty database. This sounds serious! The only thing I would suggest is an strace of the slurmctld. I often run straces when I have a problem. They rarely tell me much, but a lot of pretty text flies past on the screen. To be honest, though, detailed stracing did reveal a big performance problem with the IO in one code some time ago. I digress. Try an strace and look back a few calls before the program halts. Does an error leap out at you?
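If the trace is too long to read by eye, something like the following might help pick out the failed calls just before the halt. It is only a rough sketch: it assumes the trace was saved to a file (for example with `strace -f -o slurmctld.trace slurmctld -D`, where the file name is just a placeholder), and the function name `last_failed_calls` and the window of 20 lines are my own choices, not anything from Slurm or strace.

```python
import re

# Matches the tail of a failed syscall line in strace output, e.g.
#   connect(4, {...}, 16) = -1 ECONNREFUSED (Connection refused)
FAILED = re.compile(r"=\s+-1\s+([A-Z0-9]+)\s+\((.*)\)\s*$")


def last_failed_calls(lines, window=20):
    """Return (line_number, errno_name, line) for each failed call
    found among the last `window` lines of an strace log.

    `lines` is any iterable of strings, such as an open file.
    The name and the default window are illustrative assumptions.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    start = max(0, len(lines) - window)
    hits = []
    for idx in range(start, len(lines)):
        match = FAILED.search(lines[idx])
        if match:
            # Line numbers are 1-based to match what an editor shows.
            hits.append((idx + 1, match.group(1), lines[idx]))
    return hits
```

To use it, open the saved trace and pass the file object in, e.g. `with open("slurmctld.trace") as fh: print(last_failed_calls(fh))`. Plenty of failed calls are harmless (a missing optional file, say), so treat the result as a list of places to look rather than a diagnosis.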
On 1 June 2018 at 02:04, Lachlan Musicman <data...@gmail.com> wrote:

> On 31 May 2018 at 17:00, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk>
> wrote:
>
>> Hi Lachlan,
>>
>> Slurm upgrades on CentOS 7.5 should run without problems. It seems to me
>> that your problems are unrelated to the Slurm RPMs. FWIW, I documented the
>> Munge and Slurm installation as well as upgrade process in my Wiki page
>> https://wiki.fysik.dtu.dk/niflheim/Slurm_installation
>>
>
> Hi Ole,
>
> Yes, I agree. I already read your page with sadness because it worked for
> you and didn't work for me :) I didn't think the problem was the RPMs - the
> problem being I don't know *what* is causing the errors. But slurmctld is
> crashing quickly after starting, everything else appears to be working
> correctly - munge is up and works (tested), MariaDB is up, SlurmDBD is up.
>
> L.