I'd confirm that as well. The state directory has all of that
information. We just upgraded from 18.05 to 20.02 on a different host, and
while the cluster was quiet (we had a maintenance reservation in place),
there were running jobs that survived the upgrade.
I think the big thing to watch out f
My understanding is that it's the job state directory. Theoretically, if you
back it up, screw up and lose it, you can restore it and try again. There’s some
mention of this in the upgrade docs, if I’m not mistaken (they suggest backing
it up in case you mess up during the upgrade).
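
For what it's worth, here is a rough sketch of the kind of backup I mean, in Python. It only tars up a directory into a timestamped archive. The function name and the paths in the comments are made up for illustration; the real location is whatever StateSaveLocation is set to in your slurm.conf. I'm also assuming you'd run it with slurmctld stopped so the files aren't changing underneath you.

```python
import tarfile
import time
from pathlib import Path


def backup_state_dir(state_dir, dest_dir):
    """Archive state_dir into a timestamped .tar.gz under dest_dir.

    Returns the path of the archive that was written.
    """
    state_dir = Path(state_dir)
    dest_dir = Path(dest_dir)
    if not state_dir.is_dir():
        raise FileNotFoundError(f"state directory not found: {state_dir}")
    dest_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"slurm-state-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Store entries relative to the directory's own name so the archive
        # can be unpacked anywhere without recreating absolute paths.
        tar.add(state_dir, arcname=state_dir.name)
    return archive


# Hypothetical usage; substitute your own StateSaveLocation and backup target:
#   backup_state_dir("/var/spool/slurmctld", "/root/slurm-backups")
```

Restoring would just be unpacking the archive back into place before starting slurmctld again, assuming ownership and permissions end up matching what the daemon expects.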
--
#BlackLivesMatter
Slurm users,
I'm planning on moving slurmctld and slurmdbd to a new host. I know how
to dump the MySQL DB from the old server and import it to the new
slurmdbd host, and I know how to copy the job state directories to the
new host. I plan on doing this during our next maintenance window when