On 05/10/2021 09:22, Ole Holm Nielsen wrote:

What is a "frontend"?  Do you mean the slurmctld server?
Yes, sorry. "Frontend" is what we call the node(s) that users submit jobs from, which is also where slurmctld and slurmdbd run. We'll probably move slurmdbd and slurmctld to a dedicated VM in a future upgrade (first I have to be sure it doesn't need IB or access to the gluster fs that's only available over IB). Does sbatch give slurmctld just a path to the job script or the whole script?

worked with IDLE (RESUME gives "Invalid node state specified").
So "scontrol update node=... state=idle" gives the node a correct idle state, whereas "state=resume" doesn't?  Did you restart the slurmd on the compute nodes?
Yes. Complete node reboots, actually. Multiple times. When desperate, try rebooting.

SLURM 20.11.4.
You wrote that you use Slurm 21.08 from Debian 11.  How did 20.11 get into the picture?
Good question. I copy-pasted 21.08 from a node after the upgrade, but now all nodes say 20.11.4. Really confused :-? Just to add to the confusion, packages.debian.org gives 20.11.7+really20.11.4-2 as the slurmctld version for bullseye. No mention of 21.08 anywhere, not even in sid (20.11.8). ARGH! Did I dream it? And if so, how could I c&p it????
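
For what it's worth, the odd Debian string can be decoded mechanically. A minimal sketch, assuming the usual Debian "+really" convention (whatever follows "+really" is the upstream version the package actually ships); the function name is hypothetical, not part of any Slurm or Debian tool:

```python
import re


def effective_upstream_version(debian_version: str) -> str:
    """Return the upstream version a Debian package version actually ships."""
    # Drop an optional epoch ("1:") in front of the version.
    version = debian_version.split(":", 1)[-1]
    # Drop the Debian revision, i.e. everything after the last "-".
    upstream = version.rsplit("-", 1)[0] if "-" in version else version
    # Assumption: with "+really", the text after the marker is the real version.
    match = re.search(r"\+really(.+)$", upstream)
    return match.group(1) if match else upstream


print(effective_upstream_version("20.11.7+really20.11.4-2"))  # prints 20.11.4
```

Read that way, the bullseye package agrees with the 20.11.4 that the nodes report.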

  The slurmdbd and slurmctld servers must have versions >= that of slurmd, see some links in
https://wiki.fysik.dtu.dk/niflheim/Slurm_installation#upgrading-slurm
Yup. That's why I upgraded the whole cluster at once.
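
A rough sketch of that rule as a check, taking the sentence above literally (slurmdbd and slurmctld not older than any slurmd). The function names and the idea of feeding it plain dotted version strings are my assumptions; it does not query Slurm itself:

```python
def parse_version(text):
    """Turn '20.11.4' into (20, 11, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def servers_not_older_than_slurmd(slurmdbd, slurmctld, slurmd_versions):
    """True if both server daemons are >= the newest slurmd in the list."""
    newest_slurmd = max(parse_version(v) for v in slurmd_versions)
    return (parse_version(slurmdbd) >= newest_slurmd
            and parse_version(slurmctld) >= newest_slurmd)


# All daemons on the same release, as after a whole-cluster upgrade.
print(servers_not_older_than_slurmd("20.11.4", "20.11.4", ["20.11.4", "20.11.4"]))
```

The versions are compared as integer tuples rather than strings, so that e.g. 20.11.10 sorts after 20.11.4.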

Thanks for the help.

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
