Re: [slurm-users] only 1 job running

Andy Riebs Thu, 28 Jan 2021 06:56:28 -0800

Hi Chandler,

If the only changes to your system have been the slurm.conf configuration and the addition of a new node, the easiest way to track this down is probably to show us the diffs between the previous and current versions of slurm.conf, and a note about what's different about the new node that you want to address.
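
For instance, if a copy of the old file is still around, a unified diff is usually the easiest thing to paste into a reply. This is only a sketch: the helper name conf_diff and both paths in the example are placeholders, not anything from your system.

```shell
# Sketch: show what changed between two versions of slurm.conf.
# conf_diff is a hypothetical helper; pass it the old and new file paths.
conf_diff() {
    # -u produces unified output, which stays readable when pasted into mail
    diff -u "$1" "$2"
}

# Example (placeholder paths):
#   conf_diff /etc/slurm/slurm.conf.bak /etc/slurm/slurm.conf
```

Lines starting with "-" come from the old file and lines starting with "+" from the new one.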


Andy


On 1/28/2021 1:18 AM, Chandler wrote:

Made a little bit of progress by running sinfo:

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
defq*        up   infinite      3  drain n[011-013]
defq*        up   infinite      1  alloc n010

Not sure why n[011-013] are in drain state; that needs to be fixed.
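
Slurm normally records a reason when a node is drained, so it may be worth looking at that before clearing the state. A minimal sketch, assuming the Slurm client tools are on PATH; the helper name drain_reasons is made up for illustration, and the node list is the one from the sinfo output above:

```shell
# Sketch: report why nodes are drained before resetting their state.
# drain_reasons is a hypothetical helper; it takes a Slurm node list expression.
drain_reasons() {
    # sinfo -R lists the reason, user, and timestamp for down/drained nodes
    sinfo -R --nodes="$1"
    # scontrol prints the full node record; the Reason= field carries the same text
    scontrol show node "$1" | grep -i 'reason='
}

# Only run the query where the Slurm tools are actually installed
if command -v sinfo >/dev/null 2>&1; then
    drain_reasons 'n[011-013]'
fi
```

If the reason points at a configuration mismatch (for example, a node reporting fewer resources than slurm.conf declares), the nodes are likely to drain again until that is corrected.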

After some searching, I ran:

scontrol update nodename=n[011-013] state=idle
and now 1 additional job has started on each of n[011-013], so now 4 jobs are running but the rest are still queued. They should all be running. After some more searching, I guess resource sharing needs to be turned on? Can anyone help with doing that? I also attached the slurm.conf.
Thanks
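
On the resource-sharing question: one job per node is what Slurm does with its default select/linear plugin, which allocates whole nodes to a job. The fragment below is a sketch of the slurm.conf lines that switch to per-core allocation. It assumes the attached slurm.conf (not shown here) leaves SelectType unset or set to select/linear, and that the installed Slurm version provides select/cons_res; newer releases use select/cons_tres instead.

```
# Allocate individual cores instead of whole nodes.
# Assumption: SelectType is currently unset or select/linear.
SelectType=select/cons_res
SelectTypeParameters=CR_Core
```

A change to SelectType takes effect only after the Slurm daemons are restarted, and the node definitions need accurate CPU counts for several jobs to be packed onto one node.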
