In the past we've been using CentOS 7 with the slurm.spec file provided by
SchedMD to build the RPMs. This has been working great: we can deploy the
RPMs and perform upgrades via Puppet.
As we are moving to Ubuntu, I noticed there are some repositories that
provide prepackaged files such as slurm-wlm a
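For reference, the RPM build from SchedMD's bundled spec file can be sketched as follows (the version number is illustrative; commands are run on the build host, not via Puppet):

```shell
# Download a Slurm release tarball (version shown is illustrative)
wget https://download.schedmd.com/slurm/slurm-22.05.2.tar.bz2

# rpmbuild -ta builds directly from the tarball using the
# slurm.spec file shipped inside it
rpmbuild -ta slurm-22.05.2.tar.bz2

# Resulting RPMs are placed under ~/rpmbuild/RPMS/<arch>/
```

The RPMs can then be served from an internal repository and rolled out by Puppet as before.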
Thanks, I will look into that.
Cheers
Geoff
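On Ubuntu, the prepackaged route mentioned above can be sketched as follows (Debian/Ubuntu package names; the Slurm version tracks the distribution release rather than SchedMD's latest):

```shell
# Controller node: slurmctld daemon plus client commands
sudo apt-get install slurm-wlm

# Compute nodes only need the execution daemon and client tools
sudo apt-get install slurmd slurm-client

# Accounting storage daemon, if accounting is used
sudo apt-get install slurmdbd
```

One trade-off to weigh: the distribution packages lag behind SchedMD releases, which matters if all nodes must run a matching (or upgrade-compatible) Slurm version.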
Message: 1
Date: Tue, 19 Jul 2022 21:38:56 -0700
From: Christopher Samuel
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Rate-limiting sbatch and srun
Message-ID: <6f7850ba-8b66-47a0-7312-8c222f1ff...@csamuel.or
Actually, I was able to fix the problem by starting slurmctld with the
-c option and then clearing the runaway jobs with sacctmgr.
Thanks for your help.
J.
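For anyone hitting the same problem, the recovery described above can be sketched as (paths and service management vary by installation):

```shell
# Start the controller ignoring all previously saved state;
# -c clears saved state, so information about running jobs is lost
slurmctld -c

# List jobs the accounting database still considers running
# but which the controller no longer knows about
sacctmgr show runawayjobs

# sacctmgr then offers to fix the listed jobs interactively;
# answering "y" marks them as completed in the database
```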
On 20/07/2022 at 17:06, Julien Rey wrote:
Hello,
Unfortunately, sacctmgr show runawayjobs is returning the following
error:
sacctmgr: error: Slurmctld running on cluster cluster is not up, can't
check running jobs
J.
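The error above means sacctmgr cannot reach the controller. Whether slurmctld is actually running can be checked with commands along these lines (the service name and log path depend on the installation and on SlurmctldLogFile in slurm.conf):

```shell
# Ask the controller directly; reports UP or DOWN
# for the primary and backup controllers
scontrol ping

# Check the daemon via systemd (Debian/Ubuntu service name)
systemctl status slurmctld

# Look for startup errors in the controller log
# (path is an assumption; see SlurmctldLogFile in slurm.conf)
tail -n 50 /var/log/slurm/slurmctld.log
```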
On 20/07/2022 at 14:45, Ole Holm Nielsen wrote:
Hi Julien,
You could make a database dump of the current database so that you can
load it on another server outside the cluster, while you reinitialize
Slurm with a fresh database.
So the database thinks that you have 253 running jobs? I guess that
slurmctld is not working, otherwise you co
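The database dump suggested above can be sketched as follows (the database name slurm_acct_db is the common default from slurmdbd.conf, and the slurm user and credentials are assumptions; adjust to your setup):

```shell
# Dump the Slurm accounting database so it can be reloaded
# on another server outside the cluster
mysqldump -u slurm -p slurm_acct_db > slurm_acct_db_backup.sql

# Later, load the dump on another MySQL/MariaDB server
mysql -u slurm -p slurm_acct_db < slurm_acct_db_backup.sql
```

Stopping slurmdbd before the dump avoids writes landing mid-backup.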
Hello,
Thanks for your quick reply.
I don't mind losing job information but I certainly don't want to clear
the Slurm database altogether.
The /var/lib/slurm-llnl/slurmctld/node_state and
/var/lib/slurm-llnl/slurmctld/node_state.old files do indeed look
empty. I then entered the followin