Hello everyone,
I am using Slurm as a workload manager on a system
with a master and 3 nodes.
The operating system used is the recent Rocky Linux 8.4, while for
Slurm, version 20.11.8 from the EPEL repository is used.
Everything works correctly, and when the system is started the command
"syste
On 7/23/21 1:24 PM, Diego Zuccato wrote:
On 23/07/2021 13:15, Ole Holm Nielsen wrote:
But it's not showing jobIDs nor users :(
That is really strange! pestat obtains the username and jobid from the
squeue command. Do you get this information from "squeue -t running"?
$ squeue -t running
On 23/07/2021 13:15, Ole Holm Nielsen wrote:
But it's not showing jobIDs nor users :(
That is really strange! pestat obtains the username and jobid from the
squeue command. Do you get this information from "squeue -t running"?
$ squeue -t running
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
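(For reference: pestat builds its user and jobid columns from squeue, so
the raw data can be checked directly with an explicit format string. The
format string below is an illustration, not pestat's actual invocation:

$ squeue -t running -h -o "%N %i %u"

This prints the allocated nodelist, job ID, and user name for every
running job; -h suppresses the header. If it prints nothing while jobs
are running, pestat has nothing to show either.)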
On 7/23/21 1:15 PM, Ole Holm Nielsen wrote:
On 7/23/21 1:07 PM, Diego Zuccato wrote:
Well, Slurm reports the 15-minute load average. I guess users will
have to learn that, because we can't print help information every time.
They'd probably skip reading it anyway...
Actually, I found a bit of unused space below the CPUload heading, so I
On 7/23/21 1:07 PM, Diego Zuccato wrote:
Well, Slurm reports the 15-minute load average. I guess users will
have to learn that, because we can't print help information every time.
They'd probably skip reading it anyway...
Actually, I found a bit of unused space below the CPUload heading, so I
Yes, that was the problem. Thanks, everyone, for the help.
Greetings, Riccardo
On Fri, 23 Jul 2021 at 13:04, Ole Holm Nielsen <
ole.h.niel...@fysik.dtu.dk> wrote:
> On 7/23/21 1:00 PM, Diego Zuccato wrote:
> > We answered in parallel :)
> > I usually prefer to avoid modifying system-managed files because system
> > updates could reset 'em.
On 23/07/2021 13:01, Ole Holm Nielsen wrote:
Well, Slurm reports the 15-minute load average. I guess users will
have to learn that, because we can't print help information every time.
They'd probably skip reading it anyway...
Actually, I found a bit of unused space below the CPUload heading, so I
On 7/23/21 1:00 PM, Diego Zuccato wrote:
We answered in parallel :)
I usually prefer to avoid modifying system-managed files because system
updates could reset 'em. Since systemd allows overrides, I chose to use
'em :)
I agree with you! The permanent fix will change those Systemd files in
2
On 7/23/21 12:43 PM, Ole Holm Nielsen wrote:
On 7/23/21 12:36 PM, Diego Zuccato wrote:
I believe that slurmd reports the 15-minute CPU load average to
slurmctld only. So you already have this information.
Yup. It's just unexpected: if you don't know, you run pestat and see
that an idle node does have a very high load :)
We answered in parallel :)
I usually prefer to avoid modifying system-managed files because system
updates could reset 'em. Since systemd allows overrides, I chose to use
'em :)
On 23/07/2021 12:52, Ole Holm Nielsen wrote:
On 7/23/21 12:29 PM, Riccardo Sucapane wrote:
I am using Slurm as a workload manager on a system
Hi Riccardo.
I've had a similar problem (slurm.conf is served via an NFS share). I
just modified the slurmd unit:
# systemctl edit slurmd
[Unit]
Requires=network-online.target
After=home.mount
HIH
Diego
On 23/07/2021 12:29, Riccardo Sucapane wrote:
Hello everyone,
I am using Slurm as a workload manager on a system
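(For anyone applying Diego's fix by hand: "systemctl edit slurmd" writes
a drop-in rather than touching the packaged unit file, so the equivalent,
assuming slurm.conf lives under an NFS-mounted /home as in his setup, is:

# /etc/systemd/system/slurmd.service.d/override.conf
[Unit]
# don't start slurmd before the network is actually up
Requires=network-online.target
# and not before the NFS mount serving slurm.conf is in place
After=home.mount

Run "systemctl daemon-reload" afterwards if you create the file manually;
"systemctl edit" reloads for you.)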
On 7/23/21 12:29 PM, Riccardo Sucapane wrote:
I am using Slurm as a workload manager on a system
with a master and 3 nodes.
The operating system used is the recent Rocky Linux 8.4, while for
Slurm, version 20.11.8 from the EPEL repository is used.
Everything works correctly, and when the system is started the command
"syste
Hi Diego,
On 7/23/21 12:36 PM, Diego Zuccato wrote:
I believe that slurmd reports the 15-minute CPU load average to
slurmctld only. So you already have this information.
Yup. It's just unexpected: if you don't know, you run pestat and see that
an idle node does have a very high load :)
My
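(To see the exact figure slurmctld has recorded for a node, scontrol
exposes it directly; "node001" is a placeholder name:

$ scontrol show node node001 | grep -o 'CPULoad=[^ ]*'

The CPULoad field is the same value pestat displays.)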
Hi Loris.
On 23/07/2021 09:05, Loris Bennett wrote:
We use both Zabbix and pestat. Zabbix gives us general information on
the state of the nodes and file systems, and we have added some Slurm
metrics, such as number of jobs pending, amount of memory pending,
number of GPUs pending, etc.
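(A rough sketch of how such pending-resource numbers can be pulled from
Slurm for an external monitor; the aggregation step is an assumption
about their setup, not part of Zabbix itself:

$ squeue -h -t pending | wc -l     # number of jobs pending
$ squeue -h -t pending -o "%m"     # memory requested per pending job, in MB (sum externally)

Each command's output can then be fed to a Zabbix item or similar.)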
Hi Ole,
Ole Holm Nielsen writes:
> Hi Loris,
>
> On 7/23/21 9:05 AM, Loris Bennett wrote:
>> We use both Zabbix and pestat. Zabbix gives us general information on
>> the state of the nodes and file systems, and we have added some Slurm
>> metrics, such as number of jobs pending, amount of memory pending,
Hi Loris,
On 7/23/21 9:05 AM, Loris Bennett wrote:
We use both Zabbix and pestat. Zabbix gives us general information on
the state of the nodes and file systems, and we have added some Slurm
metrics, such as number of jobs pending, amount of memory pending,
number of GPUs pending, etc. This ha
Ole Holm Nielsen writes:
> Hi Diego,
>
> On 7/23/21 8:16 AM, Diego Zuccato wrote:
>>> The Configless Slurm (https://slurm.schedmd.com/configless_slurm.html) from
>>> 20.02 makes distribution of slurm.conf really simple.
>> Eager to see it in Debian :)
>
> IMHO, there ought to be a community effort
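(For reference, configless mode takes roughly two steps; the hostname is
a placeholder and the linked page has the authoritative details. On the
controller, add to slurm.conf:

SlurmctldParameters=enable_configless

then start each node's slurmd pointing at the controller instead of
shipping slurm.conf around:

$ slurmd --conf-server slurmctld.example.com

or publish the controller in a DNS SRV record so slurmd finds it without
any flags.)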