Hello, I'm using *OpenHPC* and *Slurm version 18.04*.
I want to know how the CPU and GPU usage time in the 'sreport' results is
described.
*Below is the CPU usage.*
[root@main rpm]# sreport user top start=10/01/20T00:00:00 end=10/01/20T23:59:59 TopCount=50 -t HourPer -t verbose
unknown time format
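The "unknown time format" message is presumably triggered by "-t verbose": per the sreport man page, -t takes only a time format (SecPer, MinPer, HourPer, Seconds, Minutes, Hours or Percent), while verbosity is a separate -v flag. Below is a minimal sketch of that check and a corrected call. The helper name is made up for illustration, the ISO-style dates are a stylistic choice rather than a known fix, and abbreviated format names (which sreport also accepts) are not handled.

```shell
# Hypothetical helper: succeeds only for the time formats documented for "sreport -t".
is_sreport_time_format() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    secper|minper|hourper|seconds|minutes|hours|percent) return 0 ;;
    *) return 1 ;;
  esac
}

fmt=HourPer
# Assumption: sreport is on PATH and accounting storage is configured.
if is_sreport_time_format "$fmt" && command -v sreport >/dev/null 2>&1; then
  # Options first, then the report; -v replaces the invalid "-t verbose".
  sreport -t "$fmt" -v user top start=2020-10-01T00:00:00 end=2020-10-01T23:59:59 TopCount=50
fi
```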
Yes, I am using the icinga2 plugin to do it; I just want to check whether Slurm
has already implemented this.
Thanks
From: 肖正刚
Sent: Tuesday, October 27, 2020 8:19 AM
To: Chaofeng Zhang
Cc: slurm-users@lists.schedmd.com
Subject: Re: [External] Re: Is there a way to get the real-time cpu/memory
usage
Why not use monitoring tools like Prometheus + Grafana or Ganglia?
On Monday, October 26, 2020 at 1:42 PM, Chaofeng Zhang wrote:
> I can see some CPU usage information from the sstat and sacct commands, but
> that is not what I want. I want to see the real-time CPU usage, like the
> Linux top command shows.
>
> Thanks
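To illustrate what "real-time, like top" amounts to on a compute node: node-level monitoring agents typically sample the kernel's CPU counters twice and report the difference. The sketch below does that with /proc/stat. It is node-wide rather than per-job, it is not a Slurm feature, and the function names are made up for illustration.

```shell
# Print "<busy> <total>" jiffies from the aggregate "cpu" line.
# The file argument is optional and defaults to /proc/stat.
cpu_sample() {
  awk '/^cpu / {
         total = 0
         for (i = 2; i <= 9; i++) total += $i   # user .. steal
         idle = $5 + $6                         # idle + iowait
         printf "%.0f %.0f\n", total - idle, total
       }' "${1:-/proc/stat}"
}

# Percentage of busy time between two samples: busy1 total1 busy2 total2.
cpu_percent() {
  awk -v b="$(( $3 - $1 ))" -v t="$(( $4 - $2 ))" \
      'BEGIN { if (t > 0) printf "%.1f\n", 100 * b / t; else print "0.0" }'
}

if [ -r /proc/stat ]; then
  s1=$(cpu_sample); sleep 1; s2=$(cpu_sample)
  # Word splitting of $s1 and $s2 is intentional: each holds two numbers.
  echo "node CPU busy: $(cpu_percent $s1 $s2)%"
fi
```

Running this in a loop on each node gives a crude top-like view; the monitoring stacks mentioned above collect the same kind of counters and add storage and graphing.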
On 22/10/20 12:56, Diego Zuccato wrote:
> 2) Is the shared memory accounted as belonging to the process and
> enforced accordingly by cgroups?
According to some preliminary tests, it seems it's not enforced. Or
maybe I haven't configured cgroups correctly.
Hints?
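One thing that can be checked quickly is whether memory enforcement is switched on at all: ConstrainRAMSpace in cgroup.conf is acted on by the task/cgroup plugin, so TaskPlugin in slurm.conf has to include task/cgroup. The sketch below only reads those two settings. The helper name and the /etc/slurm directory are assumptions, and it says nothing about how the kernel charges shared memory to a cgroup.

```shell
# Hypothetical helper: print the value of KEY from a KEY=VALUE config file.
# Keys match case-insensitively; comment lines and trailing comments are ignored.
conf_value() {
  awk -v k="$1" '
    /^[ \t]*#/ { next }
    {
      eq = index($0, "=")
      if (eq == 0) next
      name = substr($0, 1, eq - 1)
      gsub(/[ \t]/, "", name)
      if (tolower(name) != tolower(k)) next
      val = substr($0, eq + 1)
      sub(/[ \t]*(#.*)?$/, "", val)   # drop trailing comment and blanks
      sub(/^[ \t]+/, "", val)         # drop leading blanks
      print val
    }' "$2"
}

conf_dir=/etc/slurm   # assumption: adjust to the local installation
if [ -r "$conf_dir/cgroup.conf" ]; then
  echo "ConstrainRAMSpace: $(conf_value ConstrainRAMSpace "$conf_dir/cgroup.conf")"
fi
if [ -r "$conf_dir/slurm.conf" ]; then
  echo "TaskPlugin: $(conf_value TaskPlugin "$conf_dir/slurm.conf")"
fi
```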
--
Diego Zuccato
DIFA - Dip
I have ConstrainRAMSpace=yes in cgroup.conf and PrologFlags=Contain,X11
in slurm.conf.
I just tried
$ squeue
JOBID PARTITION NAME USER   ST TIME       NODES
808   lcnrtx    tcsh raines R  1-22:39:17 1     rtx-03
$ srun --jobid 808 --pty /bin/tcsh
^Csrun:
With debugging on I get:
Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: Reading slurm.conf
file: /etc/slurm/slurm.conf
Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808,
stepid = 4294967295
Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 8
Hello Matt,
Thanks. It was the AccountingStorageEnforce parameter indeed. It is working
as expected after the parameter was set to "limits,qos" in slurm.conf.
Best,
Durai
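For reference, the change described above corresponds to a single line in slurm.conf. This is a sketch of that one setting only; the rest of the file, and whatever restart or reconfigure step was used to apply it, are not shown in the thread.

```
# slurm.conf (fragment)
AccountingStorageEnforce=limits,qos
```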
On Fri, Oct 23, 2020 at 6:12 PM Matthew Brown wrote:
> Yes, I think you need AccountingStorageEnforce to have at least "limi