Re: [slurm-users] slurm sreport CPU/GPU Gres/Tres contact

2020-10-26 Thread 김남연
Hello, I'm using openhpc and slurm 18.04 version. I want to know how the CPU and GPU usage time for 'sreport' results is described. Below is the CPU usage. [root@main rpm]# sreport user top start=10/01/20T00:00:00 end=10/01/20T23:59:59 TopCount=50 -t HourPer -t verbose unknown time format

Re: [slurm-users] [External] Re: Is there a way to get the real-time cpu/memory usage of job processes by using slurm command. (Chaofeng Zhang)

2020-10-26 Thread Chaofeng Zhang
Yes, I am using the icinga2 plugin to do it; I just want to check whether slurm has already implemented it. Thanks From: 肖正刚 Sent: Tuesday, October 27, 2020 8:19 AM To: Chaofeng Zhang Cc: slurm-users@lists.schedmd.com Subject: Re: [External] Re: Is there a way to get the real-time cpu/memory usage

Re: [slurm-users] [External] Re: Is there a way to get the real-time cpu/memory usage of job processes by using slurm command. (Chaofeng Zhang)

2020-10-26 Thread 肖正刚
Why not use monitoring tools like Prometheus + Grafana or Ganglia? On Mon, Oct 26, 2020 at 1:42 PM, Chaofeng Zhang wrote: > I can see some cpu usage information from command sstat and sacct, but > that is not what I want. I want to see the real-time cpu usage like the > linux top command. > > > > Thanks > > > >

Re: [slurm-users] Increasing /dev/shm max size?

2020-10-26 Thread Diego Zuccato
On 22/10/20 12:56, Diego Zuccato wrote: > 2) Is the shared memory accounted as belonging to the process and > enforced accordingly by cgroups? According to some preliminary tests, it seems it's not enforced. Or maybe I haven't configured cgroups correctly. Hints? -- Diego Zuccato DIFA - Dip

Re: [slurm-users] pam_slurm_adopt always claims now active jobs even when they do

2020-10-26 Thread Paul Raines
I have ConstrainRAMSpace=yes in cgroups.conf and PrologFlags=Contain,X11 in slurm.conf. I just tried: $ squeue JOBID PARTITION NAME USER ST TIME NODES 808lcnrtx tcsh raines R 1-22:39:17 1 rtx-03 $ srun --jobid 808 --pty /bin/tcsh ^Csrun:

Re: [slurm-users] pam_slurm_adopt always claims now active jobs even when they do

2020-10-26 Thread Paul Raines
With debugging on I get: Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug: Reading slurm.conf file: /etc/slurm/slurm.conf Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 808, stepid = 4294967295 Oct 26 09:22:33 rtx-03 pam_slurm_adopt[176647]: debug4: found jobid = 8

Re: [slurm-users] Partition QOS limit not being enforced

2020-10-26 Thread Durai Arasan
Hello Matt, Thanks. It was the AccountingStorageEnforce parameter indeed. It is working as expected after the parameter was set to "limits,qos" in slurm.conf. Best, Durai On Fri, Oct 23, 2020 at 6:12 PM Matthew Brown wrote: > Yes, I think you need AccountingStorageEnforce to have at least "limi
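
For reference, the setting described in this message would look roughly like the fragment below in slurm.conf. This is a minimal sketch based only on the value quoted above ("limits,qos"); the rest of the accounting setup (a working accounting storage backend with the relevant QOS and associations defined) is assumed to be in place already and is not shown.

```
# slurm.conf (fragment): enforce association limits and QOS limits,
# using the value reported as working in this thread
AccountingStorageEnforce=limits,qos
```

When this parameter does not include these values, limits attached to a QOS can be defined in the accounting database without being applied to jobs, which would be consistent with the symptom in the thread subject.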