[slurm-users] sreport cluster AccountUtilizationByUser showing utilization of a deleted account

2021-02-09 Thread Chin,David
Hello, all: Details: * slurm 20.02.6 * MariaDB 10.3.17 * RHEL 8.1 I have a fairshare setup. I went through a couple of iterations in testing of manually creating accounts and users that I later deleted before putting in what is to be the production setup. One of the deleted accounts

Re: [slurm-users] Rate Limiting of RPC calls

2021-02-09 Thread Kevin Buckley
On 2021/02/10 09:33, Christopher Samuel wrote: Also getting users to use `sacct` rather than `squeue` to check what state a job is in can help a lot too, it reduces the load on slurmctld. That raises an interesting take on the two utilities, Chris, in that 1) It should be possible to write a

Re: [slurm-users] Rate Limiting of RPC calls

2021-02-09 Thread Christopher Samuel
On 2/9/21 5:08 pm, Paul Edmon wrote: 1. Being on the latest release: A lot of work has gone into improving RPC throughput, if you aren't running the latest 20.11 release I highly recommend upgrading.  20.02 also was pretty good at this. We've not gone to 20.11 on production systems yet, but I

Re: [slurm-users] Rate Limiting of RPC calls

2021-02-09 Thread Paul Edmon
We've hit this before several times. The tricks we've used to deal with this are: 1. Being on the latest release: A lot of work has gone into improving RPC throughput, if you aren't running the latest 20.11 release I highly recommend upgrading.  20.02 also was pretty good at this. 2. max_rpc

[slurm-users] Rate Limiting of RPC calls

2021-02-09 Thread Kota Tsuyuzaki
Hello guys, In our cluster, a new member sometimes accidentally issues too many Slurm RPC calls (sbatch, sacct, etc.), and then slurmctld, slurmdbd, and MySQL may be overloaded. To prevent such a situation, I'm looking for something like an RPC rate limit for users. Does Slurm support such a Ra
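The kind of per-user rate limit asked about here can be illustrated with a token bucket. The following is a minimal sketch only, assuming a site-local wrapper around client commands; it is not a Slurm feature, and the `TokenBucket` class and its parameters are hypothetical names chosen for illustration.

```python
import time


class TokenBucket:
    """Hypothetical client-side limiter: allows `rate` calls per second on
    average, with bursts of up to `capacity` calls. Not part of Slurm."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum number of stored tokens
        self.clock = clock            # injectable clock, useful for testing
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self):
        """Return True if a call may proceed now, False if it should wait."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A wrapper script could call `try_acquire()` before each `sbatch` or `sacct` invocation and sleep or refuse when it returns False. Such a wrapper only restrains cooperative users; it does not protect slurmctld from clients that bypass it.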

[slurm-users] Job Step Output Delay

2021-02-09 Thread Maria Semple
Hello all, I've noticed an odd behaviour with job steps in some Slurm environments. When a script is launched directly as a job, the output is written to file immediately. When the script is launched as a step in a job, output is written in ~30 second chunks. This doesn't happen in all Slurm envir

Re: [slurm-users] sacctmgr archive dump - no dump file produced, and data not purged?

2021-02-09 Thread Chin,David
Well, I seem to have figured it out. This worked and did what I wanted to (I think): $ sudo sacctmgr archive dump Directory=/data/Backups/Slurm PurgeEventAfter=1hour \ PurgeJobAfter=1hour PurgeStepAfter=1hour PurgeSuspendAfter=1hour \ PurgeUsageAfter=1hour Events Jobs Ste

Re: [slurm-users] Job flexibility with cons_tres

2021-02-09 Thread Yair Yarom
Hi, We have a similar configuration, very heterogeneous cluster and cons_tres. Users need to specify the CPU/memory/GPU/time, and it will schedule their job somewhere. Indeed there's currently no guarantee that you won't be left with a node with unusable GPUs because no CPUs or memory are availabl

Re: [slurm-users] Insert separating characters into sacct formated output

2021-02-09 Thread Angelos Ching
Hi Jianwen, I guess the -p or -P flag does what you want? Best regards, Angelos (Sent from mobile, please pardon me for typos and cursoriness.) > On 9/2/2021 21:46, SJTU wrote: > > Hi, > > I am using SLURM 19.05.7. Is it possible to insert user-defined > separating characters like "|" or ","

[slurm-users] Insert separating characters into sacct formated output

2021-02-09 Thread SJTU
Hi, I am using SLURM 19.05.7. Is it possible to insert user-defined separating characters like "|" or "," into sacct's formatted output? That would make it easier to parse fields. Thank you! Jianwen
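The reply in this thread points to sacct's -p and -P flags, which emit "|"-delimited output. As a rough sketch of why that helps with parsing, the snippet below reads such output with Python's csv module. The sample text is made up for illustration (it is not real sacct output from this thread), and `parse_sacct` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample in the shape that `sacct -P --format=JobID,State,Elapsed`
# would be expected to produce: a header row, then one "|"-delimited row per job.
SAMPLE = (
    "JobID|State|Elapsed\n"
    "1001|COMPLETED|00:05:12\n"
    "1001.batch|COMPLETED|00:05:12\n"
    "1002|FAILED|00:00:03\n"
)


def parse_sacct(text, delimiter="|"):
    """Turn delimiter-separated sacct-style text into a list of dicts
    keyed by the header row. Quote characters are treated as plain text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    return list(reader)


rows = parse_sacct(SAMPLE)
for row in rows:
    print(row["JobID"], row["State"], row["Elapsed"])
```

In practice the text would come from running sacct (for example via `subprocess.run`) rather than from a constant. If a field can itself contain "|", a different separator would be needed; check the sacct man page for your version for the available delimiter option.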

Re: [slurm-users] cant start slurmd

2021-02-09 Thread Sopena Ballesteros Manuel
Yes, the problem was that slurmd could not find munge, so I added the prefix to the configure command and now it works. Thank you. From: slurm-users on behalf of Tina Friedrich Sent: Tuesday, February 9, 2021 1:22:14 PM To: slurm-users@lists.schedmd.com Subject: Re: [slurm-

Re: [slurm-users] cant start slurmd

2021-02-09 Thread Tina Friedrich
That looks odd - I mean I think it very straightforwardly wants to tell you that you've configured AuthType=auth/munge and SLURM can't find the auth_munge plugin. I didn't think you could even build SLURM without it finding munge, that's what puzzles me :) What version of SLURM is this? How d

[slurm-users] cant start slurmd

2021-02-09 Thread Sopena Ballesteros Manuel
Dear slurm user community, I am trying to set up a slurm cluster but I am getting the following error: # exec /usr/local/sbin/slurmd -v -D -f /etc/slurm/slurm.conf slurmd: Node configuration differs from hardware: CPUs=10:72(hw) Boards=1:1(hw) SocketsPerBoard=10:2(hw) CoresPerSocket=1:18(hw) Th