[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Feng Zhang via slurm-users
Besides Slurm options, you might also need to set the OpenMP environment variable: export OMP_NUM_THREADS=32 (the core count, not the thread count). The same goes for other similar environment variables if you use any Python libraries. Best, Feng On Wed, Apr 23, 2025 at 3:22 PM Jeffrey Layton via slurm-users < slurm-users@lists.schedmd.com
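A minimal sketch of how that might look inside the batch script (the 32-core count, script layout, and binary name are assumptions, not taken from the original post):

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=32

    # Match the OpenMP thread count to the cores Slurm allocated
    export OMP_NUM_THREADS=32
    # Or, without hard-coding, when --cpus-per-task is set:
    # export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

    ./bt.C.x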

[slurm-users] Re: Slurm webhooks

2025-04-23 Thread Davide DelVento via slurm-users
Thank you all. I had thought of writing my own, but I suspected it would be too large of a time sink. Your nudges (and example script) have convinced me otherwise, and in fact this is what I will do! Thanks again! On Tue, Apr 22, 2025 at 3:12 AM Bjørn-Helge Mevik via slurm-users < slurm-users@list

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Jeffrey Layton via slurm-users
Roger. It's the code that prints out the threads it sees - I bet it is the cgroups. I need to look at how that is configured as well. As for the time, that comes from the code itself. I'm guessing it has a start time and an end time in the code and just takes the difference. But again, this is so

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Michael DiDomenico via slurm-users
The program probably says 32 threads because it's just looking at the box, not at what the Slurm cgroups allow (assuming you're using them) for CPU. I think for an OpenMP program (not OpenMPI) you definitely want the first command, with --cpus-per-task=32. Are you measuring the runtime inside the program or
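A hedged way to see what the job was actually given, rather than what the node has, is to run standard Linux checks from inside the job (the commands below are illustrative and assume a Linux node with GNU coreutils):

    # From within the batch script or an interactive job step:
    grep Cpus_allowed_list /proc/self/status     # cores the affinity/cgroup mask allows
    nproc                                        # CPUs visible to this process
    echo "SLURM_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK:-unset}"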

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Jeffrey Layton via slurm-users
I tried using ntasks and cpus-per-task to get all 32 cores. So I added --ntasks=# --cpus-per-task=N to the sbatch command so that it now looks like: sbatch --nodes=1 --ntasks=1 --cpus-per-task=32
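For reference, a hedged sketch of the two submission variants this thread contrasts (the script name run-npb-omp comes from the earlier post; the second command is an assumption about what the alternative looked like):

    # One task with 32 CPUs for its threads -- the usual choice for an OpenMP code
    sbatch --nodes=1 --ntasks=1 --cpus-per-task=32 run-npb-omp

    # 32 single-CPU tasks -- suits MPI ranks, not OpenMP threads
    sbatch --nodes=1 --ntasks=32 --cpus-per-task=1 run-npb-omp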

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Jeffrey Layton via slurm-users
Roger. I didn't configure Slurm, so let me look at slurm.conf and gres.conf to see if they restrict a job to a single CPU. Thanks On Wed, Apr 23, 2025 at 1:48 PM Michael DiDomenico via slurm-users < slurm-users@lists.schedmd.com> wrote: > without knowing anything about your environment, its reaso
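The settings worth checking are roughly these; the values below are an illustrative sketch, not the poster's actual configuration:

    # slurm.conf (excerpt)
    SelectType=select/cons_tres              # schedule individual cores rather than whole nodes
    SelectTypeParameters=CR_Core             # the consumable resource is the core
    TaskPlugin=task/affinity,task/cgroup     # bind each task to its allocated CPUs

    # cgroup.conf (excerpt)
    ConstrainCores=yes                       # confine the job to the cores it was allocated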

[slurm-users] Re: Job running slower when using Slurm

2025-04-23 Thread Michael DiDomenico via slurm-users
Without knowing anything about your environment, it's reasonable to suspect that your OpenMP program is multi-threaded but Slurm is constraining your job to a single core. Evidence of this should show up when running top on the node and watching the CPU% used by the program. On Wed, Apr 23, 20
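A quick, hedged version of that check while the job is running (the pgrep pattern assumes the NPB binary name; adjust it to the real process):

    # On the compute node while the job runs:
    top -H -p $(pgrep -n -f bt.C)                    # -H lists threads; only one busy thread hints at a 1-core limit
    ps -o pid,nlwp,pcpu,comm -p $(pgrep -n -f bt.C)  # nlwp = thread count, pcpu = total CPU%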

[slurm-users] Job running slower when using Slurm

2025-04-23 Thread Jeffrey Layton via slurm-users
Good morning, I'm running an NPB test, bt.C, which is OpenMP and built using the NV HPC SDK (version 25.1). I run it on a compute node by ssh-ing to the node. It runs in about 19.6 seconds. Then I run the code using a simple job. Command to submit job: sbatch --nodes=1 run-npb-omp. The script run-npb-
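For context, a hedged sketch of the comparison being described (the node name and binary name are assumptions; the contents of run-npb-omp are not shown in the archive):

    # Baseline: interactive run after ssh to the node (~19.6 s reported)
    ssh node001
    time ./bt.C.x

    # Slurm run: the same binary submitted through a minimal job script
    sbatch --nodes=1 run-npb-omp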

[slurm-users] Re: Strange output of sshare

2025-04-23 Thread Frank Schilder via slurm-users
Looks like it's an unclean handling of a "0/0" somewhere when RawUsage=0 for an entire account. The issue disappears as soon as there is some usage. It can be difficult, though, to get some usage into the account with such a fair-share penalty. In our case, we reorganised the accounts and the iss
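A hedged example of how one might inspect this (the account name is invented; the exact columns depend on the Slurm version and whether Fair Tree is in use):

    # Long listing for one account and its users
    sshare -l -A myaccount
    # Watch the RawUsage and FairShare/LevelFS columns: with RawUsage=0 across the
    # whole account, the level fair-share ratio amounts to a 0/0 and can come out
    # as 0 or inf until some usage accrues.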