Besides the Slurm options, you might also need to set the OpenMP environment variable:
export OMP_NUM_THREADS=32 (set it to the number of cores, not hardware threads)
The same goes for other similar environment variables if you use any Python libraries.
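For example, inside the batch script you can tie the thread count to whatever Slurm actually granted (a sketch; assumes the job was submitted with --cpus-per-task so that SLURM_CPUS_PER_TASK is set):

  # Match the OpenMP thread count to the Slurm allocation; fall back to 1 if unset.
  export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}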
Best,
Feng
On Wed, Apr 23, 2025 at 3:22 PM Jeffrey Layton via slurm-users <slurm-users@lists.schedmd.com> wrote:
Thank you all. I had thought of writing my own, but I suspected it would be
too large of a time sink. Your nudges (and example script) have convinced
me otherwise, and in fact this is what I will do!
Thanks again!
On Tue, Apr 22, 2025 at 3:12 AM Bjørn-Helge Mevik via slurm-users <slurm-users@lists.schedmd.com> wrote:
Roger. It's the code that prints out the threads it sees - I bet it is the
cgroups. I need to look at how that is configured as well.
For the time, that comes from the code itself. I'm guessing it has a start
time and an end time in the code and just takes the difference. But again,
this is so
The program probably says 32 threads, because it's just looking at the
box, not what the Slurm cgroups allow (assuming you're using them) for CPUs
I think for an OpenMP program (not OpenMPI) you definitely want the
first command with --cpus-per-task=32
are you measuring the runtime inside the program or
I tried using ntasks and cpus-per-task to get all 32 cores. So I added
--ntasks=# --cpus-per-task=N to the sbatch command so that it now looks
like:
sbatch --nodes=1 --ntasks=1 --cpus-per-task=32
Roger. I didn't configure Slurm, so let me look at slurm.conf and gres.conf
to see if they restrict a job to a single CPU.
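As a quick sketch of what to check (the cgroup.conf path below is an assumption; it may live elsewhere on your install):

  # Is Slurm binding tasks via cgroups, and how are resources selected?
  scontrol show config | grep -iE 'TaskPlugin|SelectType'
  # Does the cgroup plugin constrain jobs to their allocated cores?
  grep -i ConstrainCores /etc/slurm/cgroup.conf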
Thanks
On Wed, Apr 23, 2025 at 1:48 PM Michael DiDomenico via slurm-users <
slurm-users@lists.schedmd.com> wrote:
Without knowing anything about your environment, it's reasonable to
suspect that maybe your OpenMP program is multi-threaded, but Slurm is
constraining your job to a single core. Evidence of this should show
up when running top on the node, watching the CPU% used for the
program
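One quick way to confirm this from inside the job (a sketch; run it via srun or at the top of the batch script):

  # Cores the cgroup actually allows this process to run on
  grep Cpus_allowed_list /proc/self/status
  # Total cores on the node, for comparison
  nproc --all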
On Wed, Apr 23, 20
Good morning,
I'm running an NPB test, bt.C, that is OpenMP and built using the NV HPC SDK
(version 25.1). I run it on a compute node by ssh-ing to the node. It runs
in about 19.6 seconds.
Then I run the code using a simple job:
Command to submit job: sbatch --nodes=1 run-npb-omp
The script run-npb-
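The script text is cut off above, but a minimal sketch of a run-npb-omp script that requests all 32 cores and ties OpenMP to the allocation might look like this (the binary name bt.C.x and the core count are assumptions, not the original script):

  #!/bin/bash
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=32

  # Use however many cores Slurm actually granted; default to 1 if unset.
  export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}

  ./bt.C.x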
Looks like it's an unclean handling of a "0/0" somewhere when RawUsage=0 for an
entire account.
The issue disappears as soon as there is some usage. It can be difficult, though,
to get some usage into the account with such a fair share penalty. In our case,
we reorganised the accounts and the iss