On 4/29/21 1:06 AM, Michael Robbert wrote:
I think that you want to use the output of slurmd -C, but if that isn’t telling
you the truth, then you may not have built slurm with the correct libraries. I
believe that you need to build with hwloc in order to get the most accurate
details of the CPU topology. Make sure you have hwloc-devel installed when you build.
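A quick way to check whether an existing slurmd binary was actually linked
against hwloc (this assumes slurmd is on your PATH; adjust the path otherwise):

  ldd $(which slurmd) | grep hwloc      # should list libhwloc.so.* if built with hwloc
  slurmd -C                             # shows the topology slurmd itself detects

If the first command shows nothing, rebuilding with hwloc-devel present should
get slurmd -C reporting the full socket/core/thread layout.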
I'm working on populating slurm.conf on my nodes, and I noticed that slurmd
-C doesn't agree with lscpu in all cases, and I'm not sure why. Here is
what lscpu reports:
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
And here is what slurmd -C is reporting:
NodeName=devops2
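For reference, a node definition consistent with the lscpu output above would
look roughly like this (RealMemory here is a placeholder, not a value from the
node; CPUs = Sockets x CoresPerSocket x ThreadsPerCore):

  NodeName=devops2 Sockets=1 CoresPerSocket=2 ThreadsPerCore=2 CPUs=4 RealMemory=7800 State=UNKNOWN

Once slurmd -C and lscpu agree, the NodeName line that slurmd -C prints can
usually be pasted into slurm.conf more or less as-is.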
Jürgen,
>> does it work with `srun --overlap ...´ or if you do `export SLURM_OVERLAP=1´
>> before running your interactive job?
I performed testing yesterday using the "--overlap" flag, but that didn't do
anything. Exporting the variable instead, however, seems to have corrected the
issue:
Hi John,
does it work with `srun --overlap ...´ or if you do `export SLURM_OVERLAP=1´
before running your interactive job?
Best regards
Jürgen
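For anyone following along, the two variants being compared are roughly the
following (the interactive command lines are illustrative, not taken from the
original job):

  # per-step flag
  srun --overlap --pty /bin/bash

  # environment variable, picked up by srun for subsequent steps
  export SLURM_OVERLAP=1
  srun --pty /bin/bash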
* John DeSantis [210428 09:41]:
> Hello all,
>
> Just an update, the following URL almost mirrors the issue we're seeing:
> https://github.com/open-mpi/ompi/issues/8378
I haven't experienced this issue here. Then again, we've been using PMIx
for launching MPI for a while now, so we may have circumvented this
particular issue.
-Paul Edmon-
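For reference, moving a launch over to PMIx is typically just a matter of the
--mpi option, assuming both Slurm and the MPI library were built with PMIx
support; srun --mpi=list shows which plugins a given build actually has:

  srun --mpi=list
  srun --mpi=pmix -n 4 ./my_mpi_app   # my_mpi_app stands in for your own binary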
On 4/28/2021 9:41 AM, John DeSantis wrote:
Hello all,
Just an update, the following URL almost mirrors the issue we're seeing:
https://github.com/open-mpi/ompi/issues/8378
But SLURM 20.11.3 was shipped with the fix, and I've verified that the changes
are in the source code.
We don't want to have to downgrade SLURM to 20.02.x, but it seem
Greetings,
I see at https://slurm.schedmd.com/gres.html#GPU_Management that
CUDA_VISIBLE_DEVICES is available for NVIDIA GPUs; what about OpenCL
GPUs?
Is there an OPENCL_VISIBLE_DEVICES?
--
Valerio Bellizzomi
https://www.selroc.systems
http://www.selnet.org
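For context, the CUDA_VISIBLE_DEVICES behaviour that page describes can be
checked from inside an allocation; the index shown depends on which device was
assigned (illustrative command, assumes a gpu GRES is configured):

  srun --gres=gpu:1 env | grep CUDA_VISIBLE_DEVICES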
On 4/28/21 2:48 AM, Sid Young wrote:
I use SaltStack to push out the slurm.conf file to all nodes and do a
"scontrol reconfigure" of the slurmd; this makes management much easier
across the cluster. You can also do service restarts from one point, etc.
Avoid NFS mounts for the config, if the mou
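A minimal version of that push, run from the Salt master, could look something
like this (the salt:// source path is only an example location):

  salt '*' cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
  salt '*' cmd.run 'scontrol reconfigure'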