Re: [slurm-users] [External] slurmd -C vs lscpu - which do I use to populate slurm.conf?

2021-04-28 Thread Ole Holm Nielsen
On 4/29/21 1:06 AM, Michael Robbert wrote: I think that you want to use the output of slurmd -C, but if that isn’t telling you the truth then you may not have built slurm with the correct libraries. I believe that you need to build with hwloc in order to get the most accurate details of the CPU

Re: [slurm-users] [External] slurmd -C vs lscpu - which do I use to populate slurm.conf?

2021-04-28 Thread Michael Robbert
I think that you want to use the output of slurmd -C, but if that isn’t telling you the truth then you may not have built slurm with the correct libraries. I believe that you need to build with hwloc in order to get the most accurate details of the CPU topology. Make sure you have hwloc-devel in
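A minimal sketch of that check on an RPM-based system (the package manager, node name, and memory figure are assumptions; the topology numbers mirror the lscpu output quoted below):

    # install the hwloc headers before building Slurm
    $ yum install -y hwloc hwloc-devel
    # after rebuilding, confirm the topology slurmd detects
    $ slurmd -C
    NodeName=node01 CPUs=4 Boards=1 SocketsPerBoard=1 CoresPerSocket=2 ThreadsPerCore=2 RealMemory=...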

[slurm-users] slurmd -C vs lscpu - which do I use to populate slurm.conf?

2021-04-28 Thread David Henkemeyer
I'm working on populating slurm.conf on my nodes, and I noticed that slurmd -C doesn't agree with lscpu in all cases, and I'm not sure why. Here is what lscpu reports: Thread(s) per core: 2 Core(s) per socket: 2 Socket(s): 1 And here is what slurmd -C is reporting: NodeName=devops2
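For reference, a hedged sketch of how those lscpu figures would map onto a slurm.conf node line (the hostname comes from the post; State is a placeholder):

    # slurm.conf node definition matching the lscpu output above
    NodeName=devops2 Sockets=1 CoresPerSocket=2 ThreadsPerCore=2 State=UNKNOWN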

Re: [slurm-users] OpenMPI interactive change in behavior?

2021-04-28 Thread John DeSantis
Jürgen, >> does it work with `srun --overlap ...´ or if you do `export SLURM_OVERLAP=1´ >> before running your interactive job? I performed testing yesterday while using the "--overlap" flag, but that didn't do anything. But, exporting the variable instead seems to have corrected the issue:
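The reported workaround, sketched as a shell session (the task count, time limit, and MPI binary name are placeholders):

    $ export SLURM_OVERLAP=1
    $ salloc -n 4 --time=00:30:00
    $ mpirun ./my_mpi_app    # launched inside the interactive allocation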

Re: [slurm-users] OpenMPI interactive change in behavior?

2021-04-28 Thread Juergen Salk
Hi John, does it work with `srun --overlap ...´ or if you do `export SLURM_OVERLAP=1´ before running your interactive job? Best regards Jürgen * John DeSantis [210428 09:41]: > Hello all, > > Just an update, the following URL almost mirrors the issue we're seeing: > https://github.com/open-
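The flag form of the same suggestion, as a one-line sketch (the task count and binary name are placeholders):

    $ srun --overlap -n 4 ./my_mpi_app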

Re: [slurm-users] OpenMPI interactive change in behavior?

2021-04-28 Thread Paul Edmon
I haven't experienced this issue here. Then again, we've been using PMIx for launching MPI for a while now, thus we may have circumvented this particular issue. -Paul Edmon- On 4/28/2021 9:41 AM, John DeSantis wrote: Hello all, Just an update, the following URL almost mirrors the issue we're
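A sketch of the PMIx route mentioned here, assuming both Slurm and Open MPI were built against PMIx (the task count and binary are placeholders):

    # launch directly with srun instead of mpirun
    $ srun --mpi=pmix -n 4 ./my_mpi_app
    # or make it the cluster default in slurm.conf
    MpiDefault=pmix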

Re: [slurm-users] OpenMPI interactive change in behavior?

2021-04-28 Thread John DeSantis
Hello all, Just an update, the following URL almost mirrors the issue we're seeing: https://github.com/open-mpi/ompi/issues/8378 But, SLURM 20.11.3 was shipped with the fix. I've verified that the changes are in the source code. We don't want to have to downgrade SLURM to 20.02.x, but it seem

[slurm-users] CUDA vs OpenCL

2021-04-28 Thread Valerio Bellizzomi
Greetings, I see here https://slurm.schedmd.com/gres.html#GPU_Management that CUDA_VISIBLE_DEVICES is available for NVIDIA GPUs, what about OpenCL GPUs? Is there an OPENCL_VISIBLE_DEVICES ? -- Valerio Bellizzomi https://www.selroc.systems http://www.selnet.org
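For context, a hedged gres.conf/job sketch of the CUDA_VISIBLE_DEVICES behaviour the linked page describes for NVIDIA GPUs (device paths are illustrative; it assumes GresTypes=gpu and a matching Gres= entry in slurm.conf):

    # gres.conf
    Name=gpu File=/dev/nvidia0
    Name=gpu File=/dev/nvidia1

    # a job that requests one GPU sees the variable set by Slurm
    $ srun --gres=gpu:1 env | grep CUDA_VISIBLE_DEVICES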

Re: [slurm-users] Questions about adding new nodes to Slurm

2021-04-28 Thread Ole Holm Nielsen
On 4/28/21 2:48 AM, Sid Young wrote: I use SaltStack to push out the slurm.conf file to all nodes and do a "scontrol reconfigure" of the slurmd; this makes management much easier across the cluster. You can also do service restarts from one point, etc. Avoid NFS mounts for the config, if the mou
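A minimal sketch of that push-and-reconfigure flow with Salt (the salt:// source path and config location are assumptions):

    # copy the new config to every node, then have the daemons re-read it
    $ salt '*' cp.get_file salt://slurm/slurm.conf /etc/slurm/slurm.conf
    $ scontrol reconfigure    # run once; slurmctld tells the slurmds to re-read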