Re: [slurm-users] SLES 15 rpmbuild from 20.02.5 tarball wants munge-libs: system munge RPMs don't provide it

2020-10-22 Thread Christopher Samuel
On 10/21/20 6:32 pm, Kevin Buckley wrote: If you install SLES 15 SP1 from the Q2 ISOs so that you have Munge but not the Slurm 18 that comes on the media, and then try to "rpmbuild -ta" against a vanilla Slurm 20.02.5 tarball, you should get the error I did. Ah, yes, that looks like it was a p

Re: [slurm-users] [External] Limit usage outside reservation

2020-10-22 Thread Christopher Samuel
On 10/22/20 12:20 pm, Burian, John wrote: This doesn't help you now, but Slurm 20.11 is expected to have "magnetic reservations," which are reservations that will adopt jobs that don't specify a reservation but otherwise meet the restrictions of the reservation: Magnetic reservations are in

Re: [slurm-users] [External] Limit usage outside reservation

2020-10-22 Thread Prentice Bisbal
On 10/20/20 3:01 AM, SJTU wrote: Hi, We reserved compute node resource on SLURM for specific users and hope they will make good use of it. But in some cases users forgot the '--reservation' parameter in job scripts, competing with other users outside the reserved nodes. Is there a recommended

Re: [slurm-users] [External] Re: Simple free for all cluster

2020-10-22 Thread Prentice Bisbal
I know I'm replying late to this party, but for what it's worth, when this topic was debated at my current employer, the more advanced users (the ones who know how to checkpoint their code, etc.) argued for shorter time limits. They wanted a max. runtime of only 24 hours, whereas the less advan

Re: [slurm-users] [External] Limit usage outside reservation

2020-10-22 Thread Burian, John
This doesn't help you now, but Slurm 20.11 is expected to have "magnetic reservations," which are reservations that will adopt jobs that don't specify a reservation but otherwise meet the restrictions of the reservation: https://slurm.schedmd.com/SLUG20/Roadmap.pdf, search for "magnetic" John

Re: [slurm-users] [pmix] [Cross post - Slurm, PMIx, UCX] Using srun with SLURM_PMIX_DIRECT_CONN_UCX=true fails with input/output error

2020-10-22 Thread Fulcomer, Samuel
Compile slurm without ucx support. We wound up spending quality time with the Mellanox... wait, no, NVIDIA Networking UCX folks to get this sorted out. I recommend using SLURM 20 rather than 19. regards, s On Thu, Oct 22, 2020 at 10:23 AM Michael Di Domenico wrote: > was there ever a result

Re: [slurm-users] [pmix] [Cross post - Slurm, PMIx, UCX] Using srun with SLURM_PMIX_DIRECT_CONN_UCX=true fails with input/output error

2020-10-22 Thread Michael Di Domenico
was there ever a result to this? i'm seeing the same error message, but i'm not adding in all the environ flags like the original poster. On Wed, Jul 10, 2019 at 9:18 AM Daniel Letai wrote: > > Thank you Artem, > > > I've made a mistake while typing the mail, in all cases it was > 'OMPI_MCA_pml

[slurm-users] Increasing /dev/shm max size?

2020-10-22 Thread Diego Zuccato
Hello all. I've been asked to increase /dev/shm max size to 95% of the available memory for DART (DASH RunTime). In an article [*], I read: "It is thus imperative that HPC system administrators are aware of the caveats of shared memory and provide sufficient resources for both /tmp and /dev/shm f