Re: [slurm-users] Slurm fair share priority not being applied

2017-12-07 Thread Lachlan Musicman
On 8 December 2017 at 18:07, Loris Bennett wrote: > Lachlan Musicman writes: > >> > >> Running sshare -l only shows the root user: > >> Account User RawShares NormShares RawUsage NormUsage EffectvUsage > FairShare LevelFS GrpTRESMins TRESRunMins > >> -- --
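When `sshare -l` lists only the root account, the usual cause is that no user associations exist in the accounting database, so fair-share has nothing to compute against. A minimal sketch of creating them with `sacctmgr` follows; the account and user names and share values are placeholders, not taken from the thread:

```shell
# Hedged sketch: create an account and attach a user to it so sshare can
# compute fair-share for that user. "research" and "jdoe" are placeholder
# names; adjust the Fairshare= values to your site policy.
sacctmgr -i add account research Description="research group" Fairshare=100
sacctmgr -i add user jdoe Account=research Fairshare=10
sshare -l    # the new association should now appear below root
```

`-i` answers the confirmation prompts automatically; the `Fairshare=` values feed the RawShares column shown in the output quoted above.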

Re: [slurm-users] [slurm-dev] sacct: --unit applied to NNodes

2017-12-07 Thread Loris Bennett
Hi Chris, Chris Samuel writes: > On Thursday, 7 December 2017 8:09:12 PM AEDT Loris Bennett wrote: > >> Has this issue been addressed in any subsequent versions? > > No change in 17.11.0. > > $ sacct -u csamuel -o jobid,nnodes,ncpus,reqmem,maxrss,elapsed -S 2017-07-01 > --units=G >JobID

Re: [slurm-users] Slurm fair share priority not being applied

2017-12-07 Thread Loris Bennett
Lachlan Musicman writes: > On 1 December 2017 at 20:48, Bruno Santos wrote: > >> Loris, I think you hit the nail on the head. >> >> Running sshare -l only shows the root user: >> Account User RawShares NormShares RawUsage NormUsage EffectvUsage FairShare >> LevelFS GrpTRESMins TRESRunMins

Re: [slurm-users] Slurm fair share priority not being applied

2017-12-07 Thread Lachlan Musicman
On 1 December 2017 at 20:48, Bruno Santos wrote: > Loris, I think you hit the nail on the head. > > Running sshare -l only shows the root user: > Account User RawShares NormShares RawUsage > NormUsage EffectvUsage FairShare LevelFS > GrpTRESMins TR

Re: [slurm-users] [slurm-dev] sacct: --unit applied to NNodes

2017-12-07 Thread Chris Samuel
On Thursday, 7 December 2017 8:09:12 PM AEDT Loris Bennett wrote: > Has this issue been addressed in any subsequent versions? No change in 17.11.0. $ sacct -u csamuel -o jobid,nnodes,ncpus,reqmem,maxrss,elapsed -S 2017-07-01 --units=G JobID NNodes NCPUS ReqMem MaxRSS El
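The bug under discussion is that `--units=G` scales every numeric column, including NNodes, which is a count rather than a byte quantity. As a stopgap, one can leave `--units` off and convert the memory columns manually. The sketch below mirrors the KB-to-GB conversion that `--units=G` performs; the sample value is illustrative, not taken from the thread:

```shell
# Hedged sketch: convert a raw KB figure to GB the way --units=G does,
# leaving count columns such as NNodes untouched. 4194304 KB is a
# placeholder sample value.
kb=4194304
gb=$(awk -v k="$kb" 'BEGIN { printf "%.0f", k / 1024 / 1024 }')
echo "${gb}G"   # prints 4G
```

The same awk expression can be applied per-column to `sacct --parsable2` output in a pipeline, keeping JobID and NNodes verbatim.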

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
Thanks for all your help. You are right, this no longer seems like a Slurm topic. I'll post on the OMPI list if I need more help. ___ Gedaliah Wolosh IST Academic and Research Computing Systems (ARCS) NJIT GITC 2203 973 596 5437 gwol...@njit.edu > On Dec 7, 2017, at 4:31 PM, Artem Polyakov wr

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
You seem to be using a very old OMPI version (the current one is 3.0), so I'd suggest trying a newer one if you can. It also seems like a pure OMPI problem, so the OMPI dev list may be more appropriate for this topic. 2017-12-07 12:53 GMT-08:00 Glenn (Gedaliah) Wolosh : > > > On Dec 7, 2017, at 3:26 PM, Artem

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 3:26 PM, Artem Polyakov wrote: > > Given that ring is working I don't think that it's a PMI problem. > > Can you try running NPB with the tcp btl parameters that I've provided? (I > assume you have a TCP interconnect; let me know if that's not the case). > > Thu, Dec 7, 2017

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
Given that ring is working, I don't think it's a PMI problem. Can you try running NPB with the tcp btl parameters that I've provided? (I assume you have a TCP interconnect; let me know if that's not the case). Thu, Dec 7, 2017 at 12:03, Glenn (Gedaliah) Wolosh : > On Dec 7, 2017, at 1:18 PM, Arte
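The exact flags Artem sent earlier in the thread are truncated above. For reference, the standard way to force the TCP BTL in Open MPI of this vintage is via MCA parameters; a sketch with placeholder process counts and an NPB-style binary name:

```shell
# Hedged sketch: restrict Open MPI 1.10.x to the tcp and self BTLs,
# bypassing any misbehaving fabric transport. The same setting can be
# exported as OMPI_MCA_btl=tcp,self before srun. "bt.B.4" follows the
# NPB binary naming convention and is a placeholder here.
mpirun -np 4 --mca btl tcp,self ./bt.B.4
```

If the benchmark runs cleanly with this restriction, the crash points at a non-TCP transport component rather than at Slurm or PMI.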

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 1:18 PM, Artem Polyakov wrote: > > Couple of things to try to locate the issue: > > 1. To make sure that PMI is not working: have you tried to run something > simple (like hello_world > (https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c >

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
A couple of things to try to locate the issue: 1. To check whether PMI is working: have you tried running something simple, like hello_world (https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c) and ring (https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c)? Please try t
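The two example programs linked above can be built and launched under Slurm roughly as follows; the node and task counts are illustrative, and whether a plain `srun` works depends on how your Open MPI was built against Slurm's PMI:

```shell
# Hedged sketch: fetch, build, and run the linked Open MPI examples.
# URLs point at the raw files of the repository cited in the thread.
curl -O https://raw.githubusercontent.com/open-mpi/ompi/master/examples/hello_c.c
curl -O https://raw.githubusercontent.com/open-mpi/ompi/master/examples/ring_c.c
mpicc hello_c.c -o hello_c
mpicc ring_c.c -o ring_c
srun -N 2 -n 4 ./hello_c   # each rank should report its rank and world size
srun -N 2 -n 4 ./ring_c
```

If hello_c reports a world size of 1 per task instead of 4, the launcher and the MPI library are not agreeing on PMI, which is exactly what this diagnostic is meant to expose.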

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 12:51 PM, Artem Polyakov wrote: > > also please post the output of > $ srun --mpi=list [gwolosh@p-slogin bin]$ srun --mpi=list srun: MPI types are... srun: mpi/mpich1_shmem srun: mpi/mpich1_p4 srun: mpi/lam srun: mpi/openmpi srun: mpi/none srun: mpi/mvapich srun: mpi/mpich

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 12:49 PM, Artem Polyakov wrote: > > Hello, > > what is the value of MpiDefault option in your Slurm configuration file? MpiDefault=none > > 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh: > Hello > > This is using Slurm version - 17.02.6
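For context, `MpiDefault` lives in slurm.conf and selects the MPI plugin `srun` uses when no `--mpi` flag is given; with `none`, PMI wiring is left entirely to the MPI library. A fragment for illustration (the override shown is an example, not a recommendation):

```
# slurm.conf (fragment) -- illustrative only
MpiDefault=none        # per-job override: srun --mpi=openmpi ...
```

Changing this value requires the plugin to actually be present in `srun --mpi=list`, as checked later in the thread.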

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
also please post the output of $ srun --mpi=list When job crashes - is there any error messages in the relevant slurmd.log's or output on the screen? 2017-12-07 9:49 GMT-08:00 Artem Polyakov : > Hello, > > what is the value of MpiDefault option in your Slurm configuration file? > > 2017-12-07 9:

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
Hello, what is the value of MpiDefault option in your Slurm configuration file? 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh : > Hello > > This is using Slurm version - 17.02.6 running on Scientific Linux release > 7.4 (Nitrogen) > > [gwolosh@p-slogin bin]$ module li > > Currently Loaded Mo

[slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
Hello This is using Slurm version - 17.02.6 running on Scientific Linux release 7.4 (Nitrogen) [gwolosh@p-slogin bin]$ module li Currently Loaded Modules: 1) GCCcore/.5.4.0 (H) 2) binutils/.2.26 (H) 3) GCC/5.4.0-2.26 4) numactl/2.0.11 5) hwloc/1.11.3 6) OpenMPI/1.10.3 If I run sr

Re: [slurm-users] Remote submission hosts and security

2017-12-07 Thread Chris Samuel
On Wednesday, 6 December 2017 8:27:08 AM AEDT Jeff White wrote: > I have a need to allow a server which is outside of my cluster access to > submit jobs to the cluster. I can do that easily enough by handing my > Slurm RPMs, config, and munge key to the owner of that server and opening > access i
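A common complement to handing out the munge key is to restrict which addresses can reach the controller at all. A hedged sketch, assuming firewalld on the slurmctld node; the address is a documentation-range placeholder for the external submit host:

```shell
# Hedged sketch: allow only one external host to reach slurmctld's
# default port (6817/tcp). 203.0.113.10 is a placeholder address.
firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="203.0.113.10" port port="6817" protocol="tcp" accept'
firewall-cmd --reload
```

This limits the blast radius if the external server's copy of the munge key is ever compromised, since no other outside host can even open the control port.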

Re: [slurm-users] [slurm-dev] sacct: --unit applied to NNodes

2017-12-07 Thread Loris Bennett
Loris Bennett writes: > Loris Bennett writes: > >> Hi, >> >> With version 16.05.10-2, the option '--units' gets applied incorrectly to >> the column 'NNodes': >> >> $ sacct -u user1234 -o jobid,nnodes,ncpus,reqmem,maxrss,elapsed -S >> 2017-07-01 --units=G >> JobID NNodes NCPUS