Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
Thanks for all your help. You are right, this no longer seems like a Slurm topic. I’ll post on the OMPI list if I need more help. ___ Gedaliah Wolosh IST Academic and Research Computing Systems (ARCS) NJIT GITC 2203 973 596 5437 gwol...@njit.edu > On Dec 7, 2017, at 4:31 PM, Artem Polyakov wr

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
You seem to be using a very old OMPI release (the current one is 3.0), so I'd suggest trying the newer version if you can. It also looks like a pure OMPI problem, so the OMPI dev list may be more appropriate for this topic. 2017-12-07 12:53 GMT-08:00 Glenn (Gedaliah) Wolosh : > > > On Dec 7, 2017, at 3:26 PM, Artem
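A quick way to confirm which Open MPI release the loaded module actually provides (a minimal check, not from the original thread; it assumes the standard Open MPI tools are on PATH):

  # Both commands report the Open MPI release of the build currently on PATH
  $ ompi_info --version
  $ mpirun --version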

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 3:26 PM, Artem Polyakov wrote: > > Given that ring is working I don't think that it's a PMI problem. > > Can you try running NPB with the tcp btl parameters that I've provided? (I > assume you have a TCP interconnect; let me know if that's not the case). > > Thu, Dec 7, 2017

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
Given that ring is working, I don't think it's a PMI problem. Can you try running NPB with the tcp btl parameters that I've provided? (I assume you have a TCP interconnect; let me know if that's not the case). Thu, Dec 7, 2017 at 12:03, Glenn (Gedaliah) Wolosh : > On Dec 7, 2017, at 1:18 PM, Arte
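The exact MCA parameters from the thread are not shown above; as a hedged sketch, the usual way to restrict Open MPI 1.10.x to its TCP BTL when launching through srun is to export the corresponding OMPI_MCA_* environment variable (the benchmark name cg.C.16 below is just a placeholder NPB binary, and --mpi=pmi2 assumes that plugin is available):

  # Restrict Open MPI to the TCP, shared-memory and self BTLs (standard 1.10.x names)
  $ export OMPI_MCA_btl=tcp,self,sm
  # Launch an NPB binary through Slurm
  $ srun --mpi=pmi2 -n 16 ./cg.C.16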

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 1:18 PM, Artem Polyakov wrote: > > A couple of things to try to locate the issue: > > 1. To check whether PMI is actually the problem: have you tried running something > simple, like hello_world > (https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c >

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
A couple of things to try to locate the issue: 1. To check whether PMI is actually the problem: have you tried running something simple, like hello_world ( https://github.com/open-mpi/ompi/blob/master/examples/hello_c.c) and ring ( https://github.com/open-mpi/ompi/blob/master/examples/ring_c.c)? Please try t
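A minimal sketch of building and launching those two examples under Slurm (the --mpi=pmi2 choice is an assumption; it must match a plugin listed by srun --mpi=list and an Open MPI build with PMI support):

  # Build the two Open MPI example programs linked above
  $ mpicc hello_c.c -o hello_c
  $ mpicc ring_c.c -o ring_c
  # Launch them through srun on a few tasks
  $ srun --mpi=pmi2 -n 4 ./hello_c
  $ srun --mpi=pmi2 -n 4 ./ring_c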

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 12:51 PM, Artem Polyakov wrote: > > also please post the output of > $ srun --mpi=list [gwolosh@p-slogin bin]$ srun --mpi=list srun: MPI types are... srun: mpi/mpich1_shmem srun: mpi/mpich1_p4 srun: mpi/lam srun: mpi/openmpi srun: mpi/none srun: mpi/mvapich srun: mpi/mpich

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
> On Dec 7, 2017, at 12:49 PM, Artem Polyakov wrote: > > Hello, > > what is the value of the MpiDefault option in your Slurm configuration file? MpiDefault=none > > 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh >: > Hello > > This is using Slurm version 17.02.6
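For reference, MpiDefault in slurm.conf sets the PMI plugin srun uses when no --mpi option is given; with MpiDefault=none, the plugin has to be requested per job. A sketch of the relevant fragment, assuming the site's Open MPI was built with PMI2 support:

  # slurm.conf fragment: plugin used by srun when --mpi is not specified
  MpiDefault=pmi2
  # Equivalent per-job override, leaving slurm.conf untouched:
  $ srun --mpi=pmi2 -n 4 ./hello_c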

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
also please post the output of $ srun --mpi=list When the job crashes, are there any error messages in the relevant slurmd.log files or output on the screen? 2017-12-07 9:49 GMT-08:00 Artem Polyakov : > Hello, > > what is the value of the MpiDefault option in your Slurm configuration file? > > 2017-12-07 9:
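A small sketch for locating those messages (the log path below is only a common default; the authoritative location is whatever SlurmdLogFile is set to in slurm.conf):

  # Ask Slurm where slurmd writes its log
  $ scontrol show config | grep -i SlurmdLogFile
  # Then, on the compute node that ran the failing step (path assumed):
  $ grep -iE 'error|fail' /var/log/slurmd.log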

Re: [slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Artem Polyakov
Hello, what is the value of the MpiDefault option in your Slurm configuration file? 2017-12-07 9:37 GMT-08:00 Glenn (Gedaliah) Wolosh : > Hello > > This is using Slurm version 17.02.6 running on Scientific Linux release > 7.4 (Nitrogen) > > [gwolosh@p-slogin bin]$ module li > > Currently Loaded Mo

[slurm-users] NAS benchmarks - problem with openmpi, slurm and pmi

2017-12-07 Thread Glenn (Gedaliah) Wolosh
Hello This is using Slurm version 17.02.6 running on Scientific Linux release 7.4 (Nitrogen). [gwolosh@p-slogin bin]$ module li Currently Loaded Modules: 1) GCCcore/.5.4.0 (H) 2) binutils/.2.26 (H) 3) GCC/5.4.0-2.26 4) numactl/2.0.11 5) hwloc/1.11.3 6) OpenMPI/1.10.3 If I run sr
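For context, a minimal batch-script sketch for driving an NPB MPI binary through srun with the modules listed above (the benchmark name cg.C.16, the task count, and the --mpi=pmi2 choice are assumptions, not taken from the original post):

  #!/bin/bash
  #SBATCH --nodes=2
  #SBATCH --ntasks=16
  #SBATCH --time=00:10:00

  # Modules as shown in the "module li" output above
  module load GCC/5.4.0-2.26 OpenMPI/1.10.3
  # NPB MPI binaries are conventionally named <benchmark>.<class>.<nprocs>
  srun --mpi=pmi2 ./cg.C.16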