Re: [slurm-users] [EXT] Re: pmix issue

2020-12-08 Thread Andy Riebs

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Yuengling, Philip J.
From: slurm-users on behalf of Philip Kovacs
Reply-To: Philip Kovacs, Slurm User Community List
Date: Monday, December 7, 2020 at 10:55 AM
To: "a...@candooz.com", Slurm User Community List
Subject: Re: [slurm-users] [EXT] Re: pmix issue
APL external email warning: Verify sender s

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Philip Kovacs
Make sure the .so symlink for the pmix lib is available, not just the versioned .so (e.g. .so.2). Slurm requires that .so symlink. Some distros split packages into base/devel, so you may need to install a pmix-devel package, if available, in order to add the .so symlink (which is considered
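The symlink check described above can be sketched as a small shell helper. This is only an illustration: the function name check_pmix_symlink is hypothetical, and the library directory you pass in (lib vs. lib64 under your pmix prefix) is an assumption that depends on how pmix was built.

```shell
# Hypothetical helper (not part of Slurm or pmix): report whether the
# unversioned libpmix.so exists in the given library directory.
check_pmix_symlink() {
    libdir="$1"
    if [ -e "$libdir/libpmix.so" ]; then
        echo "ok: $libdir/libpmix.so present"
        return 0
    fi
    echo "missing: $libdir/libpmix.so"
    # Show any versioned files so it is clear what the symlink should point to.
    ls -l "$libdir"/libpmix.so.* 2>/dev/null || true
    return 1
}
```

On a system like the one in this thread it might be called as check_pmix_symlink /share/local/pmix-3.2.1/lib, assuming the libraries were installed under lib in that prefix.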

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Andy Riebs
Hi Phil,

From a distance, it feels like there may be a mismatch in Slurm versions (an auxiliary build hiding out somewhere?). You might try something like

    $ which srun; srun which srun

just to confirm that both the submit and execute nodes are running the same Slurm instance.

Andy

On 12/
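The comparison suggested above can be automated roughly as follows. This is a sketch only: compare_srun_paths is a hypothetical helper, and matching paths do not by themselves prove that the binaries are the same build unless the install lives on a shared filesystem.

```shell
# Hypothetical helper: compare the srun path seen on the submit node with
# the path(s) reported from the execute nodes.
compare_srun_paths() {
    local_path="$1"      # e.g. output of: which srun
    remote_paths="$2"    # e.g. output of: srun which srun
    for p in $remote_paths; do
        if [ "$p" != "$local_path" ]; then
            echo "mismatch: submit=$local_path execute=$p"
            return 1
        fi
    done
    echo "match: $local_path"
    return 0
}

# Usage on a cluster (not executed here):
#   compare_srun_paths "$(which srun)" "$(srun which srun)"
```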

Re: [slurm-users] [EXT] Re: pmix issue

2020-12-07 Thread Yuengling, Philip J.
Thanks Andy, Slurm was compiled with --with-pmix=/share/local/pmix-3.2.1. The build of pmix is installed under /share/local/pmix-3.2.1 which is an NFS share across all the nodes. I should also note I used devtoolset-10 (gcc 10) on RHEL7 and confirmed that everything was compiled with that ver