> The problem is with a single, specific node: str957-bl0-03. The same
> job script works when allocated to another node, even with more
> ranks (tested up to 224/4 on mtx-* nodes).
Ahhh... here's where the details help. So it appears that the problem is on a single node, and probably not a general configuration or system problem.

I suggest starting with something like this to help figure out why node bl0-03 is different:

$ sudo ssh str957-bl0-02 lscpu
$ sudo ssh str957-bl0-03 lscpu
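To make any differences stand out, you can also diff the two outputs directly. A minimal sketch, assuming only the same sudo/ssh access as the commands above:

-8<--
#!/bin/bash
# Compare the CPU topology reported by the working node and the suspect node.
good=str957-bl0-02
bad=str957-bl0-03

# Lines prefixed with "<" come from the good node, ">" from the suspect one.
diff <(sudo ssh "$good" lscpu) <(sudo ssh "$bad" lscpu)

# Also worth a look: the per-CPU table and which CPUs are actually online.
sudo ssh "$bad" "lscpu -e; cat /sys/devices/system/cpu/online"
-8<--

Differences in the number of online CPUs, threads per core, or NUMA layout would be the first things to look for, since the failure appears exactly when the rank count reaches the node's 32 hardware threads.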
Andy

-----Original Message-----
From: Diego Zuccato [mailto:diego.zucc...@unibo.it]
Sent: Tuesday, October 6, 2020 3:13 AM
To: Riebs, Andy <andy.ri...@hpe.com>; Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Segfault with 32 processes, OK with 30 ???

On 05/10/20 14:18, Riebs, Andy wrote:

Thanks for considering my query.

> You need to provide some hints! What we know so far:

> 1. What we see here is a backtrace from (what looks like) an Open MPI/PMI-x
> backtrace.
Correct.

> 2. Your decision to address this to the Slurm mailing list suggests that you
> think that Slurm might be involved.
At least I couldn't replicate it when launching manually (it always says "no slots available" unless I use mpirun -np 16 ...). I'm no MPI expert (actually less than a noob!), so I can't rule out that it's unrelated to Slurm. I mostly hope that on this list I can find someone with enough experience with both Slurm and MPI.

> 3. You have something (a job? a program?) that segfaults when you go from 30
> to 32 processes.
Multiple programs, actually.

> a. What operating system?
Debian 10.5. The only extension is PBIS-Open, to authenticate users from AD.

> b. Are you seeing this while running Slurm? What version?
18.04, Debian packages.

> c. What version of Open MPI?
openmpi-bin/stable,now 3.1.3-11 amd64

> d. Are you building your own PMI-x, or are you using what's provided by Open
> MPI and Slurm?
Using Debian packages.

> e. What does your hardware configuration look like -- particularly, what cpu
> type(s), and how many cores/node?
The node uses dual Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz for a total of 32 threads (hyperthreading is enabled: 2 sockets, 8 cores per socket, 2 threads per core).

> f. What does your Slurm configuration look like (assuming you're seeing this
> with Slurm)? I suggest purging your configuration files of node names and IP
> addresses, and including them with your query.
Here it is:
-8<--
SlurmCtldHost=str957-cluster(*.*.*.*)
AuthType=auth/munge
CacheGroups=0
CryptoType=crypto/munge
#DisableRootJobs=NO
EnforcePartLimits=YES
JobSubmitPlugins=lua
MpiDefault=none
MpiParams=ports=12000-12999
ReturnToService=2
SlurmctldPidFile=/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
TmpFS=/mnt/local_data/
UsePAM=1
GetEnvTimeout=20
InactiveLimit=0
KillWait=120
MinJobAge=300
SlurmctldTimeout=20
SlurmdTimeout=30
FastSchedule=0
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityFlags=MAX_TRES
PriorityType=priority/multifactor
PreemptMode=CANCEL
PreemptType=preempt/partition_prio
AccountingStorageEnforce=safe,qos
AccountingStorageHost=str957-cluster
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStoragePort=6819
#AccountingStorageTRES=
AccountingStorageType=accounting_storage/slurmdbd
#AccountingStorageUser=
AccountingStoreJobComment=YES
AcctGatherNodeFreq=300
ClusterName=oph
JobCompLoc=/var/spool/slurm/jobscompleted.txt
JobCompType=jobcomp/filetxt
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
NodeName=DEFAULT Sockets=2 ThreadsPerCore=2 State=UNKNOWN
NodeName=str957-bl0-0[1-2] CoresPerSocket=6 Feature=ib,blade,intel
NodeName=str957-bl0-0[3-5] CoresPerSocket=8 Feature=ib,blade,intel
NodeName=str957-bl0-[15-16] CoresPerSocket=4 Feature=ib,nonblade,intel
NodeName=str957-bl0-[17-18] CoresPerSocket=6 ThreadsPerCore=1 Feature=nonblade,amd
NodeName=str957-bl0-[19-20] Sockets=4 CoresPerSocket=8 ThreadsPerCore=1 Feature=nonblade,amd
NodeName=str957-mtx-[00-15] CoresPerSocket=14 Feature=ib,nonblade,intel
-8<--

> g. What does your command line look like? Especially, are you trying to run
> 32 processes on a single node? Spreading them out across 2 or more nodes?
The problem is with a single, specific node: str957-bl0-03. The same job script works when allocated to another node, even with more ranks (tested up to 224/4 on mtx-* nodes).

> h. Can you reproduce the problem if you substitute `hostname` or `true` for
> the program in the command line? What about a simple MPI-enabled "hello
> world"?
I'll try ASAP with a simple 'hostname', but I expect it to work. The original problem is with a complex program run by a user. To try to debug the issue I'm using what I think is the simplest MPI program possible:
-8<--
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

#define MASTER 0

int main(int argc, char *argv[])
{
    int numtasks, taskid, len;
    char hostname[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    // int provided = 0;
    // MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    // printf("MPI provided threads: %d\n", provided);

    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &taskid);

    if (taskid == MASTER)
        printf("This is an MPI parallel code for Hello World with no communication\n");
    // MPI_Barrier(MPI_COMM_WORLD);

    MPI_Get_processor_name(hostname, &len);
    printf("Hello from task %d on %s!\n", taskid, hostname);
    if (taskid == MASTER)
        printf("MASTER: Number of MPI tasks is: %d\n", numtasks);

    MPI_Finalize();
    printf("END OF CODE from task %d\n", taskid);
    return 0;
}
-8<--
And I got failures with it, too.
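For completeness, a minimal submission pinned to the suspect node would look something like the sketch below. This is only a sketch, not the user's actual job script: the file name mpi_hello.c is a placeholder for the test program above, and the launch line may need adapting to the local Open MPI/Slurm packages.

-8<--
#!/bin/bash
#SBATCH --job-name=mpi-hello-test
#SBATCH --nodelist=str957-bl0-03   # pin the job to the suspect node
#SBATCH --nodes=1
#SBATCH --ntasks=32                # 32 triggers the segfault, 30 does not
#SBATCH --output=hello-%j.log

# Build the hello-world test above with Open MPI's wrapper compiler.
mpicc -O2 -o mpi_hello mpi_hello.c

# Launch through Open MPI's mpirun, which detects the Slurm allocation and
# starts one rank per allocated task; with MpiDefault=none, a bare
# "srun ./mpi_hello" would not provide PMI to the ranks.
mpirun ./mpi_hello
-8<--

Re-running the same script with --ntasks=30, and then with --nodelist=str957-bl0-02, should separate a rank-count problem from a node-specific one.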
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786