Thanks for the answer.

On 05/06/20 15:29, Riebs, Andy wrote:
> I'm *guessing* that you are tripping over the use of "--tasks 32" on a
> heterogeneous cluster,

If you mean that using "--tasks 32" trips the use of a second node, then
no. The node does have two AMD Opteron 6274 CPUs.

> though your comment about the node without InfiniBand troubles me. If
> you drain that node, or exclude it in your command line, that might
> correct the problem. I wonder if OMPI and PMIx have decided that IB is
> the way to go, and are failing when they try to set up on the node
> without IB.

The job uses a single node. On another node (identical HW: they're two
servers in 1U) the same job works with 32 tasks. Nodes are configured
via a script, so the config should be exactly the same, but maybe
something fell out of sync (continuous updates without a reinstall
since Debian 8!). I couldn't find anything obviously different, though.

> If that's not it, I'd try
> 0. Check sacct for the node lists for the successful and unsuccessful
>    runs -- a problem node might jump out.
> 1. Running your job with explicit node lists. Again, you may find a
>    problem node this way.

I already ran it with an explicit nodelist targeting the problematic
node: the point is to identify and resolve the problem, not to work
around it and leave a node unused...

> p.s. If this doesn't fix it, please include the Slurm and OMPI
> versions, and a copy of your slurm.conf file (with identifying
> information like node names removed) in your next note to this list.

I'm using the Debian-packaged versions:
  slurm-client/stable,stable,now 18.08.5.2-1+deb10u1 amd64
  openmpi-bin/stable,now 3.1.3-11 amd64

slurm.conf (nodes and partitions omitted):
-8<--
SlurmCtldHost=str957-cluster(#.#.#.#)
AuthType=auth/munge
CacheGroups=0
CryptoType=crypto/munge
EnforcePartLimits=YES
MpiDefault=none
MpiParams=ports=12000-12999
ProctrackType=proctrack/cgroup
PrologSlurmctld=/etc/slurm-llnl/SlurmCtldProlog.sh
ReturnToService=2
SlurmctldPidFile=/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
TmpFS=/mnt/local_data/
UsePAM=1
GetEnvTimeout=20
InactiveLimit=0
KillWait=120
MinJobAge=300
SlurmctldTimeout=20
SlurmdTimeout=30
Waittime=10
FastSchedule=0
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityType=priority/multifactor
PreemptMode=CANCEL
PreemptType=preempt/partition_prio
AccountingStorageEnforce=safe
AccountingStorageHost=str957-cluster
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreJobComment=YES
AcctGatherNodeFreq=300
ClusterName=oph
JobCompLoc=/var/spool/slurm/jobscompleted.txt
JobCompType=jobcomp/filetxt
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
-8<--

I've had a similar problem while adding new nodes in a new partition. I
(probably) "solved" it by adding the line
  mtl = psm2
to /etc/openmpi/openmpi-mca-params.conf. But those were nodes with IB.
Since I'm quite ignorant about the whole MPI and IB ecosystem, it's
mostly guesswork...

-- 
Diego Zuccato
Servizi Informatici
Dip. di Fisica e Astronomia (DIFA) - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
mail: diego.zucc...@unibo.it
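
[Editor's note: a minimal sketch of the checks Andy suggested, for anyone
following the thread. The job IDs, node name and program below are
placeholders, not values from the original post.]
-8<--
# compare where a successful and a failing run landed
sacct -j 1234,1235 --format=JobID,State,ExitCode,NodeList%30

# pin the job to the suspect node with 32 tasks
srun --nodelist=node01 --ntasks=32 ./my_mpi_program
-8<--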
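
[Editor's note: the MCA tweak mentioned above as the poster described it
for the IB nodes, plus an untested guess at what a node without IB might
use instead; the commented line is an assumption, not part of the
original post.]
-8<--
# /etc/openmpi/openmpi-mca-params.conf on the nodes with IB
mtl = psm2

# untested guess for a node without IB: restrict to TCP + shared memory
# btl = tcp,self,vader
-8<--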
> I'm *guessing* that you are tripping over the use of "--tasks 32" on a > heterogeneous cluster, If you mean that using "--tasks 32" trips the use of a second node, then no. The node does have two AMD Opteron 6274 . > though your comment about the node without InfiniBand troubles me. If you > drain that node, or exclude it in your command line, that might correct the > problem. I wonder if OMPI and PMIx have decided that IB is the way to go, and > are failing when they try to set up on the node without IB. The job uses a single node. On another node (identical HW: they're two servers in 1U) the same job works with 32 tasks. Nodes are configured via a script, so the config should be exactly the same, but maybe something fell out of sync (continuous updates w/o reinstall since Debian 8!). But I could't find anything obviously different. > If that's not it, I'd try > 0. Check sacct for the node lists for the successful and unsuccessful runs -- > a problem node might jump out. > 1. Running your job with explicit node lists. Again, you may find a problem > node this way. I already run it with explicit nodelist to address the problematic node to try to identify and resolve the problem, not to avoid it leaving a node unused... > p.s. If this doesn't fix it, please include the Slurm and OMPI versions, and > a copy of your slurm.conf file (with identifying information like node names > removed) in your next note to this list. I'm using Debian-packaged versions: slurm-client/stable,stable,now 18.08.5.2-1+deb10u1 amd64 openmpi-bin/stable,now 3.1.3-11 amd64 slurm.conf (nodes and partitions omitted): -8<-- SlurmCtldHost=str957-cluster(#.#.#.#) AuthType=auth/munge CacheGroups=0 CryptoType=crypto/munge EnforcePartLimits=YES MpiDefault=none MpiParams=ports=12000-12999 ProctrackType=proctrack/cgroup PrologSlurmctld=/etc/slurm-llnl/SlurmCtldProlog.sh ReturnToService=2 SlurmctldPidFile=/run/slurmctld.pid SlurmctldPort=6817 SlurmdPidFile=/run/slurmd.pid SlurmdPort=6818 SlurmdSpoolDir=/var/lib/slurm/slurmd SlurmUser=slurm StateSaveLocation=/var/lib/slurm/slurmctld SwitchType=switch/none TaskPlugin=task/cgroup TmpFS=/mnt/local_data/ UsePAM=1 GetEnvTimeout=20 InactiveLimit=0 KillWait=120 MinJobAge=300 SlurmctldTimeout=20 SlurmdTimeout=30 Waittime=10 FastSchedule=0 SchedulerType=sched/backfill SchedulerPort=7321 SelectType=select/cons_res SelectTypeParameters=CR_Core_Memory PriorityType=priority/multifactor PreemptMode=CANCEL PreemptType=preempt/partition_prio AccountingStorageEnforce=safe AccountingStorageHost=str957-cluster AccountingStorageType=accounting_storage/slurmdbd AccountingStoreJobComment=YES AcctGatherNodeFreq=300 ClusterName=oph JobCompLoc=/var/spool/slurm/jobscompleted.txt JobCompType=jobcomp/filetxt JobAcctGatherFrequency=30 JobAcctGatherType=jobacct_gather/linux SlurmctldDebug=3 SlurmctldLogFile=/var/log/slurm/slurmctld.log SlurmdDebug=3 SlurmdLogFile=/var/log/slurm/slurmd.log -8<-- I've had a similar problem while adding new nodes in a new partition. I "solved" (probably) by adding a line mtl = psm2 to /etc/openmpi/openmpi-mca-params.conf . But those were nodes with IB. Since I'm quite ignorant about the whole MPI and IB ecosystem, it's mostly guesswork... -- Diego Zuccato Servizi Informatici Dip. di Fisica e Astronomia (DIFA) - Università di Bologna V.le Berti-Pichat 6/2 - 40127 Bologna - Italy tel.: +39 051 20 95786 mail: diego.zucc...@unibo.it