On 08/06/20 12:16, Diego Zuccato wrote:
> I have another partition on these new nodes. 4 identical machines, new
> installation, ConnectX-5 card, dual Intel Xeon 5120 (14 core dual
> thread). No problem running a job requiring 112 threads (on 4 nodes),
> but can't run a single-node job with 56 threads.

Well, actually I pinned down the problem to *one* of the four new nodes (mtx-01). Launching the test code on 56 threads always failed. Once I installed the gdb package to be able to debug it, the problem disappeared! Even though I don't use gdb!
... and there are those who say that gdb is not a great debugger: it catches bugs by just being there, even if you don't use it! :)

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786