Package: libopenmpi3
Version: 4.0.4-2
Followup-For: Bug #965352
Control: affects -1 src:scalapack

UCX seems to be affecting the scalapack build also:

87/96 Test #82: xdgsep ...........................   Passed  109.38 sec
      Start 95: xshseqr
88/96 Test #83: xcgsep ...........................   Passed  101.14 sec
      Start 96: xdhseqr
89/96 Test #96: xdhseqr ..........................***Failed   49.20 sec

 ScaLAPACK Test for PDHSEQR

 epsilon   = 1.1102230246251565E-016
 threshold = 30.000000000000000

 Residual and Orthogonality Residual computed by:

 Residual      = || T - Q^T*A*Q ||_F / ( ||A||_F * eps * sqrt(N) )

 Orthogonality = MAX( || I - Q^T*Q ||_F, || I - Q*Q^T ||_F ) / (eps * N)

 Test passes if both residuals are less then threshold

   N    NB   P    Q   QR Time  CHECK
 ----- --- ---- ---- -------- ------
[1595480623.088652] [monte:1320201:0] sock.c:344 UCX ERROR recv(fd=28) failed: Bad address
[1595480623.199533] [monte:1320189:0] sock.c:344 UCX ERROR sendv(fd=30) failed: Connection reset by peer
[monte:1320189] *** An error occurred in MPI_Bcast
[monte:1320189] *** reported by process [1297350657,2]
[monte:1320189] *** on communicator MPI COMMUNICATOR 5 SPLIT FROM 3
[monte:1320189] *** MPI_ERR_OTHER: known error not in list
[monte:1320189] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[monte:1320189] ***    and potentially your MPI job)