> I have been using a single card on Magny-Cours with no issues at all. You can
interesting. what adjustments have you made to the MPI stack to permit this?
we've had a variety of apps that fail intermittently on high-core nodes.
I have to say I was surprised such a thing came up - not sure whether it's
inherent to IB or a result of the openmpi stack. our usual way to test this
is to gradually reduce the ranks-per-node for the job until it starts to
work. an interesting cosmology code works at 1 ppn but not 3 ppn on our
recent 12c MC, mellanox QDR cluster.

> Using 16 cores per node (8 * 2) seems the 'safe' option, but the 24 cores
> (12 * 2) is better in terms of price per job. Our CFD applications using MPI
> (OpenMPI) may need to do about 15 'MPI_allreduce' calls in one second or
> less, and we will probably be using a pool of 1500 cores.

but will that allreduce be across all 1500 cores? I can get you a scaling
curve for the previously mentioned MC cluster (2.2 GHz) - something like the
timing loop sketched at the end of this mail is what I'd run.

> 2 - I've heard that QLogic behavior is better in terms of QP creation, I

well, they've often bragged about message rates - I'm not sure how related
that is to QP creation. I'd be interested to hear of some experiences, too.

regards, mark hahn.
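
p.s. a rough sketch of the sort of allreduce timing loop I'd use for that
scaling curve - this is not our actual benchmark, and the iteration count
and mpirun flags are just placeholders; run it at a few ranks-per-node
settings and compare the rates:

/* allreduce_rate.c - rough sketch, not the actual benchmark.
 * Times a batch of MPI_Allreduce calls on a single double, roughly the
 * pattern the CFD solver above describes (~15 allreduces/s).
 * build: mpicc -O2 allreduce_rate.c -o allreduce_rate
 * run:   mpirun -np <ranks> --npernode <ppn> ./allreduce_rate
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, i;
    const int iters = 1000;      /* arbitrary; enough to average out noise */
    double in = 1.0, out = 0.0, t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* warm up connections/QPs before timing */
    for (i = 0; i < 10; i++)
        MPI_Allreduce(&in, &out, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++)
        MPI_Allreduce(&in, &out, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("%d ranks: %.1f allreduces/s (avg %.1f us each)\n",
               size, iters / (t1 - t0), 1e6 * (t1 - t0) / iters);

    MPI_Finalize();
    return 0;
}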