All,

What have people achieved on this SKU on a single node using the stock HPL 2.3 source?
I have seen a variety of performance claims, some as high as 90% of its nominal per-node peak of 4.608 TFLOPS. I can now get above 80% of peak, but not higher. I have heard that getting higher values requires special BIOS settings, including turning off SMT, which allows the chip to turbo higher. Remember, this is not the 7542, which has 32 cores per chip and the same bandwidth per socket as the 7742, and which can turbo to over 100% of nominal peak for HPL.

If people have gotten higher single-node numbers, what is your recipe? I am particularly interested in BIOS settings, and maybe surprise settings in the HPL.dat file. Do higher-performing runs require using close to the maximum memory on the node? As this is single-node, I would not expect the choice of MPI to make a difference.

To get to 80% with SMT on in the BIOS, I am building with an older Intel compiler and MKL that still recognizes MKL_DEBUG_CPU_TYPE=5. Running with the number of MPI ranks on the node matched to the number of CCXs seems to give the best numbers. Following AMD's tuning instructions for building with BLIS and GCC does not get me there.

Thanks,
Richard Walsh
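For reference, here is a minimal Python sketch of the arithmetic behind the numbers above: nominal peak, the 80% and 90% targets, an HPL problem size N for a given memory fraction, and a near-square P x Q grid for one rank per CCX. It assumes a dual-socket node with 64 cores per socket, a 2.25 GHz base clock, 16 double-precision FLOPs per cycle per core, and 4 cores per CCX, which is consistent with the 4.608 TFLOPS figure. The 256 GiB of memory, the 80% memory fraction, and NB = 232 are hypothetical placeholders, not recommended settings.

```python
import math

# Assumed node description (consistent with the 4.608 TFLOPS nominal peak).
SOCKETS = 2
CORES_PER_SOCKET = 64
BASE_GHZ = 2.25          # base clock, not turbo
FLOPS_PER_CYCLE = 16     # double precision, per core
CORES_PER_CCX = 4        # assumption for this processor generation


def nominal_peak_gflops(sockets=SOCKETS, cores=CORES_PER_SOCKET,
                        ghz=BASE_GHZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Nominal peak in GFLOPS at the base clock."""
    return sockets * cores * ghz * flops_per_cycle


def hpl_problem_size(mem_gib, fraction=0.8, nb=232):
    """Largest N (a multiple of nb) whose N x N double-precision matrix
    fits in `fraction` of `mem_gib` GiB. All three defaults are hypothetical."""
    usable_bytes = fraction * mem_gib * 2**30
    n = int(math.sqrt(usable_bytes / 8))   # 8 bytes per double
    return n - n % nb


def process_grid(ranks):
    """Most nearly square P x Q factorization of `ranks`, with P <= Q."""
    for p in range(math.isqrt(ranks), 0, -1):
        if ranks % p == 0:
            return p, ranks // p


if __name__ == "__main__":
    peak = nominal_peak_gflops()
    ranks = SOCKETS * CORES_PER_SOCKET // CORES_PER_CCX   # one rank per CCX
    print(f"nominal peak : {peak:.1f} GFLOPS")
    print(f"80% of peak  : {0.80 * peak:.1f} GFLOPS")
    print(f"90% of peak  : {0.90 * peak:.1f} GFLOPS")
    print(f"ranks (CCXs) : {ranks}, grid P x Q = {process_grid(ranks)}")
    print(f"N for 256 GiB: {hpl_problem_size(256)}")   # 256 GiB is a placeholder
```

With one rank per CCX, each rank would presumably run 4 BLAS threads so that all cores are in use. The sketch only sizes the run; it says nothing about which BIOS settings or BLAS library reach a given efficiency.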