I am calculating the theoretical peak (FP64) performance of the Nvidia DGX A100 system.
The A100 datasheet lists FP64 performance as 9.7 TFLOPS per GPU, and each DGX A100 has 8 GPUs, which gives 77.6 TFLOPS. The two AMD EPYC 7742 CPUs contribute 128 cores x 2.25 GHz base clock x 16 FP64 ops/cycle = 4.6 TFLOPS. That gives a total of 82.2 TFLOPS per DGX A100.

Here is my problem: for any DGX A100 based system on top500.org, the numbers just don't add up. For example, Selene has 560 DGX boxes, but its theoretical peak is listed as 79.2 PFLOPS, whereas I expect about 46 PFLOPS (i.e. 82.2 TFLOPS x 560). The same is true for every other DGX-based system listed on the Top500.

What am I missing here?

Thanks!
Harsh Hemani
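For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. All figures are the ones quoted in this post; the count of 8 GPUs per DGX A100 is the value implied by the 82.2 TFLOPS total, and the function and constant names are only illustrative.

```python
# Back-of-the-envelope FP64 peak for a DGX A100 system.
# Figures are taken from the post; names are illustrative only.

GPU_FP64_TFLOPS = 9.7        # A100 datasheet FP64 figure quoted in the post
GPUS_PER_NODE = 8            # assumption implied by the 82.2 TFLOPS total
CPU_CORES_PER_NODE = 128     # two 64-core EPYC 7742 CPUs
CPU_BASE_CLOCK_GHZ = 2.25
FP64_OPS_PER_CYCLE = 16      # per core, as assumed in the post
NODES = 560                  # DGX boxes in Selene, per the post
LISTED_RPEAK_PFLOPS = 79.2   # theoretical peak quoted from top500.org


def cpu_peak_tflops(cores: int, ghz: float, ops_per_cycle: int) -> float:
    """CPU peak in TFLOPS: cores x GHz x ops/cycle gives GFLOPS."""
    return cores * ghz * ops_per_cycle / 1000.0


def node_peak_tflops() -> float:
    """Peak of one DGX A100: all GPUs plus both CPUs."""
    gpu = GPU_FP64_TFLOPS * GPUS_PER_NODE
    cpu = cpu_peak_tflops(CPU_CORES_PER_NODE, CPU_BASE_CLOCK_GHZ,
                          FP64_OPS_PER_CYCLE)
    return gpu + cpu


def system_peak_pflops(nodes: int) -> float:
    """Peak of the whole system in PFLOPS."""
    return node_peak_tflops() * nodes / 1000.0


if __name__ == "__main__":
    expected = system_peak_pflops(NODES)
    print(f"per node: {node_peak_tflops():.1f} TFLOPS")   # 82.2
    print(f"expected: {expected:.2f} PFLOPS")             # 46.04
    print(f"listed / expected: {LISTED_RPEAK_PFLOPS / expected:.2f}")  # 1.72
```

So the listed peak is roughly 1.7 times what this calculation predicts, which is the gap I am asking about.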
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf