I am calculating the theoretical peak (FP64) performance of the Nvidia DGX
A100 system.

The A100 datasheet lists FP64 performance as 9.7 TFLOPS per GPU, so the 8
GPUs in a DGX A100 give 77.6 TFLOPS.
Two AMD 7742 CPUs give 128 cores x 2.25 GHz base clock x 16 FP64 ops /
cycle = 4.6 TFLOPS.
This gives a total of 82.2 TFLOPS per DGX A100.
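
For anyone who wants to check the arithmetic, here is a minimal sketch of the per-node estimate. The constant and function names are only illustrative, and the count of 8 GPUs per DGX A100 is an assumption that is consistent with the 82.2 TFLOPS total.

```python
# Per-node theoretical FP64 peak, using the figures quoted in this post.
GPUS_PER_NODE = 8          # assumption: 8 A100 GPUs per DGX A100
GPU_FP64_TFLOPS = 9.7      # FP64 figure from the A100 datasheet
CPU_CORES = 128            # two 64-core AMD 7742 CPUs
CPU_BASE_GHZ = 2.25        # base clock
CPU_FP64_PER_CYCLE = 16    # FP64 ops per core per cycle


def node_peak_tflops():
    """Return (gpu, cpu, total) theoretical FP64 peak in TFLOPS."""
    gpu = GPUS_PER_NODE * GPU_FP64_TFLOPS
    # cores x GHz x ops/cycle is in GFLOPS; divide by 1000 for TFLOPS
    cpu = CPU_CORES * CPU_BASE_GHZ * CPU_FP64_PER_CYCLE / 1000.0
    return gpu, cpu, gpu + cpu


gpu, cpu, total = node_peak_tflops()
print(f"GPU: {gpu:.1f} TFLOPS, CPU: {cpu:.1f} TFLOPS, total: {total:.1f} TFLOPS")
```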

Here is my problem. For any system with DGX A100 on top500.org, the numbers
just don't add up. For example, Selene has 560 DGX boxes, but its theoretical
peak is listed as 79.2 PFLOPS, whereas I expect it to be 46 PFLOPS (i.e.
82.2 TFLOPS x 560). The same is true for any other DGX-based system listed
on top500. What am I missing here?
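
To put a number on the gap, here is a rough sketch that scales my per-node estimate to Selene and works backwards from the listed figure. The variable names are only illustrative, and all inputs are the values quoted in this post, not anything taken from an official breakdown.

```python
# Compare my expected system peak with the listed theoretical peak.
NODE_PEAK_TFLOPS = 82.2      # my per-DGX estimate (GPUs + CPUs)
NODES = 560                  # DGX A100 boxes in Selene
LISTED_RPEAK_PFLOPS = 79.2   # theoretical peak as listed on top500.org

# Expected system peak: TFLOPS per node x nodes, converted to PFLOPS
expected_pflops = NODE_PEAK_TFLOPS * NODES / 1000.0

# How far off the listed figure is, and what per-node peak it implies
ratio = LISTED_RPEAK_PFLOPS / expected_pflops
implied_node_tflops = LISTED_RPEAK_PFLOPS * 1000.0 / NODES

print(f"expected: {expected_pflops:.1f} PFLOPS")
print(f"listed / expected: {ratio:.2f}")
print(f"implied per-node peak: {implied_node_tflops:.1f} TFLOPS")
```

So the listed figure implies a per-node peak well above my 82.2 TFLOPS estimate, and that difference is what I cannot account for.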

Thanks!

Harsh Hemani
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
