The A100 does 19.5 TFLOPS of FP64 when using its tensor cores; the 9.7 TFLOPS figure is the non-tensor-core FP64 rate.

On Thu, Jun 3, 2021 at 9:08 AM harsh_google lastname <harshscience...@gmail.com> wrote:
> I am calculating the theoretical peak (FP64) performance of the Nvidia DGX
> A100 system.
>
> Now, A100 datasheet lists FP64 performance to be 9.7 TFLOPS.
> Two AMD 7742 CPUs will give 128 cores x 2.25 GHz base clock x 16 FP64 ops
> / cycle = 4.6 TFLOPS.
> This gives a total of 82.2 TFLOPS per DGX-A100.
>
> Here is my problem. For any system with DGX A100 on top500.org, numbers
> just don't add up. For eg: Selene has 560 DGX boxes, but its theoretical
> peak is listed as 79.2 PFLOPS, whereas I expect it should be 46 PFLOPS (ie
> 82.2 TFLOPS x560). The same is true for any other DGX based system listed
> on top500. What am I missing here?
>
> Thanks!
>
> Harsh Hemani
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
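
For reference, the arithmetic in the quoted message can be redone with either GPU figure. The sketch below is only a back-of-the-envelope check, not how top500.org derives its numbers. It assumes 8 GPUs per DGX A100, which is what the quoted 82.2 TFLOPS total implies (8 x 9.7 + 4.6). The function names are made up for illustration.

```python
# Back-of-the-envelope FP64 peak estimate, using only figures from this thread.
# Assumption: 8 A100 GPUs per DGX A100 node (implied by 82.2 = 8 x 9.7 + 4.6).
GPUS_PER_NODE = 8


def cpu_peak_tflops(cores=128, ghz=2.25, flops_per_cycle=16):
    """Peak of the two AMD 7742 CPUs: cores x GHz x FP64 ops per cycle."""
    return cores * ghz * flops_per_cycle / 1000.0  # GFLOPS -> TFLOPS


def node_peak_tflops(gpu_tflops, gpus=GPUS_PER_NODE):
    """Peak of one DGX A100 node: all GPUs plus both CPUs."""
    return gpus * gpu_tflops + cpu_peak_tflops()


def system_peak_pflops(gpu_tflops, nodes=560):
    """Peak of a system built from `nodes` DGX A100 boxes."""
    return nodes * node_peak_tflops(gpu_tflops) / 1000.0  # TFLOPS -> PFLOPS


if __name__ == "__main__":
    for label, gpu_tflops in (("FP64", 9.7), ("FP64 tensor core", 19.5)):
        print(f"{label}: {node_peak_tflops(gpu_tflops):.1f} TFLOPS per node, "
              f"{system_peak_pflops(gpu_tflops):.1f} PFLOPS for 560 nodes")
```

With 9.7 TFLOPS per GPU this reproduces the 82.2 TFLOPS and 46 PFLOPS in the quoted message. With 19.5 TFLOPS per GPU the 560-node estimate comes out above the listed 79.2 PFLOPS rather than below it, so this simple model does not reproduce the top500 figure exactly; the remaining difference presumably comes from details not stated in this thread.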