On 3/19/22 1:28 PM, Mikhail Kuzminsky wrote:
> Just in the HPCG discussion, it was proposed to use the now widely
> used likwid benchmark to estimate memory bandwidth. It gives
> excellent estimates of hardware capabilities.
> Am I right that likwid uses its own optimized assembler code for each
> specific hardware?
likwid has been tuned for some architectures, with some effort made to
optimize for caches and NUMA effects, so on those it can give you an
idea of achievable performance. It is an open source project, and
community contributions for other architectures are welcome. Of course,
assembly optimization is not done for all codes, and in those cases one
needs to rely on the compiler.
> If so, it turns out that for the HPC user, stream gives a more
> important estimate - the application is translated by the compiler
> (they do not write in assembler - except for modules from mathematical
> libraries), and stream will give a real estimate of what will be
> received in the application.
That depends. If a system runs heavily tuned community codes and/or
uses library routines for the cases for which they are optimized, likwid
may be a better guide than stream. If the code is written primarily for
portability and is not tuned for the target platform, then stream may
be more useful.
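
To make the distinction concrete, here is a minimal sketch of a
stream-style triad kernel in plain C. It is not the official STREAM
source, and the function names (triad, triad_bytes, measure_triad_gbs)
are made up for illustration. The point is that the loop is left
entirely to the compiler, so the number it reports reflects what
compiled application code is likely to see, rather than what a
hand-tuned assembly kernel can reach. It assumes a POSIX system for
clock_gettime and counts traffic the way stream does for triad (two
loads and one store per element).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Triad kernel: a[i] = b[i] + scalar * c[i].
   Plain C, so vectorization and store strategy are the compiler's choice. */
static void triad(double *restrict a, const double *restrict b,
                  const double *restrict c, double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Bytes counted per triad pass: two loads and one store per element. */
static double triad_bytes(size_t n)
{
    return 3.0 * (double)n * (double)sizeof(double);
}

/* Monotonic wall-clock time in seconds (POSIX). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

/* Best-of-reps triad bandwidth in GB/s.
   Returns -1.0 if allocation fails or the timer resolution is too coarse. */
static double measure_triad_gbs(size_t n, int reps)
{
    assert(n > 0 && reps > 0);
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }
    /* Touch every page before timing so page faults are not measured. */
    for (size_t i = 0; i < n; i++) {
        a[i] = 0.0; b[i] = 1.0; c[i] = 2.0;
    }
    double best = -1.0;
    volatile double sink = 0.0;   /* keeps the stores from being optimized away */
    for (int r = 0; r < reps; r++) {
        double t0 = now_seconds();
        triad(a, b, c, 3.0, n);
        double dt = now_seconds() - t0;
        sink += a[n / 2];
        if (dt > 0.0 && (best < 0.0 || dt < best))
            best = dt;
    }
    (void)sink;
    free(a); free(b); free(c);
    return best > 0.0 ? triad_bytes(n) / best / 1e9 : -1.0;
}
```

To get a memory (rather than cache) number, call measure_triad_gbs with
arrays several times larger than the last-level cache, for example 10
repetitions over 1 << 26 elements per array, and build with the same
compiler and flags as the application. This sketch is single-threaded
and does no NUMA placement, so it will understate what a full node can
deliver.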
Mikhail Kuzminsky
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf