In message from Scott Atchley <e.scott.atch...@gmail.com> (Sun, 20 Mar
2022 14:52:10 -0400):
On Sat, Mar 19, 2022 at 6:29 AM Mikhail Kuzminsky <k...@free.net> wrote:

If so, then for the HPC user STREAM gives the more relevant
estimate: applications are produced by a compiler (users do not write
in assembler, apart from modules in mathematical libraries), so STREAM
gives a realistic estimate of the bandwidth the application will
actually see.


When vendors advertise STREAM results, they compile the benchmark with non-temporal loads and stores. This means that all memory accesses bypass the processor's caches. If your application of interest does a random walk through memory, with neither temporal nor spatial locality, then using non-temporal loads and stores makes sense and STREAM is irrelevant.
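
To make the effect of non-temporal stores on the reported number concrete, here is a minimal sketch of the byte accounting for the STREAM triad, a[i] = b[i] + q*c[i]. It assumes 8-byte elements and a write-allocate cache, in which an ordinary store first reads the destination line. The function names are hypothetical, and the figures are plain arithmetic, not measurements.

```python
# Byte accounting for the STREAM triad: a[i] = b[i] + q * c[i].
# Assumptions (not from the thread): 8-byte elements and a write-allocate
# cache. All function names here are hypothetical.

def triad_bytes_counted(n, elem_size=8):
    """Bytes STREAM credits per triad pass: two loads and one store per element."""
    return 3 * n * elem_size


def triad_bytes_moved(n, elem_size=8, non_temporal_store=False):
    """Bytes that actually cross the memory bus per triad pass.

    With an ordinary store, a write-allocate cache reads the destination
    line before overwriting it, adding one extra read per element.
    A non-temporal store skips that read.
    """
    streams = 3 if non_temporal_store else 4
    return streams * n * elem_size


def reported_fraction(non_temporal_store, n=10_000_000):
    """Fraction of the real memory traffic that shows up in the STREAM figure."""
    return triad_bytes_counted(n) / triad_bytes_moved(
        n, non_temporal_store=non_temporal_store
    )


if __name__ == "__main__":
    print("ordinary stores:    ", reported_fraction(False))  # 0.75
    print("non-temporal stores:", reported_fraction(True))   # 1.0
```

Under these assumptions, a triad built with ordinary stores reports only three quarters of the traffic the memory system actually carries, which is one reason advertised figures are built with non-temporal stores.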

STREAM was not originally designed for random access to memory. In
that case memory latency is what matters, and it makes more sense to
get a bandwidth estimate from mega-sweep
(https://github.com/UK-MAC/mega-stream).
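
The latency-bound case can be illustrated with a pointer chase, where each load address depends on the result of the previous load, so the accesses cannot be overlapped or prefetched. The sketch below shows that access pattern only. It is not taken from STREAM or from the UK-MAC code, the function names are hypothetical, and in Python the interpreter overhead dominates the run time, so a real latency measurement would apply the same idea in C.

```python
# Dependent-load ("pointer chasing") access pattern, as an illustration only.
# Hypothetical helper names; interpreter overhead dominates any timing here.
import random


def single_cycle_permutation(n, seed=0):
    """Return a list `nxt` in which following i -> nxt[i] visits all n slots
    exactly once before returning to the start (Sattolo's algorithm)."""
    rng = random.Random(seed)
    nxt = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt


def chase(nxt, steps, start=0):
    """Follow the chain for `steps` hops; each index depends on the previous load."""
    i = start
    for _ in range(steps):
        i = nxt[i]
    return i


if __name__ == "__main__":
    n = 1 << 16
    nxt = single_cycle_permutation(n)
    # After exactly n hops the chain is back where it began.
    print(chase(nxt, n) == 0)
```

Because the permutation forms a single cycle, the chain touches every slot in a scattered order, which removes the spatial locality that a sequential STREAM loop relies on.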
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
