Just a note: prefetchnta does not serve the same function as prefetchw.
prefetchw prefetches a cacheline in anticipation of a write, while
prefetchnta is a non-temporal hint: it prefetches data that the program
expects to use *exactly once*, and never again.  If this algorithm actually
wants that behavior, then you might get an improvement by using
prefetchnta.  However, if the algorithm uses the prefetched data more than
once (including by reading other data in the same cacheline), then
prefetchnta has the wrong semantics, and will likely decrease performance.

If the algorithm reuses the data, it should use prefetcht0, prefetcht1, or
prefetcht2.
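
To make the distinction concrete, here is a minimal sketch using the
GCC/Clang builtin __builtin_prefetch(addr, rw, locality).  The function
names, the PREFETCH_AHEAD distance, and the access patterns are all
hypothetical and are not taken from the code under discussion.  On x86,
GCC typically emits prefetchnta for locality 0 and prefetcht0 for
locality 3, but that mapping is up to the compiler and target, so check
the generated assembly rather than taking it on faith.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookahead distance in elements; tune it by benchmarking. */
#define PREFETCH_AHEAD 16

/* Streaming pass: every element is read once and never revisited,
 * so a non-temporal hint (locality 0) matches the access pattern. */
uint64_t sum_once(const uint32_t *data, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 0);
        sum += data[i];
    }
    return sum;
}

/* Reuse: the same buffer is walked several times, so the prefetch on
 * the first pass asks for high temporal locality (locality 3) to keep
 * the lines cached for the later passes. */
uint64_t sum_reused(const uint32_t *data, size_t n, unsigned passes)
{
    uint64_t sum = 0;

    for (unsigned p = 0; p < passes; p++) {
        for (size_t i = 0; i < n; i++) {
            if (p == 0 && i + PREFETCH_AHEAD < n)
                __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
            sum += data[i];
        }
    }
    return sum;
}
```

The prefetch is only a hint, so both functions return the same sums with
or without it; the only thing that changes is cache behavior, which is
why the choice of hint has to be validated by measurement.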

As with any change to this kind of performance-critical code, you should
benchmark before and after to confirm which hint actually helps.

- Josh Triplett
