http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47754

--- Comment #2 from Matthias Kretz <kretz at kde dot org> 2011-02-15 16:31:39 UTC ---
True, the Optimization Reference Manual and the AVX docs are not very specific
about the performance impact of this. But as far as I understand the docs, it
will internally be no slower than an unaligned load + op, but also no faster,
except, of course, where memory fetch latency comes into play. So it's just
about having more registers available, again AFAIU.
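
To make the comparison concrete, here is a minimal sketch in C. It is an
illustration only and not the testcase from this report. The names add8,
add8_avx and add8_scalar are made up for the example, and the assembly in the
comment shows the two forms a compiler might emit, not the output of any
particular GCC version.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: out[0..7] = a[0..7] + b[0..7], no alignment required.
 *
 * The source spells out "unaligned load + op". With AVX enabled, a compiler
 * may keep that shape:
 *
 *     vmovups (%rsi), %ymm1
 *     vaddps  %ymm1, %ymm0, %ymm0
 *
 * or fold the load into the memory operand of the arithmetic instruction,
 * which VEX-encoded instructions permit for unaligned addresses:
 *
 *     vaddps  (%rsi), %ymm0, %ymm0
 *
 * The folded form does not need the temporary register ymm1.
 */
__attribute__((target("avx")))
static void add8_avx(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);   /* unaligned load */
    __m256 vb = _mm256_loadu_ps(b);   /* candidate for folding into vaddps */
    _mm256_storeu_ps(out, _mm256_add_ps(va, vb));
}
#endif

/* Portable fallback so the sketch also runs on machines without AVX. */
static void add8_scalar(const float *a, const float *b, float *out)
{
    for (size_t i = 0; i < 8; ++i)
        out[i] = a[i] + b[i];
}

/* Dispatches to the AVX version when the CPU and OS support it. */
void add8(const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        add8_avx(a, b, out);
        return;
    }
#endif
    add8_scalar(a, b, out);
}
```

Compiling add8_avx with optimization and inspecting the generated assembly
shows which of the two forms a given compiler picks.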

If you want I can try the same testcase on ICC...
