http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59539
Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2013-12-18
                 CC|                              |jakub at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 31463
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31463&action=edit
gcc49-pr59539.patch

This has already been improved for compiler-generated misaligned loads in
r204219 (aka PR47754), but when you use the intrinsics explicitly we don't go
through the movmisalign path, so the UNSPECs are still generated. I think we
shouldn't go through that path anyway: when you write _mm256_loadu_si256, I
doubt you'd expect to get, depending on tuning, say two misaligned 128-bit
loads instead.

With this patch, if the compiler would emit a vmovdqu (or vmovup{s,d}) for the
normal *mov<mode>_internal pattern, it emits that pattern instead of the
UNSPECs, which allows the load to be combined into other insns. If you use the
unaligned loads on something known to be unaligned, it will still not combine
it (it will honor the unaligned load then, because you've requested it
specially).
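For reference, here is a minimal sketch of the kind of source the comment above is talking about: an explicit _mm256_loadu_si256 whose result feeds an arithmetic instruction straight away. The function names (add8, add8_avx2, add8_scalar) are hypothetical and are not taken from the PR or the patch. The scalar fallback and the runtime CPU check are only there so the snippet runs on machines without AVX2; it assumes GCC 4.9 or later (or clang) for the target attribute.

```c
#include <assert.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_GNU 1
#endif

/* Portable reference: add eight 32-bit lanes with wrap-around. */
static void add8_scalar(const int32_t *a, const int32_t *b, int32_t *out)
{
    for (int i = 0; i < 8; i++)
        out[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
}

#ifdef HAVE_X86_GNU
/* Hypothetical example, not from the PR: both inputs are loaded with the
   explicit unaligned-load intrinsic and go directly into an add.  Whether
   one of the loads ends up as a memory operand of the add depends on the
   compiler version and flags; inspect the -O2 -S output to check.  */
__attribute__((target("avx2")))
static void add8_avx2(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi32(va, vb));
}
#endif

/* Use the AVX2 version when the running CPU supports it, otherwise fall
   back to the scalar loop.  The arrays need not be 32-byte aligned.  */
void add8(const int32_t *a, const int32_t *b, int32_t *out)
{
    assert(a != 0 && b != 0 && out != 0);
#ifdef HAVE_X86_GNU
    if (__builtin_cpu_supports("avx2")) {
        add8_avx2(a, b, out);
        return;
    }
#endif
    add8_scalar(a, b, out);
}
```

Compiling this with something like gcc -O2 -S and looking at the body of add8_avx2 is one way to see whether a given compiler keeps both loads as separate vmovdqu instructions or combines one of them into the add.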