https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66917
--- Comment #7 from ktkachov at gcc dot gnu.org ---
(In reply to Richard Biener from comment #6)
> Probably because you access a.u/b.u which is uint64_t and thus the union
> is laid out as having 8 byte alignment?
>
> How do the original GENERIC trees produced look like?

The 003t.original shows:

  typedef union { uint64_t u[2]; uint8_t c[16]; } unionunion { uint64_t u[2]; uint8_t c[16]; };
  union { uint64_t u[2]; uint8_t c[16]; } a;
  union { uint64_t u[2]; uint8_t c[16]; } b;

    union { uint64_t u[2]; uint8_t c[16]; } a;
    union { uint64_t u[2]; uint8_t c[16]; } b;
  memcpy ((void * restrict) &a.c, (const void * restrict) ap, 16);
  memcpy ((void * restrict) &b.c, (const void * restrict) bp, 16);
  a.u[0] = a.u[0] ^ b.u[0];
  a.u[1] = a.u[1] ^ b.u[1];
  memcpy ((void * restrict) outp, (const void * restrict) &a.c, 16);
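
For readers following along, here is a rough C sketch of a source that could plausibly lower to a dump of this shape. It is not the testcase attached to the bug: the names xor16 and block128 and the uint8_t pointer parameter types are assumptions, since the dump only shows ap, bp, outp and the union layout. The static assertions illustrate the point raised in comment #6, namely that the uint64_t member gives the whole union at least uint64_t alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the union in the dump: two 64-bit words overlaid on
   16 bytes.  The typedef name block128 is hypothetical. */
typedef union {
    uint64_t u[2];
    uint8_t c[16];
} block128;

/* The union is exactly 16 bytes and is at least as strictly aligned as
   its uint64_t member, even when only the byte array is named. */
static_assert(sizeof(block128) == 16, "union should be 16 bytes");
static_assert(_Alignof(block128) >= _Alignof(uint64_t),
              "union inherits the alignment of its uint64_t member");

/* Hypothetical function: XOR two 16-byte buffers of arbitrary alignment.
   The memcpy calls move the data into aligned locals, so the 64-bit
   accesses through .u never touch the caller's buffers directly. */
void xor16(uint8_t *outp, const uint8_t *ap, const uint8_t *bp)
{
    block128 a, b;

    memcpy(&a.c, ap, 16);
    memcpy(&b.c, bp, 16);
    a.u[0] = a.u[0] ^ b.u[0];
    a.u[1] = a.u[1] ^ b.u[1];
    memcpy(outp, &a.c, 16);
}
```

Compiling something like this with -fdump-tree-original should give a .original dump comparable to the one quoted above, though the exact output depends on the GCC version and target.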