http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58039
--- Comment #3 from Mikael Pettersson <mikpe at it dot uu.se> ---
Your code performs mis-aligned uint16_t stores, which x86 allows. The vectorizer turns those into larger and still mis-aligned `movdqa' stores, which x86 does not allow, hence the SEGV.

Replace the non-portable mis-aligned stores with portable code like

    #define int2store_little_endian(s,A) memcpy((s), &(A), 2)

or gcc-specific code like

    struct __attribute__((__packed__)) packed_uint16 { uint16_t u16; };
    #define int2store_little_endian(s,A) ((struct packed_uint16*)(s))->u16 = (A)

and then the vectorizer generates large `movdqu' stores, which is pretty much the best you can hope for unless you rewrite the code to avoid mis-aligned stores.
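
For illustration only, here is a minimal sketch of the two replacements written as inline functions rather than macros. The names store_u16_memcpy, store_u16_packed and fill_u16 are hypothetical and do not come from the reported code; the loop is only an assumed example of the kind of code the vectorizer might transform. Both helpers store in host byte order, so the result is little endian only on a little-endian host such as x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable variant: copy the two bytes with memcpy. The compiler knows
   nothing about the alignment of dst, so it must not assume any. */
static inline void store_u16_memcpy(unsigned char *dst, uint16_t value)
{
    memcpy(dst, &value, sizeof value);
}

/* GCC-specific variant: a packed struct has alignment 1, so a store
   through it tells the compiler the address may be mis-aligned. */
struct __attribute__((__packed__)) packed_uint16 { uint16_t u16; };

static inline void store_u16_packed(unsigned char *dst, uint16_t value)
{
    ((struct packed_uint16 *)dst)->u16 = value;
}

/* Hypothetical loop (an assumption, not the reporter's code): writes n
   16-bit values to a destination that may start at an odd address. */
static void fill_u16(unsigned char *dst, const uint16_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        store_u16_memcpy(dst + 2 * i, src[i]);
}
```

Calling fill_u16(buf + 1, src, n) on a byte buffer deliberately starts at an odd address. With either helper the compiler is free to vectorize the loop, but it has to pick unaligned stores when it does so.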