https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81389
--- Comment #4 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to rockeet from comment #3)
> @Martin Liška Yes, my use case is:
>
>     __m128i key128 = { key }; // key is an unsigned char
>     int idx = _mm_cmpestri(key128, 1,
>         *(const __m128i*)(data), // don't require memory align
>         len,
>         _SIDD_UBYTE_OPS|_SIDD_CMP_EQUAL_ORDERED|_SIDD_LEAST_SIGNIFICANT);
>
>     // ....

You should load the unaligned data using one of the loadu intrinsics and pass
that to _mm_cmpestri. When optimizing, it should generate the code you want,
but in a safe way.
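
A minimal sketch of that suggestion, assuming the intended loadu intrinsic is _mm_loadu_si128 and that at least 16 bytes are readable at data. The helper name find_byte16 and the target attribute (used here so the snippet builds without -msse4.2) are illustrative assumptions, not part of the original report.

```c
#include <assert.h>
#include <nmmintrin.h>  /* SSE4.2: _mm_cmpestri and the _SIDD_* flags */

/* Hypothetical helper: index of the first byte equal to key in data[0..len),
   or 16 if there is none. Assumes 0 <= len <= 16 and that 16 bytes are
   readable starting at data; data itself may be unaligned. */
__attribute__((target("sse4.2")))
static int find_byte16(const unsigned char *data, int len, unsigned char key)
{
    assert(len >= 0 && len <= 16);

    /* Put key in byte 0 of the vector; the remaining bytes are zero. */
    __m128i key128 = _mm_cvtsi32_si128(key);

    /* Explicit unaligned load instead of dereferencing a __m128i pointer,
       which would tell the compiler the address is 16-byte aligned. */
    __m128i chunk = _mm_loadu_si128((const __m128i *)data);

    return _mm_cmpestri(key128, 1, chunk, len,
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ORDERED | _SIDD_LEAST_SIGNIFICANT);
}
```

For example, calling find_byte16 on a pointer one byte past the start of a buffer works regardless of the buffer's alignment. Whether the compiler folds the load into the pcmpestri memory operand depends on the optimization level and target flags.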