* H. J. Lu:

> 1. MMX maskmovq and SSE2 maskmovdqu aren't equivalent. We emulate MMX
> maskmovq with SSE2 maskmovdqu by zeroing out the upper 64 bits of the
> mask operand. A warning is issued since invalid memory access may
> happen when bits 64:127 at memory location are unmapped:
>
> xmmintrin.h:1168:3: note: Emulate MMX maskmovq with SSE2 maskmovdqu may
> result in invalid memory access
>  1168 |   __builtin_ia32_maskmovq ((__v8qi)__A, (__v8qi)__N, __P);
>       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Would it be possible to shift the mask according to the misalignment in the address? I think this should allow avoiding crossing a page boundary if the original 64-bit load would not.
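
Roughly what I have in mind, as a sketch only (assuming a 4 KiB page size; the emulated_maskmovq helper and the page-size check are made up for illustration and are not what xmmintrin.h currently does): when the 16-byte maskmovdqu access starting at the pointer would run into the next page but the original 8-byte access would not, move the base back by 8 bytes and shift the data and mask up by 8 byte lanes, so only the original 8 bytes can ever be touched.

  #include <emmintrin.h>
  #include <stdint.h>

  static inline void
  emulated_maskmovq (__m64 data, __m64 mask, char *p)
  {
    const uintptr_t page_size = 4096;      /* assumed page size */
    uintptr_t off = (uintptr_t) p & (page_size - 1);

    /* Widen the 64-bit operands; the upper halves start out zero,
       matching the current emulation.  */
    __m128i d = _mm_set_epi64 (_mm_setzero_si64 (), data);
    __m128i m = _mm_set_epi64 (_mm_setzero_si64 (), mask);

    if (off + 16 > page_size && off + 8 <= page_size)
      {
        /* The extra upper 8 bytes would reach the next page, but the
           original 8-byte access stays on this one: store at P - 8
           and shift data/mask up by 8 byte lanes instead.  */
        d = _mm_slli_si128 (d, 8);
        m = _mm_slli_si128 (m, 8);
        p -= 8;
      }

    _mm_maskmoveu_si128 (d, m, p);
  }

This still leaves the symmetric case where the 16-byte access would fall off the start of a mapping, and of course the real page size would have to come from somewhere, so it is only meant to show the mask-shifting idea.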