================
@@ -2502,10 +2509,25 @@ _mm_mulhi_pu16(__m64 __a, __m64 __b)
 ///    A pointer to a 64-bit memory location that will receive the conditionally
 ///    copied integer values. The address of the memory location does not have
 ///    to be aligned.
-static __inline__ void __DEFAULT_FN_ATTRS_MMX
+static __inline__ void __DEFAULT_FN_ATTRS_SSE2
 _mm_maskmove_si64(__m64 __d, __m64 __n, char *__p)
 {
-  __builtin_ia32_maskmovq((__v8qi)__d, (__v8qi)__n, __p);
+  // This is complex, because we need to support the case where __p is pointing
+  // within the last 15 to 8 bytes of a page. In that case, using a 128-bit
+  // write might cause a trap where a 64-bit maskmovq would not. (Memory
+  // locations not selected by the mask bits might still cause traps.)
+  __m128i __d128 = __anyext128(__d);
+  __m128i __n128 = __zext128(__n);
+  if (((__SIZE_TYPE__)__p & 0xfff) >= 4096-15 &&
+      ((__SIZE_TYPE__)__p & 0xfff) <= 4096-8) {
----------------
jyknight wrote:
I believe it's correct as written: we need to ensure that we cross a potential page-protection boundary in exactly the same situations as we would have originally. Since we're now executing a 16-byte write instead of the specified 8-byte write, we need to back up by 8 bytes when we're at offsets 15, 14, 13, 12, 11, 10, 9, and 8 before the end of the page. At 16 bytes before the end, we're guaranteed to stay within the page for both writes, and at 7 bytes before, we're guaranteed to cross the potential boundary for both.

https://github.com/llvm/llvm-project/pull/96540
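
For illustration only (not part of the patch or the review), here is a minimal standalone C sketch of the offset arithmetic described above. It assumes a 4096-byte page, mirrors the ((__SIZE_TYPE__)__p & 0xfff) range check from the diff, and exhaustively verifies that the backed-up 16-byte store spills into the next page exactly when the original 8-byte maskmovq store would. The helper names (crosses_page_8, crosses_page_16, emulated_store_off) are made up for this sketch.

/* Sketch: verify the page-crossing argument for the maskmovq emulation.
 * Hypothetical helpers, not code from xmmintrin.h. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PAGE_SIZE 4096u

/* Does an 8-byte store at this page offset spill into the next page? */
static int crosses_page_8(size_t off) { return off + 8 > PAGE_SIZE; }

/* Does a 16-byte store at this page offset spill into the next page? */
static int crosses_page_16(size_t off) { return off + 16 > PAGE_SIZE; }

/* Mirror of the range check in the patch: back the 16-byte store up by
 * 8 bytes when the pointer sits 15..8 bytes before the end of the page
 * (the data and mask would be shifted into the high half instead). */
static size_t emulated_store_off(size_t off) {
  if (off >= PAGE_SIZE - 15 && off <= PAGE_SIZE - 8)
    return off - 8;
  return off;
}

int main(void) {
  for (size_t off = 0; off < PAGE_SIZE; ++off) {
    int original = crosses_page_8(off);
    int emulated = crosses_page_16(emulated_store_off(off));
    /* The emulation must touch the next page exactly when the
     * original 8-byte maskmovq store would have. */
    assert(original == emulated);
  }
  puts("16-byte emulation crosses the page boundary exactly when "
       "the original 8-byte maskmovq would");
  return 0;
}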