https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120708
Bug ID: 120708
Summary: ix86_expand_set_or_cpymem ignores MOVE_MAX
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: liuhongt at gcc dot gnu.org
Target Milestone: ---
Target: x86-64

i386 defines

/* Max number of bytes we can move from memory to memory
   in one reasonably fast instruction, as opposed to MOVE_MAX_PIECES which
   is the number of bytes at a time which we can move efficiently.
   MOVE_MAX_PIECES defaults to MOVE_MAX.  */

#define MOVE_MAX \
  ((TARGET_AVX512F \
    && (ix86_move_max == PVW_AVX512 \
        || ix86_store_max == PVW_AVX512)) \
   ? 64 \
   : ((TARGET_AVX \
       && (ix86_move_max >= PVW_AVX256 \
           || ix86_store_max >= PVW_AVX256)) \
      ? 32 \
      : ((TARGET_SSE2 \
          && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
          && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
         ? 16 : UNITS_PER_WORD)))

If either TARGET_SSE_UNALIGNED_LOAD_OPTIMAL or
TARGET_SSE_UNALIGNED_STORE_OPTIMAL is false, MOVE_MAX is defined as
UNITS_PER_WORD.  For -march=atom, both are false.  But
ix86_expand_set_or_cpymem ignores MOVE_MAX.  As a result,
memcpy-vector_loop-1.c and memset-vector_loop-2.c, which are compiled
with -march=atom, are compiled with SSE instructions:

        movdqa  %xmm3, a(%rax)
        movdqa  %xmm2, a+16(%rax)
        movdqa  %xmm1, a+32(%rax)
        movdqa  %xmm0, a+48(%rax)
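
To make the selection logic of the MOVE_MAX macro easier to follow, here is a minimal standalone sketch that models it as a plain C function. It is not GCC code: struct target_model, model_move_max and atom_like_64bit are hypothetical names introduced for illustration, the enum ordering (PVW_NONE < PVW_AVX128 < PVW_AVX256 < PVW_AVX512) is an assumption, and the 8-byte word size assumes a 64-bit target.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's internal target state.  The names
   mirror the operands of the MOVE_MAX macro, but this enum and struct
   are illustrative only and are not GCC's definitions.  */
enum prefer_vector_width { PVW_NONE, PVW_AVX128, PVW_AVX256, PVW_AVX512 };

struct target_model {
  bool avx512f;                      /* models TARGET_AVX512F */
  bool avx;                          /* models TARGET_AVX */
  bool sse2;                         /* models TARGET_SSE2 */
  bool sse_unaligned_load_optimal;   /* models TARGET_SSE_UNALIGNED_LOAD_OPTIMAL */
  bool sse_unaligned_store_optimal;  /* models TARGET_SSE_UNALIGNED_STORE_OPTIMAL */
  enum prefer_vector_width move_max; /* models ix86_move_max */
  enum prefer_vector_width store_max;/* models ix86_store_max */
  int units_per_word;                /* models UNITS_PER_WORD */
};

/* Same decision chain as the MOVE_MAX macro, written as if/return.  */
static int model_move_max(const struct target_model *t)
{
  if (t->avx512f
      && (t->move_max == PVW_AVX512 || t->store_max == PVW_AVX512))
    return 64;
  if (t->avx
      && (t->move_max >= PVW_AVX256 || t->store_max >= PVW_AVX256))
    return 32;
  if (t->sse2
      && t->sse_unaligned_load_optimal
      && t->sse_unaligned_store_optimal)
    return 16;
  return t->units_per_word;
}

/* Assumed model of the -march=atom case described in the report:
   SSE2 is available, but both unaligned tuning flags are false.  */
static struct target_model atom_like_64bit(void)
{
  struct target_model t = {
    .avx512f = false,
    .avx = false,
    .sse2 = true,
    .sse_unaligned_load_optimal = false,
    .sse_unaligned_store_optimal = false,
    .move_max = PVW_NONE,
    .store_max = PVW_NONE,
    .units_per_word = 8,
  };
  return t;
}
```

Under these assumptions, model_move_max applied to atom_like_64bit() yields 8, which is smaller than the 16 bytes moved by each movdqa in the output above.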