https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119704
            Bug ID: 119704
           Summary: x86: partially disobeyed strategy rep-based request for
                    inlined memset
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mjguzik at gmail dot com
  Target Milestone: ---

13.3.0 runs into it, but I also tested on godbolt, which claims to have 15.0.1:

gcc (Compiler-Explorer-Build-gcc-ca4e6e6317ae0ceada8c46ef5db5ece165a6d1c4-binutils-2.42) 15.0.1 20250409 (experimental)

... and got the same result.

I have not verified memcpy, but I suspect it suffers from the same problem.

src:
void zero(char *buf)
{
        __builtin_memset(buf, 0, SIZE);
}

compiled like so:
cc -O2 -DSIZE=48 -mno-sse -mmemset-strategy=rep_byte:256:noalign,libcall:-1:noalign -c zero.c

Given rep_byte I expect rep stosb to be emitted. It does happen for some
sizes, but I'm also seeing regular stores or rep stosl.

For sizes 40 bytes and below this still emits regular stores, *not* the
rep-prefixed op, for example:

0000000000000000 <zero>:
   0:   f3 0f 1e fa             endbr64
   4:   48 c7 07 00 00 00 00    movq   $0x0,(%rdi)
   b:   48 c7 47 08 00 00 00    movq   $0x0,0x8(%rdi)
  12:   00
  13:   48 c7 47 10 00 00 00    movq   $0x0,0x10(%rdi)
  1a:   00
  1b:   48 c7 47 18 00 00 00    movq   $0x0,0x18(%rdi)
  22:   00
  23:   48 c7 47 20 00 00 00    movq   $0x0,0x20(%rdi)
  2a:   00
  2b:   c3                      ret

48 bytes is rep stosl:

0000000000000000 <zero>:
   0:   f3 0f 1e fa             endbr64
   4:   b9 0c 00 00 00          mov    $0xc,%ecx
   9:   31 c0                   xor    %eax,%eax
   b:   f3 ab                   rep stos %eax,%es:(%rdi)
   d:   c3                      ret

64 bytes is rep stosl:

0000000000000000 <zero>:
   0:   f3 0f 1e fa             endbr64
   4:   b9 10 00 00 00          mov    $0x10,%ecx
   9:   31 c0                   xor    %eax,%eax
   b:   f3 ab                   rep stos %eax,%es:(%rdi)
   d:   c3                      ret

65 bytes is rep stosb:

0000000000000000 <zero>:
   0:   f3 0f 1e fa             endbr64
   4:   b9 41 00 00 00          mov    $0x41,%ecx
   9:   31 c0                   xor    %eax,%eax
   b:   f3 aa                   rep stos %al,%es:(%rdi)
   d:   c3                      ret

Given the rep_byte strategy I expect every one of these sizes to use rep
stosb.
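
To avoid recompiling with a different -DSIZE for each case, something like the following might be used to get all of the sizes discussed above into one object file. This is only a sketch: the file name zero_sizes.c, the DEFINE_ZERO macro and the zeroes_exactly/check_all helpers are made up for illustration and are not part of the original test case, and I have not confirmed that every function here compiles to the same code as the single-function reproducer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One function per constant size, so a single object file shows every
   case from the report side by side in the disassembly.  noinline (a
   GCC/Clang attribute) keeps each body separate. */
#define DEFINE_ZERO(n)                                     \
    __attribute__((noinline)) void zero_##n(char *buf)     \
    {                                                      \
        __builtin_memset(buf, 0, (n));                     \
    }

DEFINE_ZERO(40)
DEFINE_ZERO(48)
DEFINE_ZERO(64)
DEFINE_ZERO(65)

/* Hypothetical sanity check: returns 1 if fn cleared exactly the first
   n bytes of a pattern-filled buffer and left the following byte alone.
   n must be smaller than the buffer size. */
int zeroes_exactly(void (*fn)(char *), size_t n)
{
    unsigned char buf[128];

    memset(buf, 0xAA, sizeof buf);
    fn((char *)buf);
    for (size_t i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return buf[n] == 0xAA;
}

/* Correctness is not what the report is about; this only guards the
   harness itself against a typo in the sizes. */
void check_all(void)
{
    assert(zeroes_exactly(zero_40, 40));
    assert(zeroes_exactly(zero_48, 48));
    assert(zeroes_exactly(zero_64, 64));
    assert(zeroes_exactly(zero_65, 65));
}
```

Compiled with the same options as above (cc -O2 -mno-sse -mmemset-strategy=rep_byte:256:noalign,libcall:-1:noalign -c zero_sizes.c) and inspected with objdump -d zero_sizes.o, each zero_N body can then be compared directly against the listings in this report.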