https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102202
Bug ID: 102202
Summary: Inefficient expansion of memset when range is [0,1]
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-*
Take:
void g(int a, char *d)
{
  if (a < 0 || a > 1) __builtin_unreachable();
  __builtin_memset(d, 0, a);
}
----- CUT -----
On x86_64, GCC compiles this to:
g(int, char*):
        .cfi_startproc
        testl   %edi, %edi
        je      .L1
        xorl    %eax, %eax
.L2:
        movl    %eax, %edx
        addl    $1, %eax
        movb    $0, (%rsi,%rdx)
        cmpl    %edi, %eax
        jb      .L2
.L1:
        ret
This is better than what clang/LLVM/ICC generate, but the loop is not needed:
a is either 0 or 1, and the je already jumps around the loop when a is 0, so
a single byte store would be enough for the remaining case.
Here is another example that does not use __builtin_unreachable:
void g(int a, char *d)
{
  __builtin_memset(d, 0, a&1);
}
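
For comparison, here is a minimal sketch of the source-level form one would
hope the expansion matches in both cases: one test and one byte store, with no
loop. It assumes a GCC-compatible compiler (for the builtins). The names
g_memset, g_mask, g_store and check_equivalence are hypothetical and used only
for illustration; this is not a proposed patch or the actual expander output.

```c
#include <assert.h>
#include <string.h>

/* First testcase from this report: the range [0,1] comes from
   __builtin_unreachable.  */
void g_memset(int a, char *d)
{
  if (a < 0 || a > 1) __builtin_unreachable();
  __builtin_memset(d, 0, a);
}

/* Second testcase from this report: the range [0,1] comes from the mask.  */
void g_mask(int a, char *d)
{
  __builtin_memset(d, 0, a & 1);
}

/* Hypothetical hand-written equivalent: a length in [0,1] means
   "store one zero byte or do nothing".  */
void g_store(int len, char *d)
{
  if (len)
    *d = 0;
}

/* Check that the memset forms and the single-store form leave a small
   buffer in the same state.  */
void check_equivalence(void)
{
  for (int a = 0; a <= 1; a++)
    {
      char x[2] = { 'x', 'y' };
      char y[2] = { 'x', 'y' };
      g_memset(a, x);
      g_store(a, y);
      assert(memcmp(x, y, sizeof x) == 0);
    }
  for (int a = -3; a <= 3; a++)
    {
      char x[2] = { 'x', 'y' };
      char y[2] = { 'x', 'y' };
      g_mask(a, x);
      g_store(a & 1, y);
      assert(memcmp(x, y, sizeof x) == 0);
    }
}
```

Comparing the assembly for g_memset and g_store at -O2 should make the extra
loop overhead described above easy to see.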