https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116704
Bug ID: 116704
Summary: Missed optimization: Setting return value to 0 on both
branches of a condition
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: eyalroz1 at gmx dot com
Target Milestone: ---
Consider the following code (in C, or in C++):
int calc_simple(int bottom, int top) {
    int sum = 0;
    for (int number = bottom; number <= top; number++) {
        if (number % 2 == 0) {
            sum += number;
        }
    }
    return sum;
}
With GCC 14.2.0 on x86_64, this compiles to the following (dropping assembler
directive lines beginning with `.`, and using Intel syntax):
calc_simple:
.LFB0:
    cmp edi, esi
    jg .L5
    add esi, 1
    xor eax, eax
.L4:
    lea edx, [rax+rdi]
    test dil, 1
    cmove eax, edx
    add edi, 1
    cmp esi, edi
    jne .L4
    ret
.L5:
    xor eax, eax
    ret
.LFE0:
If I am not mistaken, we can avoid the duplicate "xor eax, eax" and have:
calc_simple:
.LFB0:
    xor eax, eax
    cmp edi, esi
    jg .L5
    add esi, 1
.L4:
    lea edx, [rax+rdi]
    test dil, 1
    cmove eax, edx
    add edi, 1
    cmp esi, edi
    jne .L4
.L5:
    ret
.LFE0:
... saving two instructions (one "xor eax, eax" and one "ret"). This is almost
the same as hoisting an instruction that appears on both sides of a branch to
just before the branch (or rather, to just before the "cmp" which decides where
to branch, since "xor" itself overwrites the flags). I say "almost" because in
the current output the "add esi, 1" comes before the "xor eax, eax", but I'm
guessing that the two are interchangeable and their order is arbitrary.
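For illustration, the two code shapes can be sketched at the C level. This is
only a rough sketch and not compiler output: calc_as_emitted, calc_hoisted and
check_shapes are hypothetical names introduced here, and the instruction
comments are an informal reading of the two listings above. Like the original
loop, the sketches assume top < INT_MAX.

```c
#include <assert.h>

/* The function from the report, unchanged. */
int calc_simple(int bottom, int top) {
    int sum = 0;
    for (int number = bottom; number <= top; number++) {
        if (number % 2 == 0) {
            sum += number;
        }
    }
    return sum;
}

/* Hypothetical: mirrors the control flow of the current output, where the
   result is zeroed separately on each side of the guard. */
int calc_as_emitted(int bottom, int top) {
    int sum;
    if (bottom > top) {             /* cmp edi, esi / jg .L5 */
        sum = 0;                    /* .L5: xor eax, eax */
        return sum;                 /* second ret */
    }
    int end = top + 1;              /* add esi, 1 */
    sum = 0;                        /* fall-through xor eax, eax */
    int number = bottom;
    do {
        if ((number & 1) == 0) {    /* test dil, 1 / cmove eax, edx */
            sum += number;
        }
        number++;                   /* add edi, 1 */
    } while (number != end);        /* cmp esi, edi / jne .L4 */
    return sum;
}

/* Hypothetical: mirrors the proposed output, with the zeroing hoisted above
   the guard and a single exit. */
int calc_hoisted(int bottom, int top) {
    int sum = 0;                    /* the only xor eax, eax */
    if (bottom <= top) {
        int end = top + 1;
        int number = bottom;
        do {
            if ((number & 1) == 0) {
                sum += number;
            }
            number++;
        } while (number != end);
    }
    return sum;                     /* the only ret */
}

/* Hypothetical helper: all three shapes should agree for a given range. */
void check_shapes(int bottom, int top) {
    int expected = calc_simple(bottom, top);
    assert(calc_as_emitted(bottom, top) == expected);
    assert(calc_hoisted(bottom, top) == expected);
}
```

Calling check_shapes on a few ranges, including an empty one such as
check_shapes(5, 3), exercises both sides of the guard.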
Notes:
* I don't know if this should be an rtl-optimization or a tree-optimization
bug.
* I'm only assuming the shorter code is better performance-wise, as the
shortening does not introduce any additional work. With -Os, the
"xor eax, eax" is indeed emitted only once.