https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116704
Bug ID: 116704
Summary: Missed optimization: Setting return value to 0 on both branches of a condition
Product: gcc
Version: 14.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: eyalroz1 at gmx dot com
Target Milestone: ---

Consider the following code (in C, or in C++):

    int calc_simple(int bottom, int top)
    {
        int sum = 0;
        for (int number = bottom; number <= top; number++) {
            if (number % 2 == 0) {
                sum += number;
            }
        }
        return sum;
    }

With GCC 14.2.0 on x86_64, this results in (dropping `.` lines, and using Intel dialect):

    calc_simple:
    .LFB0:
            cmp     edi, esi
            jg      .L5
            add     esi, 1
            xor     eax, eax
    .L4:
            lea     edx, [rax+rdi]
            test    dil, 1
            cmove   eax, edx
            add     edi, 1
            cmp     esi, edi
            jne     .L4
            ret
    .L5:
            xor     eax, eax
            ret
    .LFE0:

If I am not mistaken, we can avoid the two `xor eax, eax` instructions, and have:

    calc_simple:
    .LFB0:
            xor     eax, eax
            cmp     edi, esi
            jg      .L5
            add     esi, 1
    .L4:
            lea     edx, [rax+rdi]
            test    dil, 1
            cmove   eax, edx
            add     edi, 1
            cmp     esi, edi
            jne     .L4
    .L5:
            ret
    .LFE0:

... saving two instructions (one `xor eax, eax` and one `ret`).

This is almost a case of hoisting the same instruction, which appears on both sides of a branch, to just before the branch (or rather, to just before the instruction which decides where to branch). I say "almost" because on the fall-through path the "add esi, 1" comes before the "xor eax, eax", but I'm guessing that the two are interchangeable and only arbitrarily placed one after the other.

Notes:

* I don't know if this should be an rtl-optimization or a tree-optimization bug.
* I'm only assuming the shorter code is better, performance-wise, as the shortening does not involve any explicit introduction of additional work. With -Os, the xor eax, eax is indeed performed only once.
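To make the comparison between the two listings easier to follow, here is a rough C sketch of the control flow of each one. It is only an illustration under stated assumptions: the names `calc_zero_on_both_paths`, `calc_zero_hoisted` and `check_equivalence` are hypothetical and do not appear in the report, the mapping of C statements to instructions is approximate, and `top + 1` assumes `top < INT_MAX` (as the `add esi, 1` in the generated code effectively does).

```c
#include <assert.h>

/* Reference: the function from the report, unchanged. */
int calc_simple(int bottom, int top)
{
    int sum = 0;
    for (int number = bottom; number <= top; number++) {
        if (number % 2 == 0) {
            sum += number;
        }
    }
    return sum;
}

/* Hypothetical model of the emitted code: the zero is materialized
 * separately on the empty-range path and on the loop path. */
int calc_zero_on_both_paths(int bottom, int top)
{
    if (bottom > top) {          /* cmp edi, esi ; jg .L5 */
        return 0;                /* .L5: xor eax, eax ; ret */
    }
    int end = top + 1;           /* add esi, 1 (assumes top < INT_MAX) */
    int sum = 0;                 /* xor eax, eax */
    do {                         /* .L4 */
        if ((bottom & 1) == 0) { /* test dil, 1 */
            sum += bottom;       /* lea edx, [rax+rdi] ; cmove eax, edx */
        }
        bottom++;                /* add edi, 1 */
    } while (bottom != end);     /* cmp esi, edi ; jne .L4 */
    return sum;                  /* ret */
}

/* Hypothetical model of the suggested code: one zeroing before the
 * comparison, and a single shared return. */
int calc_zero_hoisted(int bottom, int top)
{
    int sum = 0;                 /* xor eax, eax */
    if (bottom <= top) {         /* cmp edi, esi ; jg .L5 */
        int end = top + 1;       /* add esi, 1 (assumes top < INT_MAX) */
        do {                     /* .L4 */
            if ((bottom & 1) == 0) {
                sum += bottom;
            }
            bottom++;
        } while (bottom != end);
    }
    return sum;                  /* .L5: ret */
}

/* Checks that both models agree with the reference on a small grid,
 * including empty ranges (bottom > top) and negative bounds. */
void check_equivalence(void)
{
    for (int b = -8; b <= 8; b++) {
        for (int t = -8; t <= 8; t++) {
            int expected = calc_simple(b, t);
            assert(calc_zero_on_both_paths(b, t) == expected);
            assert(calc_zero_hoisted(b, t) == expected);
        }
    }
}
```

Calling `check_equivalence()` exercises both shapes against the original function. One detail worth keeping in mind when reading the suggested listing: `xor eax, eax` modifies the flags, so the hoisted copy has to sit before the `cmp`, not between the `cmp` and the `jg`, which is where the suggested code places it.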