https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119417
Bug ID: 119417
Summary: -fexpensive-optimizations forces GCC 14 to use uxtw
instead of uxth causing different result on ADD
instruction
Product: gcc
Version: 14.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rogerio.souza at gmail dot com
Target Milestone: ---
Greetings,
Our team detected an unexpected behavior with GCC 14 on aarch64 that only
occurs when the option “-fexpensive-optimizations” is enabled.
The line “buggy = buggy + ((global & 0xFFFF) * 2);” from the code below
compiles to different instructions depending on whether
“-fexpensive-optimizations” is used.
# With -fexpensive-optimizations uses uxtw
add x19, x19, w0, uxtw 1
# Without -fexpensive-optimizations uses uxth
add x19, x19, w0, uxth 1
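To make the difference between the two forms concrete, here is a minimal C++
model of the extended-register operand. It is only a sketch: the helper names
add_uxth and add_uxtw are hypothetical, and it assumes that w0 holds the value
of "global" at that point, which the two-line excerpt above does not show.

```cpp
#include <cstdint>

// Hypothetical helpers modelling the second operand of
//   add xD, xN, wM, <extend> #shift
// They are illustrations only, not part of any library.

// uxth: keep the low 16 bits of wM, zero-extend to 64 bits, then shift left.
constexpr std::uint64_t add_uxth(std::uint64_t xn, std::uint32_t wm, unsigned shift)
{
    return xn + (static_cast<std::uint64_t>(wm & 0xFFFFu) << shift);
}

// uxtw: keep all 32 bits of wM, zero-extend to 64 bits, then shift left.
constexpr std::uint64_t add_uxtw(std::uint64_t xn, std::uint32_t wm, unsigned shift)
{
    return xn + (static_cast<std::uint64_t>(wm) << shift);
}

// Assumption: x19 == 0 and w0 == 0x10000 (the initial value of "global").
static_assert(add_uxth(0, 0x10000u, 1) == 0, "uxth drops bit 16");
static_assert(add_uxtw(0, 0x10000u, 1) == 131072, "uxtw keeps bit 16");
```

Under that assumption, the uxth form folds the "& 0xFFFF" mask into the
extension, while the uxtw form behaves as if the mask were not applied.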
The reduced testcase is available at https://godbolt.org/z/vM9Y4dMnW.
To reproduce, use the command "gcc -O1 -fexpensive-optimizations":
********************************************************
#include <iostream>

__attribute__((noinline)) void print(size_t toPrint)
{
    std::cout << toPrint << std::endl;
}

unsigned global = 0x10000;

int main()
{
    size_t buggy = 0;
    while (true)
    {
        buggy = buggy + ((global & 0xFFFF) * 2);
        print(buggy);
        if (global)
            break;
    }
}
********************************************************
This program prints 0 when NOT using “-fexpensive-optimizations” and prints
131072 when using this flag, or at -O2 or higher optimization levels.
We also tested older GCC versions and LLVM; only GCC 14 on Arm prints 131072.
Knowing that, in this form of ADD, UXTH zero-extends the low 16 bits of the
source register and UXTW zero-extends the low 32 bits, we found that forcing a
uint16_t cast fixes this behavior, as in the example below:
********************************************************
buggy = buggy + (static_cast<uint16_t>(global & 0xFFFF) * 2);
print(buggy);
********************************************************
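As a sanity check on what we believe the C++ rules require, the sketch below
evaluates both forms of the expression at compile time. The helper names
original_step and workaround_step are hypothetical and exist only for this
illustration; they isolate the right-hand side of the assignment.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: the expression as written in the testcase.
// "g & 0xFFFF" is unsigned, so the multiply is done in unsigned arithmetic.
constexpr std::size_t original_step(unsigned g)
{
    return (g & 0xFFFF) * 2;
}

// Hypothetical helper: the expression with the uint16_t cast.
// The uint16_t value is promoted to int before the multiply.
constexpr std::size_t workaround_step(unsigned g)
{
    return static_cast<std::uint16_t>(g & 0xFFFF) * 2;
}

// For the initial value of "global", the mask clears bit 16 in both forms.
static_assert(original_step(0x10000u) == 0, "masked value is 0");
static_assert(workaround_step(0x10000u) == 0, "masked value is 0");

// The two forms also agree when the low 16 bits are non-zero.
static_assert(original_step(0x1ABCDu) == workaround_step(0x1ABCDu),
              "cast does not change the value");
```

If this reading is right, both forms should add 0 to "buggy" for the value
used in the testcase, so the cast should not be needed to get 0.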
Is this GCC 14 behavior a bug, or is it expected, in which case the other
versions and LLVM should be fixed?
Regards,
Rogerio