https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119417

            Bug ID: 119417
           Summary: -fexpensive-optimizations forces GCC 14 to use uxtw
                    instead of uxth causing different result on ADD
                    instruction
           Product: gcc
           Version: 14.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rogerio.souza at gmail dot com
  Target Milestone: ---

Greetings,

While using GCC 14 on aarch64, our team found a behavior that only occurs when
the “-fexpensive-optimizations” flag is enabled.

We noted that the line “buggy = buggy + ((global & 0xFFFF) * 2);” from the code
below compiles to different instructions depending on whether
“-fexpensive-optimizations” is used.

# With -fexpensive-optimizations uses uxtw
       add     x19, x19, w0, uxtw 1

# Without -fexpensive-optimizations uses uxth
       add     x19, x19, w0, uxth 1

The reduced testcase is available at https://godbolt.org/z/vM9Y4dMnW.

To reproduce, compile with the command "gcc -O1 -fexpensive-optimizations":

********************************************************
#include <iostream>

 __attribute__((noinline)) void print(size_t toPrint)
{
    std::cout << toPrint << std::endl;
}

unsigned global = 0x10000;
int main()
{
  size_t buggy = 0;
  while (true)
  {
    buggy = buggy + ((global & 0xFFFF) * 2);
    print(buggy);
    if (global)
      break;
  }
}

********************************************************

This program prints 0 when built WITHOUT “-fexpensive-optimizations” and
prints 131072 when built with this flag, or at -O2 or a higher optimization
level.

We tested it with older GCC versions and also with LLVM; only GCC 14 on Arm
prints 131072.

Knowing that, in this form of ADD, UXTH zero-extends the low 16 bits of the
source register while UXTW zero-extends its low 32 bits, forcing a uint16_t
cast works around this behavior, as in the example below:

********************************************************
    buggy = buggy + (static_cast<uint16_t>(global & 0xFFFF) * 2);
    print(buggy);
********************************************************
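
To make the difference between the two extend modes concrete, here is a
minimal sketch that models both ADD forms in portable C++. The helper names
add_uxth1 and add_uxtw1 are hypothetical and exist only for this illustration.
The sketch also assumes that w0 holds the unmasked value of "global" when the
ADD executes, which the two instruction excerpts above do not show.

```cpp
#include <cstdint>

// Hypothetical helper modelling "add xd, xn, wm, uxth 1":
// keep the low 16 bits of wm, zero-extend, shift left by 1, then add.
constexpr std::uint64_t add_uxth1(std::uint64_t xn, std::uint32_t wm) {
    return xn + (static_cast<std::uint64_t>(wm & 0xFFFFu) << 1);
}

// Hypothetical helper modelling "add xd, xn, wm, uxtw 1":
// keep all 32 bits of wm, zero-extend, shift left by 1, then add.
constexpr std::uint64_t add_uxtw1(std::uint64_t xn, std::uint32_t wm) {
    return xn + (static_cast<std::uint64_t>(wm) << 1);
}

// Assumption: the register operand carries the unmasked global (0x10000).
// UXTH drops bit 16, so the sum stays 0, as the C++ source requires.
static_assert(add_uxth1(0, 0x10000u) == 0, "uxth masks to 16 bits");
// UXTW keeps bit 16, so the sum becomes 0x20000 (131072).
static_assert(add_uxtw1(0, 0x10000u) == 131072, "uxtw keeps bit 16");
```

Under that assumption, the UXTW form is only equivalent to the UXTH form when
the upper 16 bits of the register are already zero, which would match the
131072 we observe.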

Is this GCC 14 behavior a bug, or is it expected, in which case the other
versions and LLVM should be fixed?

Regards,
Rogerio
