https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122871

            Bug ID: 122871
           Summary: [13/14/15/16 Regression] de-optimized synthesis of
                    long long shift and add
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rearnsha at gcc dot gnu.org
  Target Milestone: ---
            Target: arm

long long ashll_fn (long long a)
{
  long long c;

  c = a << 33;
  c += a;
  return c;
}

On a 32-bit machine, this should optimize to, e.g. on Arm:

  add r1, r1, r0, lsl #1
  bx  lr
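
A minimal C sketch of why two instructions suffice (assuming 32-bit register
halves; the function name is illustrative):

long long ashll_expected (long long a)
{
  unsigned int lo = (unsigned int) a;
  unsigned int hi = (unsigned int) ((unsigned long long) a >> 32);

  /* a << 33 has a zero low word, so adding a back cannot carry out of
     the low half: lo(c) = lo(a), hi(c) = hi(a) + (lo(a) << 1).  */
  unsigned int c_lo = lo;               /* r0 left untouched       */
  unsigned int c_hi = hi + (lo << 1);   /* add r1, r1, r0, lsl #1  */

  return (long long) (((unsigned long long) c_hi << 32) | c_lo);
}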

But instead we get (at -O2):

        lsl     ip, r0, #11
        lsl     r2, r1, #11
        subs    ip, ip, r0
        orr     r2, r2, r0, lsr #21
        sbc     r2, r2, r1
        lsl     r3, ip, #11
        lsl     r2, r2, #11
        adds    r3, r3, r0
        orr     r2, r2, ip, lsr #21
        adc     r1, r1, r2
        lsl     r2, r1, #11
        lsl     r0, r3, #11
        adds    r0, r3, r0
        orr     r2, r2, r3, lsr #21
        adc     r1, r1, r2
        bx      lr

This is much worse than what GCC 5 used to generate:

        mov     r2, #0
        mov     r3, r0, asl #1
        adds    r0, r0, r2
        adc     r1, r1, r3
        bx      lr

The problem seems to stem from the gimple optimizers 'simplifying' the code to

  return a * (2^33 + 1);

but the expand pass then fails to synthesise this with shifts and adds, as it
would do for a 32-bit multiply.
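
For illustration, a minimal C sketch of the two equivalent forms (the function
names are hypothetical); since 2^33 + 1 has only two set bits, the multiply
form ought to be rewritten back into the shift-and-add form at expand time:

long long mul_form (long long a)
{
  return a * ((1LL << 33) + 1);   /* what the gimple passes leave behind */
}

long long shift_add_form (long long a)
{
  return (a << 33) + a;           /* what expand should synthesise */
}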
