https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116467

            Bug ID: 116467
           Summary: missed optimization: zero-extension duplicated on
                    xtensa
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsaxvc at gmail dot com
  Target Milestone: ---

With GCC 12.2.0 and -O2 -Wall -Wextra, the following code:

    #include <stdint.h>

    __attribute__ ((noinline)) uint32_t callee(uint32_t x, uint16_t y){
        return x + y;
    }

    __attribute__ ((noinline)) uint32_t caller(uint32_t x, uint32_t y){
        return callee(x, y);
    }

compiles to these xtensa instructions:

    callee:
            entry   sp, 32
            extui   a3, a3, 0, 16
            add.n   a2, a3, a2
            retw.n
    caller:
            entry   sp, 32
            extui   a11, a3, 0, 16
            mov.n   a10, a2
            call8   callee
            mov.n   a2, a10
            retw.n

I was surprised to find that the zero-extension (extui rDest, rSource, 0, 16)
occurs twice, once in each function. On other targets such as ARM32, it looks
like a uint16_t passed in a register is assumed to arrive already zero-extended
by the caller, so the callee does not need to repeat it. ARM32, GCC 12.2, same
flags:

    callee:
            add     r0, r0, r1
            bx      lr
    caller:
            uxth    r1, r1 //similar to extui, .., .., 0, 16
            b       callee

Could xtensa do the same?
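
To make the question concrete, here is a rough C model of the two conventions.
It is only a sketch and says nothing about GCC internals or what the xtensa ABI
actually requires: zext16, callee_trusting, callee_defensive and caller_model
are hypothetical names made up for illustration, and the uint32_t parameters
stand in for raw 32-bit argument registers.

```c
#include <stdint.h>

/* Hypothetical helper: models "extui rDest, rSource, 0, 16" (or ARM uxth). */
static uint32_t zext16(uint32_t reg)
{
    return reg & 0xFFFFu;
}

/* Models the ARM32 output above: the callee assumes the register holding
 * its uint16_t argument was already zero-extended by the caller. */
static uint32_t callee_trusting(uint32_t x, uint32_t y_reg)
{
    return x + y_reg;
}

/* Models the xtensa output above: the callee zero-extends again. */
static uint32_t callee_defensive(uint32_t x, uint32_t y_reg)
{
    return x + zext16(y_reg);
}

/* Models caller(): it zero-extends y before the call in both outputs. */
static uint32_t caller_model(uint32_t x, uint32_t y,
                             uint32_t (*callee)(uint32_t, uint32_t))
{
    return callee(x, zext16(y));
}
```

Since zext16 is idempotent, both callees return the same value whenever the
caller has already extended the argument. For example,
caller_model(1, 0x12345678, callee_trusting) and
caller_model(1, 0x12345678, callee_defensive) both yield 0x5679, so the second
extui is redundant as long as every caller is guaranteed to extend.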
