https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625

            Bug ID: 113625
           Summary: Interesting behavior with and without -mcpu=generic
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Take:
```
void f(unsigned short * __restrict a, long long * __restrict b)
{
        int t = 0;
        t += a[0];
        t += a[1];
        t += a[2];
        t += a[3];
        b[0] = t;
}
```
Compiling this with `-O3 -fno-vect-cost-model` (no extra configure options), GCC
produces:
```
f:
        ldr     d31, [x0]
        sub     sp, sp, #16
        uaddlv  s31, v31.4h
        str     s31, [sp, 12]
        ldrsw   x0, [sp, 12]
        str     x0, [x1]
        add     sp, sp, 16
```

Adding `-mcpu=generic` (or `-mtune=generic`), GCC instead produces:
```
f:
        ldr     d31, [x0]
        uaddlv  s31, v31.4h
        fmov    w0, s31
        sxtw    x0, w0
        str     x0, [x1]
        ret
```

I would have expected -mtune=generic to give the same code as omitting it, but
it does not.

Maybe I don't understand what the default tuning settings are now versus what
-mtune=generic is supposed to do, but it is not obvious from the documentation
either.  So this is at minimum a documentation issue.
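
For anyone comparing the two code sequences, a minimal sketch of a correctness
check may help: whichever tuning is selected, the value stored to `b[0]` has to
be the same. The helper `f_reference` below is hypothetical (it is not part of
the testcase), and the sketch assumes a 32-bit `int` and a 16-bit
`unsigned short`, as on aarch64.
```
#include <assert.h>

/* The testcase from this report, unchanged. */
void f(unsigned short * __restrict a, long long * __restrict b)
{
        int t = 0;
        t += a[0];
        t += a[1];
        t += a[2];
        t += a[3];
        b[0] = t;
}

/* Hypothetical reference (not from the report): the same sum computed
   directly in long long, with no intermediate int.  */
long long f_reference(const unsigned short *a)
{
        return (long long) a[0] + a[1] + a[2] + a[3];
}

/* Returns 1 when f and f_reference agree on the given four elements.  */
int f_matches_reference(unsigned short *a)
{
        long long b = -1;
        f(a, &b);
        return b == f_reference(a);
}
```
Under those assumptions the sum is at most 4 * 65535 = 262140, so `t` is never
negative and the store-and-reload sequence (`str`/`ldrsw`) and the register
move sequence (`fmov`/`sxtw`) should yield identical results; only the cost
differs.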
