https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625

Bug ID: 113625
Summary: Interesting behavior with and without -mcpu=generic
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64

Take:
```
void f(unsigned short * __restrict a, long long * __restrict b)
{
  int t = 0;
  t += a[0];
  t += a[1];
  t += a[2];
  t += a[3];
  b[0] = t;
}
```

Compiled with `-O3 -fno-vect-cost-model` (no extra configure options), GCC produces:
```
f:
        ldr     d31, [x0]
        sub     sp, sp, #16
        uaddlv  s31, v31.4h
        str     s31, [sp, 12]
        ldrsw   x0, [sp, 12]
        str     x0, [x1]
        add     sp, sp, 16
```

Adding -mcpu=generic (or -mtune=generic), GCC instead produces:
```
f:
        ldr     d31, [x0]
        uaddlv  s31, v31.4h
        fmov    w0, s31
        sxtw    x0, w0
        str     x0, [x1]
        ret
```

I would have expected -mtune=generic to give the same code as omitting it, but it does not: the default output moves the reduction result from the SIMD register to a general register through a stack slot, while the -mtune=generic output uses a direct fmov.

Maybe I don't understand what the default tuning settings are now versus what -mtune=generic is supposed to do, but that is not obvious from the documentation either. So this is at minimum a documentation issue.
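
Both code sequences above should compute the same value; only the way the reduction result travels from the SIMD register to a general register differs. A minimal sketch for checking that when comparing builds follows. The helpers `sum_via_f`, `sum_reference`, and `check_f` are hypothetical names introduced here for illustration and are not part of the report. Note that the sum of four `unsigned short` values is at most 4 * 65535 = 262140, so `t` never goes negative and the sign extension (`ldrsw` / `sxtw`) never changes the value.

```c
#include <assert.h>

/* Function under test, copied from the report. */
void f(unsigned short *__restrict a, long long *__restrict b)
{
    int t = 0;
    t += a[0];
    t += a[1];
    t += a[2];
    t += a[3];
    b[0] = t;
}

/* Hypothetical helper (not in the report): run f on four values
   and return what it stored through b. */
long long sum_via_f(unsigned short a0, unsigned short a1,
                    unsigned short a2, unsigned short a3)
{
    unsigned short a[4] = { a0, a1, a2, a3 };
    long long out = -1;   /* sentinel, overwritten by f */
    f(a, &out);
    return out;
}

/* Hypothetical scalar reference: widen to 64 bits first, then add. */
long long sum_reference(unsigned short a0, unsigned short a1,
                        unsigned short a2, unsigned short a3)
{
    return (long long)a0 + a1 + a2 + a3;
}

/* Returns 1 when f agrees with the scalar reference, 0 otherwise. */
int check_f(unsigned short a0, unsigned short a1,
            unsigned short a2, unsigned short a3)
{
    return sum_via_f(a0, a1, a2, a3) == sum_reference(a0, a1, a2, a3);
}
```

One possible use is to build this once with `-O3 -fno-vect-cost-model` and once with `-mtune=generic` added, then call `check_f` on boundary inputs such as four copies of 65535 in each build. The results are expected to match, which would confirm that the difference is purely one of code quality.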