https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116475
Bug ID: 116475
Summary: autovect: may be optimized for min/max
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: syq at gcc dot gnu.org
Target Milestone: ---

If we need to get the minimum of 8 floats in an array, we may have code like
this:

float min(float *x) {
        float ret = x[0];
        for (int i=0; i<8; i++) {       // from 0 in this line
                ret = ret<x[i] ? ret : x[i];
        }
        return ret;
}

If we compile it with

        aarch64-linux-gnu-gcc -O3 -ffast-math -S xx.c

we get

        ldp     q0, q1, [x0]
        ld1r    {v31.4s}, [x0]                  # <-- not needed
        fminnm  v31.4s, v1.4s, v31.4s           # <-- not needed
        fminnm  v0.4s, v31.4s, v0.4s
        fminnmv s0, v0.4s
        ret

The ld1r broadcasts x[0] into every lane and folds it into the reduction, but
x[0] is already part of the vector loaded by ldp, so both marked instructions
are redundant.

We could also start the loop at 1:

float min(float *x) {
        float ret = x[0];
        for (int i=1; i<8; i++) {       // from 1 in this line
                ret = ret<x[i] ? ret : x[i];
        }
        return ret;
}

The generated code is even worse:

        ldr     q31, [x0, 4]
        ld1r    {v30.4s}, [x0]
        ldp     s0, s29, [x0, 20]
        fminnm  v31.4s, v31.4s, v30.4s
        ldr     s30, [x0, 28]
        fminnm  s0, s0, s29
        fminnmv s31, v31.4s
        fminnm  s31, s30, s31
        fminnm  s0, s0, s31
        ret
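
For reference, here is a small portable C sketch (not part of the original report) of the reduction shape that the "not needed" annotations seem to point at: one lane-wise min of the two 4-float halves, followed by one horizontal min. The names min_from0, min_from1, min_two_halves, and check_reductions_agree are hypothetical, and the sketch assumes the input contains no NaNs, which -ffast-math already lets the compiler assume. It only illustrates that both loops from the report compute the same value as the two-step reduction; it does not claim this is what GCC should emit in every case.

```c
#include <assert.h>

/* Loop from the report, starting at i = 0 (x[0] is compared with itself). */
float min_from0(const float *x) {
    float ret = x[0];
    for (int i = 0; i < 8; i++)
        ret = ret < x[i] ? ret : x[i];
    return ret;
}

/* Loop from the report, starting at i = 1. */
float min_from1(const float *x) {
    float ret = x[0];
    for (int i = 1; i < 8; i++)
        ret = ret < x[i] ? ret : x[i];
    return ret;
}

/* Hypothetical reference shape: a lane-wise min of x[0..3] and x[4..7]
 * (one vector fminnm), then a horizontal min over the 4 lanes
 * (one fminnmv). No broadcast of x[0] is involved. */
float min_two_halves(const float *x) {
    float lane[4];
    for (int i = 0; i < 4; i++)
        lane[i] = x[i] < x[i + 4] ? x[i] : x[i + 4];

    float ret = lane[0];
    for (int i = 1; i < 4; i++)
        ret = ret < lane[i] ? ret : lane[i];
    return ret;
}

/* Asserts that all three formulations agree on a NaN-free input. */
void check_reductions_agree(const float *x) {
    float expected = min_two_halves(x);
    assert(min_from0(x) == expected);
    assert(min_from1(x) == expected);
}
```

Calling check_reductions_agree on any 8-element, NaN-free array should pass, whether the minimum sits at x[0] or elsewhere. That is why the extra x[0] broadcast in the first listing, and the scalar tail in the second, do not contribute to the result.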