https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116475

            Bug ID: 116475
           Summary: autovect: may be optimized for min/max
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: syq at gcc dot gnu.org
  Target Milestone: ---

Suppose we need the minimum of 8 floats in an array.
We may write code like this:

float min(float *x) {
        float ret = x[0];
        for (int i=0; i<8; i++) {       // loop starts from 0
                ret = ret<x[i] ? ret : x[i];
        }
        return ret;
}

If we compile it with
   aarch64-linux-gnu-gcc -O3 -ffast-math -S xx.c
we get
        ldp     q0, q1, [x0]
        ld1r    {v31.4s}, [x0]      # <-- not needed
        fminnm  v31.4s, v1.4s, v31.4s  # <-- not needed
        fminnm  v0.4s, v31.4s, v0.4s
        fminnmv s0, v0.4s
        ret
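
The two instructions marked above only fold a splat of x[0] into the
reduction, and x[0] is already lane 0 of the first loaded vector. As a rough
illustration (a plain C sketch, not what the vectorizer emits), the reduction
could have the shape below. The names min8_scalar and min8_two_step are
hypothetical and are not part of the original test case. The two functions
are assumed to agree only when no NaNs are present, which is what -ffast-math
permits the compiler to assume.

```c
/* Scalar reference: the same loop as in the report, starting from 0. */
float min8_scalar(const float *x) {
        float ret = x[0];
        for (int i = 0; i < 8; i++)
                ret = ret < x[i] ? ret : x[i];
        return ret;
}

/* Hypothetical sketch of the reduction without the extra x[0] splat. */
float min8_two_step(const float *x) {
        float lane[4];
        /* Element-wise min of x[0..3] and x[4..7] (one vector fminnm). */
        for (int i = 0; i < 4; i++)
                lane[i] = x[i] < x[i + 4] ? x[i] : x[i + 4];
        /* Horizontal min across the four lanes (one fminnmv). */
        float ret = lane[0];
        for (int i = 1; i < 4; i++)
                ret = ret < lane[i] ? ret : lane[i];
        return ret;
}
```

This corresponds to a sequence of one ldp, one vector fminnm and one fminnmv,
which is presumably the code the report is asking for.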




Alternatively, since ret is already initialized to x[0], the loop can start from 1:
float min(float *x) {
        float ret = x[0];
        for (int i=1; i<8; i++) {             // loop starts from 1
                ret = ret<x[i] ? ret : x[i];
        }
        return ret;
}


The generated code is then even worse:
        ldr     q31, [x0, 4]
        ld1r    {v30.4s}, [x0]
        ldp     s0, s29, [x0, 20]
        fminnm  v31.4s, v31.4s, v30.4s
        ldr     s30, [x0, 28]
        fminnm  s0, s0, s29
        fminnmv s31, v31.4s
        fminnm  s31, s30, s31
        fminnm  s0, s0, s31
        ret
