https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120352

            Bug ID: 120352
           Summary: scalar epilogue not needed for early break when exit
                    block is invariant
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tnfchris at gcc dot gnu.org
            Blocks: 53947, 115130
  Target Milestone: ---

The following sequence

#define N 4
int a[N] = {0,0,0,1};
int b[N] = {0,0,0,1};

__attribute__((noipa, noinline))
int foo ()
{
  for (int i = 0; i < N; i++)
    {
      if (a[i] > b[i])
        return 1;
    }
  return 0;
}

on AArch64 compiles to:

        ldr     q31, [x2, #:lo12:.LANCHOR0]
        ldr     q30, [x1, 16]
        cmgt    v30.4s, v31.4s, v30.4s
        umaxp   v30.4s, v30.4s, v30.4s
        fmov    x3, d30
        cbnz    x3, .L8
        ret
.L8:
        ldr     w0, [x2, #:lo12:.LANCHOR0]
        ldr     w3, [x1, 16]
        cmp     w3, w0
        blt     .L6
        ldr     w0, [x1, 4]
        ldr     w2, [x1, 20]
        cmp     w2, w0
        blt     .L6
        ldr     w0, [x1, 8]
        ldr     w2, [x1, 24]
        cmp     w2, w0
        blt     .L6
        ldr     w0, [x1, 28]
        ldr     w2, [x1, 12]
        cmp     w2, w0
        cset    w0, gt
        ret
.L6:
        mov     w0, 1
        ret

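However, since the early-exit block contains only loop-invariant instructions
and the remainder of the loop body is empty, we don't need the scalar epilogue:
the IV value at the point of the break is irrelevant, so the vector compare
result alone is enough to decide the return value.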
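
As a rough illustration of the intended result, the function could be written
by hand as below. This is only a sketch of the semantics, not what the
vectorizer emits; the name foo_no_epilogue is made up for this example. The
loop reduces the per-element compare into a single flag (which maps onto the
cmgt/umaxp sequence above), and the exit simply returns the invariant value
without re-running the iterations in scalar code.

```c
#define N 4
int a[N] = {0,0,0,1};
int b[N] = {0,0,0,1};

/* Hypothetical hand-written equivalent of foo () without a scalar epilogue.
   The exit block only returns a constant, so it is enough to know whether
   any element matched, not which one.  */
int foo_no_epilogue (void)
{
  int any = 0;

  /* Branch-free compare plus OR reduction over all N elements.  */
  for (int i = 0; i < N; i++)
    any |= (a[i] > b[i]);

  /* Invariant exit: no need to recover the IV of the first match.  */
  return any ? 1 : 0;
}
```

This is only valid because nothing in the exit path depends on i; if the
function returned the index of the first match, the position would still have
to be recovered.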

This sequence is somewhat common, as it is the shape of a `contains` check.
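
For reference, a minimal sketch of the kind of `contains` helper meant here.
The function name and signature are assumptions for illustration and are not
taken from any particular codebase. As in foo () above, the early exit returns
a loop-invariant value and nothing follows the check in the loop body.

```c
#include <stddef.h>

/* Hypothetical contains check: returns 1 if needle occurs in the first n
   elements of haystack, 0 otherwise.  The early exit does not use i.  */
int contains (const int *haystack, size_t n, int needle)
{
  for (size_t i = 0; i < n; i++)
    {
      if (haystack[i] == needle)
        return 1;
    }
  return 0;
}
```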


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
[Bug 115130] [meta-bug] early break vectorization
