[Bug tree-optimization/100849] New: Poor placement of vector IVs

rsandifo at gcc dot gnu.org via Gcc-bugs Tue, 01 Jun 2021 01:11:58 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100849


            Bug ID: 100849
           Summary: Poor placement of vector IVs
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rsandifo at gcc dot gnu.org
  Target Milestone: ---

Vector IV increments are usually placed at the beginning of a loop body.
This means that both the old and new IV values are live at the same time,
forcing an extra vector register move in the loop.

E.g.:

int x[100], y[100];

void f1 (void)
{
  for (int i = 0; i < 100; ++i)
    x[i] = (i & 11) == 2 ? y[i] : 1;
}

produces:

  <bb 3> [local count: 268435400]:
  # vect_vec_iv_.7_47 = PHI <_48(3), { 4, 5, 6, 7 }(2)>
  # ivtmp.21_21 = PHI <ivtmp.21_16(3), 0(2)>
  _48 = vect_vec_iv_.7_47 + { 4, 4, 4, 4 };
  vect__1.8_50 = vect_vec_iv_.7_47 & { 11, 11, 11, 11 };
  vect_iftmp.11_54 = MEM <vector(4) int> [(int *)&y + 16B + ivtmp.21_21 * 1];
  vect_iftmp.12_58 = .VCOND (vect__1.8_50, { 2, 2, 2, 2 }, vect_iftmp.11_54, { 1, 1, 1, 1 }, 113);
  MEM <vector(4) int> [(int *)&x + 16B + ivtmp.21_21 * 1] = vect_iftmp.12_58;
  ivtmp.21_16 = ivtmp.21_21 + 16;
  if (ivtmp.21_16 != 384)
    goto <bb 3>; [96.00%]
  else
    goto <bb 4>; [4.00%]

It might be better to place the vector IV increment where the original
scalar increment was (or at the end of the loop body?)
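
As a rough illustration of the suggested placement, here is a
hand-vectorized sketch of f1 using GNU C vector extensions, with the
IV increment as the last statement that touches the IV.  This is not
what the vectorizer emits: the names v4si, f1_iv_last and
f1_iv_last_matches_scalar are made up for this example, the loop starts
at element 0 rather than at the offset seen in the dump, and the
compiler remains free to reschedule the statements.  It only shows that
once the increment follows the last use, the old IV value is dead and
can be updated in place.

```c
#include <string.h>

/* Hypothetical 4 x int vector type (GNU C vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

int x[100], y[100];

/* Scalar reference: the loop from the report.  */
void f1 (void)
{
  for (int i = 0; i < 100; ++i)
    x[i] = (i & 11) == 2 ? y[i] : 1;
}

/* Hand-vectorized variant; the IV increment is the last use of iv.  */
void f1_iv_last (void)
{
  v4si iv = { 0, 1, 2, 3 };
  const v4si step = { 4, 4, 4, 4 };
  const v4si mask = { 11, 11, 11, 11 };
  const v4si two = { 2, 2, 2, 2 };
  const v4si one = { 1, 1, 1, 1 };

  for (int i = 0; i < 100; i += 4)
    {
      v4si vy, vx;
      memcpy (&vy, &y[i], sizeof vy);
      /* Each lane of cmp is -1 where the condition holds, else 0.  */
      v4si cmp = (iv & mask) == two;
      /* Bitwise select: y[i] where cmp is set, 1 elsewhere.  */
      vx = (vy & cmp) | (one & ~cmp);
      memcpy (&x[i], &vx, sizeof vx);
      /* The old IV value has no further uses at this point.  */
      iv += step;
    }
}

/* Returns 1 if f1_iv_last stores the same x[] as the scalar f1.  */
int f1_iv_last_matches_scalar (void)
{
  int expected[100];
  for (int i = 0; i < 100; ++i)
    y[i] = 3 * i + 7;
  f1 ();
  memcpy (expected, x, sizeof x);
  memset (x, 0, sizeof x);
  f1_iv_last ();
  return memcmp (expected, x, sizeof x) == 0;
}
```

Calling f1_iv_last_matches_scalar () fills y with test data, runs both
versions and compares the resulting x arrays.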

The AArch64 Advanced SIMD code, with the extra mov at the top of the loop, is:

.L2:
        mov     v0.16b, v1.16b
        add     x2, x4, x0
        add     v1.4s, v1.4s, v6.4s
        add     x1, x3, x0
        add     x0, x0, 16
        ldr     q3, [x2, 16]
        and     v0.16b, v0.16b, v5.16b
        cmeq    v0.4s, v0.4s, v4.4s
        bsl     v0.16b, v3.16b, v2.16b
        str     q0, [x1, 16]
        cmp     x0, 384
        bne     .L2
