https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118558

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we have

  <bb 2> [local count: 29527901]:

  <bb 3> [local count: 118111600]:
  # g_1168_24 = PHI <g_1168_14(9), 3(2)>
  # ivtmp_13 = PHI <ivtmp_28(9), 4(2)>
  _1 = g_270[g_1168_24][0];
  g_1168_14 = g_1168_24 + -1;
  ivtmp_28 = ivtmp_13 - 1;
  if (ivtmp_28 != 0)
    goto <bb 9>; [75.00%]
  else
    goto <bb 6>; [25.00%]

  <bb 9> [local count: 88583699]:
  goto <bb 3>; [100.00%]

  <bb 6> [local count: 29527901]:
  # _8 = PHI <_1(3)>

where we have an extract-last reduction of a negative-step DR.

  <bb 2> [local count: 29527901]:

  <bb 3> [local count: 59055800]:
  # g_1168_24 = PHI <g_1168_14(9), 3(2)>
  # ivtmp_13 = PHI <ivtmp_28(9), 4(2)>
  # vectp_g_270.9_2 = PHI <vectp_g_270.9_3(9), &MEM <long unsigned int[5][2]> [(void *)&g_270 + 40B](2)>
  # ivtmp_29 = PHI <ivtmp_31(9), 0(2)>
  vect__1.11_4 = MEM <vector(2) long unsigned int> [(long unsigned int *)vectp_g_270.9_2];
  vect__1.12_6 = VEC_PERM_EXPR <vect__1.11_4, vect__1.11_4, { 1, 0 }>;
  vectp_g_270.9_16 = vectp_g_270.9_2 + 18446744073709551600;
  vect__1.13_7 = MEM <vector(2) long unsigned int> [(long unsigned int *)vectp_g_270.9_16];
  vect__1.14_21 = VEC_PERM_EXPR <vect__1.13_7, vect__1.13_7, { 1, 0 }>;
  vect__1.15_27 = VEC_PERM_EXPR <vect__1.12_6, vect__1.14_21, { 0, 2 }>;
  _1 = g_270[g_1168_24][0];
  g_1168_14 = g_1168_24 + -1;
  ivtmp_28 = ivtmp_13 - 1;
  vectp_g_270.9_3 = vectp_g_270.9_16 + 18446744073709551600;
  ivtmp_31 = ivtmp_29 + 1;
  if (ivtmp_31 < 2)
    goto <bb 9>; [50.00%]
  else
    goto <bb 6>; [50.00%]

  <bb 9> [local count: 29527899]:
  goto <bb 3>; [100.00%]

  <bb 6> [local count: 29527901]:
  # vect__1.15_25 = PHI <vect__1.15_27(3)>
  _22 = BIT_FIELD_REF <vect__1.15_25, 64, 64>;

that does not look completely broken, but the initial value of vectp_g_270.9_2
looks suspicious: it's &g_270[2][1], so the two loads get us
{ g_270[2][1], g_270[3][0] } and { g_270[1][1], g_270[2][0] }, which we then
reverse and strip the gap lanes from to get { g_270[3][0], g_270[2][0] } in the
first iteration and { g_270[1][0], g_270[0][0] } in the second, from which we
should then appropriately extract the last value.

Now - we eventually fold this up to

  vect__1.13_7 = MEM <vector(2) long unsigned int> [(long unsigned int *)&g_270 + -8B];
  vect__1.14_21 = VEC_PERM_EXPR <vect__1.13_7, vect__1.13_7, { 1, 0 }>;
  vect__1.15_27 = BIT_INSERT_EXPR <vect__1.13_7, 0, 0 (64 bits)>;
  _22 = BIT_FIELD_REF <vect__1.15_27, 64, 64>;

where we can also see that we access memory before g_270, which might trap.
The ability to handle grouped accesses with negative stride is new and
unique to SLP, IIRC (indeed GCC 14 doesn't support this).

In the end FRE5 then bails on the UB:

Value numbering stmt = vect__1.13_7 = MEM <vector(2) long unsigned int> [(long unsigned int *)&g_270 + -8B];
Setting value number of vect__1.13_7 to { 0, 0 } (changed)

likely interpreting the negative offset as a very large positive one.

So what we're missing is that we need to apply peeling for gaps here, but
the most conservative fix might be to disallow any gaps with a negative step.

I'll try to fix up the missing args to dr_misalignment.
