https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114435

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Looking again, the reason for the "bad" vectorization with pcom applied is

t.c:23:23: missed:   Build SLP failed: operation unsupported _51 = r__r0_lsm0.7_7;

that is, pcom leaves SSA name copies around, which we do not handle.  We
could probably ignore those somehow during SLP build (so far we have mostly
just fixed whichever pass leaves them around).  Maybe it's time to
do this.  Note we do not want a plain copy in the SLP tree; instead we
should look through copies when looking for the def of the operands of the
PHI.  Note it would still be better to avoid the SSA copy generated by predcom.
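
To illustrate what "looking through copies" means here, the following is a minimal toy sketch in C. It is not GCC code and does not use GCC's gimple or SSA APIs; the types `toy_def` and `toy_def_kind` and the function `toy_skip_copies` are hypothetical names made up for this example. Names are modelled as small integer ids, and a definition is either a plain copy of another name or "something else" (for example a PHI or an arithmetic statement).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an SSA definition.  NOT GCC's representation.  */
enum toy_def_kind { TOY_COPY, TOY_OTHER };

struct toy_def {
  enum toy_def_kind kind;
  int rhs;  /* For TOY_COPY: id of the copied-from name.  Unused otherwise.  */
};

/* Starting at NAME, follow plain copies (x = y) until a definition that is
   not a copy is found, and return the id of that name.  The step bound is
   purely defensive so that a malformed table cannot loop forever.  */
int toy_skip_copies(const struct toy_def *defs, size_t ndefs, int name)
{
  for (size_t steps = 0; steps < ndefs; steps++) {
    assert(name >= 0 && (size_t) name < ndefs);
    if (defs[name].kind != TOY_COPY)
      break;
    name = defs[name].rhs;
  }
  return name;
}
```

In terms of the message above, `_51` would play the role of a `TOY_COPY` entry whose `rhs` is `r__r0_lsm0.7_7`, so a lookup starting at `_51` would end at the definition of `r__r0_lsm0.7_7` and no copy node would appear in the SLP tree.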

Sneaking in a copy_prop pass after pcom, just as a check, vectorizes
the thing just fine, including the added recurrence:

  <bb 4> [local count: 70429947]:
  _12 = {k_25(D), k_25(D)};
  vect__8.10_28 = MEM <vector(2) double> [(double *)r_26(D)];
  vect__8.11_5 = MEM <vector(2) double> [(double *)r_26(D) + 16B];
  ivtmp.24_58 = (unsigned long) in_27(D);
  _65 = (unsigned long) sz_24(D);
  _66 = _65 * 32;
  _68 = ivtmp.24_58 + _66;

  <bb 5> [local count: 640272252]:
  # vect_r__r0_lsm0.17_15 = PHI <vect__8.10_28(4), vect__45.18_16(5)>
  # vect_r__r0_lsm0.17_30 = PHI <vect__8.11_5(4), vect__45.18_17(5)>
  # ivtmp.24_51 = PHI <ivtmp.24_58(4), ivtmp.24_54(5)>
  _20 = (void *) ivtmp.24_51;
  vect__47.14_9 = MEM <vector(2) double> [(double *)_20];
  vect__47.15_11 = MEM <vector(2) double> [(double *)_20 + 16B];
  vect__46.16_13 = vect__47.14_9 * _12;
  vect__46.16_14 = vect__47.15_11 * _12;
  vect__45.18_16 = vect__46.16_13 + vect_r__r0_lsm0.17_15;
  vect__45.18_17 = vect__46.16_14 + vect_r__r0_lsm0.17_30;
  MEM <vector(2) double> [(double *)r_26(D)] = vect__45.18_16;
  MEM <vector(2) double> [(double *)r_26(D) + 16B] = vect__45.18_17;
  ivtmp.24_54 = ivtmp.24_51 + 32;
  if (ivtmp.24_54 != _68)
    goto <bb 5>; [89.00%]
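
For orientation, a loop of roughly the following shape would be consistent with the dump above: four doubles are read from `in` per iteration (the IV advances by 32 bytes), each is multiplied by `k`, and the products are accumulated into four doubles behind `r`. This is only a guess reconstructed from the dump, not the actual t.c from the PR. The struct `acc`, its field names (suggested by the temporary name `r__r0_lsm0`), the function name `accumulate`, and the parameter types are all assumptions.

```c
/* Hypothetical reconstruction from the GIMPLE dump; the real testcase
   may differ in types, names and layout.  */
struct acc { double r0, r1, r2, r3; };

void accumulate(struct acc *r, const double *in, double k, int sz)
{
  for (int i = 0; i < sz; i++) {
    /* Each accumulator is loaded, updated and stored every iteration;
       pcom turns the reload into a recurrence carried by a PHI.  */
    r->r0 += in[4 * i + 0] * k;
    r->r1 += in[4 * i + 1] * k;
    r->r2 += in[4 * i + 2] * k;
    r->r3 += in[4 * i + 3] * k;
  }
}
```

With two-element double vectors, the four accumulators map onto the two vector PHIs in bb 5, and the two vector stores to `r` remain inside the loop as in the dump.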
