https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68861

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #2 from Jeffrey A. Law <law at redhat dot com> ---
This is starting to look like a latent bug in the SLP vectorizer.  Sadly, I
cherry-picked the latest SLP changes from Richi, but they don't help.

The first hint that makes this easier to understand is that x_33 will always
be zero.  It's hidden from the compiler, but knowing it makes the dumps easier
to follow.

In .cunroll (for my strangely reduced testcase) we have the following key
statements:

  _13 = x_33 * 25;
  _10 = _13 + 5;
  _65 = _13 + 6;
  _149 = _13 + 7;
  _164 = 2;
  _162 = _10 + _164;
  *f_26[_169] = _162;     //   *f_26[_169] = 7
  _132 = _1 + 12;
  *f_26[_132] = _162;     //   *f_26[_132] = 7
  _135 = _1 + 13;
  _165 = _65 + _164;
  *f_26[_135] = _165;     //   *f_26[_135] = 8
  _141 = _1 + 14;
  *f_26[_141] = _165;     //   *f_26[_141] = 8
  _139 = _1 + 15;
  _146 = _164 + _149;
  *f_26[_139] = _146;     //   *f_26[_139] = 9
  _179 = _1 + 16;
  *f_26[_179] = _146;     //   *f_26[_179] = 9


Then another batch (skipping some of the array index calculations):

  r.49_188 = 2;
  _136 = r.49_188 * 2;
  _137 = _10 + _136;
  *f_26[_133] = _137;     //   *f_26[_133] = 9
  _145 = _129 + 12;
  *f_26[_145] = _137;     //   *f_26[_145] = 9
  _163 = _129 + 13;
  _167 = _65 + _136;
  *f_26[_163] = _167;     //   *f_26[_163] = 10
  _175 = _129 + 14;
  *f_26[_175] = _167;     //   *f_26[_175] = 10
  _193 = _129 + 15;
  _197 = _136 + _149;
  *f_26[_193] = _197;     //   *f_26[_193] = 11
  _205 = _129 + 16;
  *f_26[_205] = _197;     //   *f_26[_205] = 11


Which is correct.  The SLP code is hairy, but the key bits from slp1 are
(remember that x_33 is always zero)

  vect_cst__201 = { 5, 5 };
  vect_cst__200 = { 6, 6 };
  vect_cst__199 = { 7, 7 };
  vect_cst__174 = { 25, 25 };
  vect_cst__171 = { 25, 25 };
  vect_cst__5 = {x_33, x_33};
  vect_cst__176 = {x_33, x_33};
  vect_cst__170 = {x_33, x_33};
  vect__13.88_192 = vect_cst__170 * vect_cst__174;      // { 0, 0 }
  vect__13.88_194 = vect_cst__176 * vect_cst__171;      // { 0, 0 }
  vect__13.88_196 = vect_cst__5 * vect_cst__52;         // { 0, 0 }

  vect__10.89_204 = vect__13.88_192 + vect_cst__201;    // { 5, 5 }
  _164 = 2;
  vect__10.89_206 = vect__13.88_194 + vect_cst__200;    // { 6, 6 }
  vect__10.89_207 = vect__13.88_196 + vect_cst__199;    // { 7, 7 }
  _13 = x_33 * 25;
  _149 = _13 + 7;
  vect_cst__208 = {_149, _149};                         // { 7, 7 }
  vect_cst__209 = {_164, _164};                         // { 2, 2 }
  vect_cst__11 = {_164, _164};                          // { 2, 2 }
  vect__162.90_186 = vect__10.89_204 + vect_cst__11;    // { 7, 7 }
  vect__162.90_185 = vect__10.89_206 + vect_cst__209;   // { 8, 8 }
  vect__162.90_156 = vect__10.89_207 + vect_cst__208;   // {14, 14 } WTF!

  vectp.92_155 = &*f_26[_169];
  MEM[(integer(kind=4) *)vectp.92_155] = vect__162.90_186;
  vectp.92_125 = vectp.92_155 + 8;
  MEM[(integer(kind=4) *)vectp.92_125] = vect__162.90_185;
  vectp.92_90 = vectp.92_125 + 8;
  MEM[(integer(kind=4) *)vectp.92_90] = vect__162.90_156;


Note how the last assignment stores vect__162.90_156, which holds the wrong
value: { 14, 14 } where it should have been { 9, 9 }.

The botch gets repeated in the next block of stores, where we end up storing
{ 14, 14 } instead of { 11, 11 } in the last store.
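
As a sanity check on the numbers quoted above, here is a minimal Python sketch
that replays the arithmetic from the dumps with x_33 fixed at zero.  It is only
an illustration, not GCC code: the variable names mirror the SSA names (dots
replaced by underscores), the helpers vadd and vmul are hypothetical, and the
product feeding vect__13.88_196 is hard-coded to { 0, 0 } because vect_cst__52
does not appear in the excerpt.

```python
# Replays the arithmetic from the .cunroll and slp1 excerpts.
# Assumption (from the comment): x_33 is always zero at run time.
x_33 = 0

# --- Scalar statements from .cunroll, first batch ---
_13 = x_33 * 25
_10 = _13 + 5
_65 = _13 + 6
_149 = _13 + 7
_164 = 2
_162 = _10 + _164
_165 = _65 + _164
_146 = _164 + _149
scalar_first = [_162, _162, _165, _165, _146, _146]

# --- Scalar statements from .cunroll, second batch ---
r_49_188 = 2
_136 = r_49_188 * 2
_137 = _10 + _136
_167 = _65 + _136
_197 = _136 + _149
scalar_second = [_137, _137, _167, _167, _197, _197]


def vadd(a, b):
    """Lane-wise add of two 2-element vectors (hypothetical helper)."""
    return [p + q for p, q in zip(a, b)]


def vmul(a, b):
    """Lane-wise multiply of two 2-element vectors (hypothetical helper)."""
    return [p * q for p, q in zip(a, b)]


# --- Vector statements from slp1, first batch ---
vect__13_192 = vmul([x_33, x_33], [25, 25])
vect__13_194 = vmul([x_33, x_33], [25, 25])
# vect_cst__52 is not shown in the excerpt; with x_33 == 0 the dump's
# annotation says the product is { 0, 0 }, so it is hard-coded here.
vect__13_196 = [0, 0]

vect__10_204 = vadd(vect__13_192, [5, 5])
vect__10_206 = vadd(vect__13_194, [6, 6])
vect__10_207 = vadd(vect__13_196, [7, 7])

vect_cst__208 = [_149, _149]
vect_cst__209 = [_164, _164]
vect_cst__11 = [_164, _164]

vect__162_186 = vadd(vect__10_204, vect_cst__11)
vect__162_185 = vadd(vect__10_206, vect_cst__209)
vect__162_156 = vadd(vect__10_207, vect_cst__208)

# The three vector stores write consecutive pairs of 4-byte elements.
slp_first = vect__162_186 + vect__162_185 + vect__162_156

print("scalar, first batch: ", scalar_first)
print("slp1,   first batch: ", slp_first)
print("scalar, second batch:", scalar_second)
```

The first two vector adds pair { 5, 5 } and { 6, 6 } with { _164, _164 }, but
the third pairs { 7, 7 } with { _149, _149 }, so the last two lanes differ from
the scalar result.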

The bogus values obviously cause grief later.

Anyway, it's late.
