https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96481

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Last reconfirmed|                            |2020-08-05
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
             Blocks|                            |53947

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, this is a known limitation: basic-block SLP does not perform
if-conversion, so the conditional stays as control flow.  The basic-block
SLP code therefore sees

  <bb 2> [local count: 1073741824]:
  _1 = *pd_17(D);
  _2 = *pc_19(D);
  _3 = *pb_20(D);
  _4 = *pa_21(D);
  if (_3 < _4)
    goto <bb 11>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 11> [local count: 536870912]:
  goto <bb 4>; [100.00%]

  <bb 3> [local count: 536870913]:

  <bb 4> [local count: 1073741824]:
  # iftmp.20_23 = PHI <_1(3), _2(11)>
  *dst_22(D) = iftmp.20_23;
  _5 = MEM[(const unsigned int *)pd_17(D) + 4B];
  _6 = MEM[(const unsigned int *)pc_19(D) + 4B];
  _7 = MEM[(const unsigned int *)pb_20(D) + 4B];
  _8 = MEM[(const unsigned int *)pa_21(D) + 4B];
...

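which also rips apart the memory groups: consecutive loads such as *pd_17(D)
and pd_17(D) + 4B end up in different basic blocks (bb 2 and bb 4).  We're
slowly relaxing another limitation, namely that the basic-block SLP code
operates on a single basic block at a time, but for data refs this
restriction will prevail.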
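
For illustration only, here is a minimal C sketch of the kind of source that
would lower to the dump above, next to a hand if-converted variant.  The
function names, the four-element width and the mask trick are assumptions
made for this sketch; the original test case is not quoted in this comment,
and whether a given GCC release vectorizes either form depends on the version
and flags.

```c
#include <assert.h>

/* Hypothetical reconstruction: each element is a select that GCC may lower
   to a branch plus PHI, as in the dump (dst[0] = pb[0] < pa[0] ? pc[0] : pd[0]).  */
void select4_branchy(unsigned int *dst,
                     const unsigned int *pa, const unsigned int *pb,
                     const unsigned int *pc, const unsigned int *pd)
{
  dst[0] = pb[0] < pa[0] ? pc[0] : pd[0];
  dst[1] = pb[1] < pa[1] ? pc[1] : pd[1];
  dst[2] = pb[2] < pa[2] ? pc[2] : pd[2];
  dst[3] = pb[3] < pa[3] ? pc[3] : pd[3];
}

/* Hand if-converted variant (an assumption of this sketch, not something the
   comment proposes): the comparison becomes an all-ones or all-zeros mask, so
   the source contains no conditional expression at all.  */
void select4_masked(unsigned int *dst,
                    const unsigned int *pa, const unsigned int *pb,
                    const unsigned int *pc, const unsigned int *pd)
{
  unsigned int m0 = -(unsigned int)(pb[0] < pa[0]);
  unsigned int m1 = -(unsigned int)(pb[1] < pa[1]);
  unsigned int m2 = -(unsigned int)(pb[2] < pa[2]);
  unsigned int m3 = -(unsigned int)(pb[3] < pa[3]);
  dst[0] = (pc[0] & m0) | (pd[0] & ~m0);
  dst[1] = (pc[1] & m1) | (pd[1] & ~m1);
  dst[2] = (pc[2] & m2) | (pd[2] & ~m2);
  dst[3] = (pc[3] & m3) | (pd[3] & ~m3);
}

/* Self-check: both variants must agree on a small sample input.  */
void check_select4(void)
{
  const unsigned int a[4] = { 5, 1, 7, 0 };
  const unsigned int b[4] = { 3, 2, 7, 9 };
  const unsigned int c[4] = { 10, 11, 12, 13 };
  const unsigned int d[4] = { 20, 21, 22, 23 };
  unsigned int r1[4], r2[4];

  select4_branchy(r1, a, b, c, d);
  select4_masked(r2, a, b, c, d);
  for (int i = 0; i < 4; i++)
    assert(r1[i] == r2[i]);
}
```

In the masked form every statement is plain arithmetic on loaded values, which
is roughly what if-conversion would hand to the SLP code; the optimizers are
still free to rewrite either form, so comparing the dumps produced by
-fdump-tree-slp-details is the way to see what actually happens.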


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
