https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112282

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
So I can cut off bitfield lowering completely; the important part is that we
version the loop and thus try to BB vectorize the loop header (yeah, we don't
BB vectorize the whole body - or rather, we think the header _is_ the full
body).

But a key to the failure seems to be that we BB vectorize the unrolled

      for (; ac < 1; ac++)
        for (k = 0; k < 9; k++)
          am[k] = 0;

and that we do so not from SLP but from loop vectorization of the
if-conversion versioned (but otherwise unchanged) loop.

It's also solely triggered by unrolling the 'z' loop.  Disabling all
following passes will still reproduce it.  The region VN triggered by
ifconversion/vectorization/unrolling isn't needed either (I disabled it).

Maybe PR111572 is related (but it doesn't change unrolling and disabling
ch_vect doesn't avoid the problem).

Unrolling does

Analyzing # of iterations of loop 2
  exit condition [23, + , 4294967295] != 0
  bounds on difference of bases: -23 ... -23
  result:
    # of iterations 23, bounded by 23
Removed pointless exit: if (ivtmp_1055 != 0)

because we computed loop->nb_iterations_upper_bound as 21:

Statement (exit)if (ivtmp_1055 != 0)
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Induction variable (int) 21 + -1 * iteration does not wrap in statement _4 = ~u.13_485;
 in loop 2.
Statement _4 = ~u.13_485;
 is executed at most 21 (bounded by 21) + 1 times in loop 2.
Induction variable (int) -21 + 1 * iteration does not wrap in statement _19 = u.13_485 + 1;
 in loop 2.
Statement _19 = u.13_485 + 1;
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Reducing loop iteration estimate by 1; undefined statement must be executed at the last iteration.

We're SCEV analyzing _4 here, computing {21, +, -1}_2, and VRP1 somehow
computed [irange] int [0, +INF] for it.  u.13_485 has a global range of
[-2147483647, 1], so obviously it must infer something else here, and
wrongly so?
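
To make the arithmetic in the dump concrete, here is a small C sketch (not
GCC code; both function names are made up for illustration).  It counts the
iterations of the exit IV {23, +, 4294967295} under 32-bit unsigned
wraparound, and finds the last iteration at which {21, +, -1} is still inside
[0, +INF].  Assuming the niter analysis bounds a statement by the point where
its IV would leave the recorded range, that would explain the bound of 21.

```c
#include <stdint.h>

/* Hypothetical helper: number of iterations until the exit IV
   {23, +, 4294967295} reaches 0, with 32-bit unsigned wraparound.  */
static unsigned
count_exit_iterations (void)
{
  uint32_t iv = 23u;
  unsigned n = 0;
  while (iv != 0u)
    {
      iv += 4294967295u;	/* same as iv -= 1 modulo 2^32 */
      n++;
    }
  return n;
}

/* Hypothetical helper: last iteration number at which the IV
   {21, +, -1} is still inside [0, +INF].  */
static int
last_nonnegative_iteration (void)
{
  int i = 0;
  while (21 - (i + 1) >= 0)
    i++;
  return i;
}
```

count_exit_iterations () returns 23 and last_nonnegative_iteration () returns
21, so the exit test only looks pointless if the [0, +INF] range for _4 is
trusted.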

That very same def also appears with plain -O3.

Global Exported: _4 = [irange] int [0, +INF]

Hmm.  We have

Folding statement: _64 = ~u.13_20;
Global Exported: _64 = [irange] int [-2, -1] MASK 0x1 VALUE 0xfffffffe

Folding statement: _4 = ~u.13_20;
Global Exported: _4 = [irange] int [0, +INF]
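
Both exported ranges look consistent with flow-sensitive information about the
operand rather than with its global range.  A small C sketch of the range
arithmetic, assuming two's complement, where ~x == -x - 1 (struct int_range
and range_of_not are illustrative names, not GCC's irange API):

```c
/* Illustrative closed integer interval [lo, hi].  */
struct int_range
{
  int lo;
  int hi;
};

/* Bitwise NOT is strictly decreasing on two's complement ints
   (~x == -x - 1), so it maps [lo, hi] to [~hi, ~lo].  */
static struct int_range
range_of_not (struct int_range r)
{
  struct int_range out = { ~r.hi, ~r.lo };
  return out;
}
```

With this, [0, 1] maps to [-2, -1] (the range exported for _64), the full
global range [-2147483647, 1] maps to [-2, 2147483646], and a result in
[0, +INF] requires the operand to be <= -1.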

but the if-conversion pass hoists that before the .LOOP_VECTORIZED call.

Properly resetting flow-sensitive info on the hoisted stmts fixes this.
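
As a source-level analogy (a sketch only, not what the GIMPLE looks like; the
function names are hypothetical): a fact that holds for a statement only
because of a dominating condition stops holding once the statement is hoisted
above that condition.

```c
/* Before hoisting: t = ~u is only evaluated when u < 0, so "t >= 0"
   is a valid flow-sensitive fact at the definition.  */
static int
guarded_not_is_nonnegative (int u)
{
  if (u < 0)
    {
      int t = ~u;
      return t >= 0;
    }
  return 1;	/* definition not reached, nothing to check */
}

/* After hoisting: t = ~u is evaluated for every u, so the same fact
   is no longer valid and has to be dropped.  */
static int
hoisted_not_is_nonnegative (int u)
{
  int t = ~u;
  return t >= 0;
}
```

For u = 1 the guarded version never reaches the definition, while the hoisted
version computes -2 and the "nonnegative" claim is false.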

Meh.

Premature duplicate transforms ...
