https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112282
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org|rguenth at gcc dot gnu.org
             Status|NEW                          |ASSIGNED

--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
So I can cut off bitfield lowering completely, the important part is that
we version the loop and thus try to BB vectorize the loop header (yeah, we
don't BB vectorize the whole body - or rather, we think the header _is_ the
full body).

But a key to the failure seems to be that we BB vectorize the unrolled

    for (; ac < 1; ac++)
      for (k = 0; k < 9; k++)
        am[k] = 0;

and doing that not from SLP but from loop vectorization of the if-conversion
versioned (but otherwise unchanged) loop.  It's also solely triggered by
unrolling the 'z' loop.  Disabling all following passes will still
reproduce it.  The region VN triggered by ifconversion/vectorization/unrolling
isn't needed either (I disabled it).

Maybe PR111572 is related (but it doesn't change unrolling and disabling
ch_vect doesn't avoid the problem).

Unrolling does

Analyzing # of iterations of loop 2
  exit condition [23, + , 4294967295] != 0
  bounds on difference of bases: -23 ... -23
  result:
    # of iterations 23, bounded by 23
Removed pointless exit: if (ivtmp_1055 != 0)

because we computed loop->nb_iterations_upper_bound to 21:

Statement (exit)if (ivtmp_1055 != 0)
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Induction variable (int) 21 + -1 * iteration does not wrap in statement _4 = ~u.13_485;
 in loop 2.
Statement _4 = ~u.13_485;
 is executed at most 21 (bounded by 21) + 1 times in loop 2.
Induction variable (int) -21 + 1 * iteration does not wrap in statement _19 = u.13_485 + 1;
 in loop 2.
Statement _19 = u.13_485 + 1;
 is executed at most 23 (bounded by 23) + 1 times in loop 2.
Reducing loop iteration estimate by 1; undefined statement must be executed at the last iteration.

We're SCEV analyzing _4 here, computing {21, +, -1}_2, and VRP1 computed

  [irange] int [0, +INF]

somehow for it.  u.13_485 has a global range of [-2147483647, 1], so
obviously it must infer something else here somehow, and wrongly so?  That
very same def also appears with plain -O3.

Global Exported: _4 = [irange] int [0, +INF]

Hmm.  We have

Folding statement: _64 = ~u.13_20;
Global Exported: _64 = [irange] int [-2, -1] MASK 0x1 VALUE 0xfffffffe
Folding statement: _4 = ~u.13_20;
Global Exported: _4 = [irange] int [0, +INF]

but the if-conversion pass hoists that before the .LOOP_VECTORIZED.

Properly resetting flow-sensitive info on stmts hoisted fixes this.  Meh.
Premature duplicate transforms ...
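
The range mismatch can be checked by hand.  What follows is only a minimal
sketch in C, assuming a 32-bit two's-complement int.  The struct and helper
names are hypothetical and are not GCC internals.  It also assumes that
[0, +INF] for _4 was derived on a path where u is negative, which the dumps
suggest but do not state.  Given the global range of u.13_485, ~u can be as
low as -2, so [0, +INF] only holds under that guard and stops being valid
once the statement is hoisted to where it executes unconditionally.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for an irange: a closed interval [lo, hi].
   This is not a GCC data structure. */
struct int_range { int lo; int hi; };

/* ~x equals -x - 1 on two's-complement targets, which is decreasing
   in x, so the range of ~x over [lo, hi] is [~hi, ~lo]. */
struct int_range range_of_bit_not(struct int_range r)
{
    struct int_range out = { ~r.hi, ~r.lo };
    return out;
}

/* True if every value in 'inner' also lies in 'outer'. */
bool range_contains(struct int_range outer, struct int_range inner)
{
    return outer.lo <= inner.lo && inner.hi <= outer.hi;
}

/* Global range of u.13_485 as quoted in the comment. */
struct int_range u_global(void)
{
    struct int_range r = { -INT_MAX, 1 };
    return r;
}

/* Assumption: on the path where VRP derived [0, +INF], u is negative. */
struct int_range u_guarded(void)
{
    struct int_range r = { -INT_MAX, -1 };
    return r;
}
```

With these helpers, range_of_bit_not(u_guarded()) is [0, INT_MAX - 1], which
lies inside [0, +INF], while range_of_bit_not(u_global()) is
[-2, INT_MAX - 1], which does not.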