https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80549

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Keywords|                              |wrong-code
           Priority|P3                            |P2
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I will have a look (disabling VRP1 seems to help).

It is cunroll that does the harmful transform in the end (niter analysis?):

-Loop 4 iterates 246 times.
-Loop 4 iterates at most 246 times.
-Loop 4 likely iterates at most 246 times.
-Not unrolling loop 4 (--param max-completely-peel-times limit reached).
+Loop 2 iterates 246 times.
+Loop 2 iterates at most 27 times.
+Loop 2 likely iterates at most 27 times.
+Analyzing # of iterations of loop 2
+  exit condition [246, + , 255] != 0
+  bounds on difference of bases: -246 ... -246
+  result:
+    # of iterations 246, bounded by 246
+Removed pointless exit: if (ivtmp_32 != 0)
+Not unrolling loop 2 (--param max-completely-peel-times limit reached).

we see we can somehow preserve loop2 while we re-discovered it with vrp1
disabled (eventually dropping the upper iteration bound).  Looks like another
latent threading issue to me.

Without VRP1:

  Threaded jump 14 --> 3 to 17
  Threaded jump 7 --> 11 to 18
fix_loop_structure: fixing up loops for function
fix_loop_structure: removing loop 1
flow_loops_find: discovered new loop 3 with header 3
flow_loops_find: discovered new loop 4 with header 9

that's CFG cleanup after DOM2.  With VRP1 enabled we thread the same amount
of jumps but only have

  Threaded jump 14 --> 3 to 17
  Threaded jump 7 --> 11 to 18
fix_loop_structure: fixing up loops for function
fix_loop_structure: removing loop 1
flow_loops_find: discovered new loop 3 with header 3