https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97077

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-09-17

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is because the second loop has a load from {0,1,2,3,4} in its body and
thus appears larger to unroll (we don't estimate those loads to go away - a
missed optimization).

static const int C.0[5] = {0, 1, 2, 3, 4};
...
  <bb 4> [local count: 894749065]:
  # __for_begin_19 = PHI <__for_begin_10(5), &C.0(7)>
  # prephitmp_3 = PHI <pretmp_15(5), 0(7)>
  # ivtmp_14 = PHI <ivtmp_8(5), 5(7)>
  foo (prephitmp_3);
  __for_begin_10 = __for_begin_19 + 4;
  ivtmp_8 = ivtmp_14 - 1;
  if (ivtmp_8 == 0)
    goto <bb 6>; [20.00%]
  else
    goto <bb 5>; [80.00%]

  <bb 5> [local count: 715756304]:
  pretmp_15 = MEM[(const int *)__for_begin_19 + 4B];
  goto <bb 4>; [100.00%]

Estimating sizes for loop 2
 BB: 4, after_exit: 0
  size:   2 foo (prephitmp_3);
  size:   1 __for_begin_10 = __for_begin_19 + 4;
  size:   1 ivtmp_8 = ivtmp_14 - 1;
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_8 == 0)
   Exit condition will be eliminated in peeled copies.
   Exit condition will be eliminated in last copy.
   Constant conditional.
 BB: 5, after_exit: 1
  size:   1 pretmp_15 = MEM[(const int *)__for_begin_19 + 4B];
size: 7-3, last_iteration: 6-3
  Loop size: 7
  Estimated size after unrolling: 12
Not unrolling loop 2: size would grow.
Not unrolling loop 2: contains call and code would grow.

at some point I had patches to improve this but they had negative ripple-down
effects so I reverted them.
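
For readers following the numbers in the dump, here is a rough sketch of how
"size: 7-3, last_iteration: 6-3" could lead to "Estimated size after
unrolling: 12". It is an illustration only: the struct, its field names and
the formula (per-copy size after eliminations, times the number of unrolled
copies, plus the last-iteration remainder, scaled by 2/3) are assumptions
chosen because they reproduce the printed figures, not a copy of the GCC
sources. The copy count of 4 is also an assumption, taken from the loop
running bb 4 five times (ivtmp starts at 5) and the latch four times.

```cpp
// Hypothetical mirror of the figures printed by the size estimator.
struct LoopSize {
    long overall;                    // "size: 7-3"            -> 7
    long eliminated_by_peeling;      //                        -> 3
    long last_iteration;             // "last_iteration: 6-3"  -> 6
    long last_iteration_eliminated;  //                        -> 3
};

// Assumed estimate: each unrolled copy keeps the statements that are not
// folded away, the final copy keeps its own remainder, and the sum is scaled
// by 2/3 as an allowance for later cleanups. Integer division truncates.
constexpr long estimated_unrolled_size(LoopSize s, long n_unroll) {
    long insns = n_unroll * (s.overall - s.eliminated_by_peeling);
    insns += s.last_iteration - s.last_iteration_eliminated;
    insns = insns * 2 / 3;
    return insns > 0 ? insns : 1;
}

// Numbers from the dump for loop 2: (4 * (7 - 3) + (6 - 3)) * 2 / 3 == 12.
constexpr LoopSize loop2{7, 3, 6, 3};
static_assert(estimated_unrolled_size(loop2, 4) == 12,
              "matches 'Estimated size after unrolling: 12'");

// 12 exceeds the original loop size of 7, hence "size would grow".
static_assert(estimated_unrolled_size(loop2, 4) > loop2.overall,
              "unrolled estimate is larger than the loop itself");
```

Under this reading, the load in bb 5 contributes one unit to every unrolled
copy because it is not counted among the eliminated statements, which is the
missed estimate the comment describes.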