https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84037
--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the discussion led to the proposal to add another unroll parameter, for example --param small-loop-size, which serves as a "barrier" we may not cross when optimizing a loop. Thus, for all loops that are <= small-loop-size before the transform, inhibit the transform from growing the loop to > small-loop-size. Limit other loops as before.

The motivation is to model things like the loop-stream-detector or uop caches, where falling out of either has a severe performance impact. That's without trying to combine this new heuristic with the computed speedup from unrolling/vectorization.

The default of the parameter would be zero (which disables the feature), and targets could set it based on their micro-architecture and benchmarking. It should _always_ be smaller than --param max-unrolled-insns.

Sounds sane?
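
To make the proposed check concrete, here is a minimal sketch in C of how the barrier could be tested. It is not taken from GCC's sources: the function name unroll_size_allowed and its arguments are hypothetical, and it assumes that "limit other loops like before" means comparing the size after the transform against max-unrolled-insns only.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code.  Returns true if a transform that
   grows a loop from INSNS_BEFORE to INSNS_AFTER instructions would be
   permitted under the proposed heuristic.

   SMALL_LOOP_SIZE models the proposed --param small-loop-size; zero
   disables the barrier.  MAX_UNROLLED_INSNS models the existing
   --param max-unrolled-insns limit (assumed here to be a simple cap on
   the size after the transform).  */
bool
unroll_size_allowed (unsigned insns_before, unsigned insns_after,
                     unsigned small_loop_size, unsigned max_unrolled_insns)
{
  /* Barrier: a loop that fits within small-loop-size before the
     transform must not be grown past it.  */
  if (small_loop_size != 0
      && insns_before <= small_loop_size
      && insns_after > small_loop_size)
    return false;

  /* All other loops are limited as before.  */
  return insns_after <= max_unrolled_insns;
}
```

In this sketch a loop that already exceeds small-loop-size before the transform is never affected by the barrier, and because small-loop-size is required to be smaller than max-unrolled-insns, the barrier is always the tighter of the two limits for small loops.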