https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #29 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jiu Fu Guo from comment #28)
> For these kind of small loops, it would be acceptable to unroll in GIMPLE,
> because register pressure and instruction cost may not be major concerns;
> just like "cunroll" and "cunrolli" passes (complete unroll) which also been
> done at O2.

Absolutely, unrolling is a high-level optimization like vectorization. Note the
existing unroller doesn't care about register pressure or costs, but a
high-level unroller can certainly take this into account by not unrolling as
aggressively as the existing unroller.
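
For readers following the thread, here is a minimal sketch of what unrolling a small loop means at the source level. The function names and the unroll factor of 4 are illustrative assumptions, not what any GCC pass actually emits; a high-level unroller would choose the factor from its own cost model.

```c
#include <stddef.h>

/* Original small loop: one add, one compare and one branch per element. */
long sum_simple(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Same loop unrolled by hand with a factor of 4 (assumed for illustration).
   The main loop does four adds per compare-and-branch; the second loop
   handles the 0..3 leftover elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long sum = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        sum += a[i];
        sum += a[i + 1];
        sum += a[i + 2];
        sum += a[i + 3];
    }
    for (; i < n; i++)  /* remainder loop */
        sum += a[i];
    return sum;
}
```

With a body this small, the unrolled form needs only one induction variable and one accumulator, which is why register pressure is unlikely to be the limiting factor here. A larger factor or a larger body would change that trade-off.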