https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95760
--- Comment #2 from Jim Wilson <wilson at gcc dot gnu.org> ---

I took another look, and it turns out that the should_duplicate_loop_header_p check for size/speed is not the only issue. There is also an issue in tree-ssa-loop-ivopts.c when computing iv costs. With speed, the +4 iv is computed as cheaper than the +1 iv. With size, the +4 iv and the +1 iv have exactly the same cost, and since the +1 iv was looked at first, that one was chosen. If I hack adjust_setup_cost to always use the speed cost calculation, and retain the should_duplicate_loop_header_p hack, then both the inner and outer loops get the +4 iv with -Os.

Looking at gcc-8.3, I see that the outer loop has the +4 iv and the inner loop has the +1 iv. This looks similar to the result I get with the adjust_setup_cost hack but without the should_duplicate_loop_header_p hack. So I think the regression is solely due to some change in the cost calculation. There is a lot of code involved in cost calculations. This could even have been a riscv backend change. I would suggest doing a bisect over the gcc git tree if you want to see exactly where and how the cost calculation changed.

The -Os and -O2 optimization diverges in try_improve_iv_set where it does "if (acost < best_cost)". Maybe this could be improved to handle the case where acost == best_cost, and use some other criterion to choose which one is better, e.g. maybe a giv is better than a biv if they have the same cost. I haven't tried looking into this.
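
To make the tie case concrete, here is a minimal standalone sketch of the two selection rules. It is not the GCC code: struct iv_cand, enum iv_kind, and both pick_* functions are hypothetical names invented for illustration, and treating the +1 iv as a biv and the +4 iv as a giv is an assumption. The first function mirrors a strict "acost < best_cost" test, so the first candidate seen wins a tie. The second adds the suggested tie-break and prefers a giv over a biv when the costs are equal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not GCC's data structures. */
enum iv_kind { IV_BIV, IV_GIV };

struct iv_cand {
    const char *name;   /* label such as "+1" or "+4" */
    enum iv_kind kind;  /* assumed classification of the candidate */
    int cost;           /* made-up cost value, lower is better */
};

/* Strict comparison only: on equal cost the earlier candidate is kept. */
size_t pick_first_on_tie(const struct iv_cand *c, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (c[i].cost < c[best].cost)
            best = i;
    return best;
}

/* Same search, but on equal cost a giv replaces a biv. */
size_t pick_giv_on_tie(const struct iv_cand *c, size_t n)
{
    assert(n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        int cheaper = c[i].cost < c[best].cost;
        int tie_prefers_giv = c[i].cost == c[best].cost
                              && c[i].kind == IV_GIV
                              && c[best].kind == IV_BIV;
        if (cheaper || tie_prefers_giv)
            best = i;
    }
    return best;
}
```

With two candidates of equal cost, listed with the +1 iv first, the first rule keeps the +1 iv and the second switches to the +4 iv. When the +4 iv is strictly cheaper, both rules agree. Whether giv-over-biv is the right criterion for the real cost model is untested.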