https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95760

--- Comment #2 from Jim Wilson <wilson at gcc dot gnu.org> ---
I took another look, and it turns out that the size/speed handling in
should_duplicate_loop_header_p is not the only issue.  There is also an issue in
tree-ssa-loop-ivopts.c when computing iv costs.  With speed, the +4 iv is
computed as cheaper than the +1 iv.  With size, the +4 iv and +1 iv have the
exact same cost, and since the +1 iv was looked at first, that one was chosen.
If I hack adjust_setup_cost to always use the speed cost calculation,
and retain the should_duplicate_loop_header_p hack, then both the inner and
outer loops get the +4 iv with -Os.

Looking at gcc-8.3, I see that the outer loop has the +4 iv and the inner loop
has the +1 iv.  This looks similar to the result I get with the
adjust_setup_cost hack but not the should_duplicate_loop_header_p hack.  So I
think the regression is solely due to some change in the cost calculation.

There is a lot of code involved in cost calculations.  This could have even
been a riscv backend change.  I would suggest doing a bisect over the gcc git
tree if you want to see exactly where and how the cost calculation changed.

The -Os and -O2 optimizations diverge in try_improve_iv_set where it does "if
(acost < best_cost)".  Maybe this could be improved to handle the case where
acost == best_cost, and use some other criterion to choose which one is better,
e.g. maybe a giv is better than a biv if they have the same cost.  I haven't
tried looking into this.
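
To make the tie-break idea concrete, here is a minimal standalone sketch.  It
is not the ivopts code: struct cand, better_cand and choose_cand are
hypothetical names, the costs are made-up numbers, and treating the +1 iv as a
biv and the +4 iv as a giv is an assumption made only for illustration.  With
break_ties == 0 it models the strict "<" comparison, where the first candidate
seen wins on equal cost; with break_ties != 0 a giv replaces a biv of equal
cost.

```c
#include <stddef.h>

/* Hypothetical candidate record; this is not GCC's real iv_cand.  */
enum iv_kind { IV_BIV, IV_GIV };

struct cand
{
  const char *name;   /* label used only for illustration */
  unsigned cost;      /* made-up cost, not a real ivopts cost */
  enum iv_kind kind;
};

/* Return nonzero if candidate A should replace the current BEST.  */
int
better_cand (const struct cand *a, const struct cand *best, int break_ties)
{
  if (a->cost != best->cost)
    return a->cost < best->cost;

  /* Equal cost: a strict "<" test keeps BEST.  With BREAK_TIES set,
     prefer a giv over a biv instead (the assumed extra criterion).  */
  return break_ties && a->kind == IV_GIV && best->kind == IV_BIV;
}

/* Scan N candidates (N > 0) in order and return the index of the
   one that is chosen.  */
size_t
choose_cand (const struct cand *c, size_t n, int break_ties)
{
  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    if (better_cand (&c[i], &c[best], break_ties))
      best = i;
  return best;
}
```

With two candidates of equal cost, the biv listed first, choose_cand returns
index 0 when break_ties is 0 and index 1 when it is nonzero.  When the second
candidate is strictly cheaper, it is chosen either way.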
