https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #8)
> (In reply to Martin Liška from comment #7)
> > (In reply to Richard Biener from comment #6)
> > > Both revisions affect vectorizer cost modeling only. With
> > > -fno-vect-cost-model it compiles faster for me but still a slow 30s
> > > and 91% in RA.
> >
> > There are numbers with -fno-vect-cost-model:
> >
> > Bisecting latest revisions
> > a9e2ebe839d56416(24 Feb 2022 22:16)([email protected]): [took: 36.06 s]
> > result: OK
> > 250f234988b62316(20 Apr 2021 09:51)([email protected]): [took: 18.35 s]
> > result: OK
> >
> > I'm going to find out where the change happened.
>
> Which started with r12-2463-ga6291d88d5b6c17d.
I think you want to keep r12-7293 in the tree - the above introduced a huge
regression that was already fixed. So this testcase was probably always
slow to compile and spending all time in RA?
When using callgrind on the reduced testcase with an -O0 compiler,
I see most time spent in ira_object_conflict_iter_cond, in particular
the loop

      /* Skip bits that are zero.  */
      for (; (word & 1) == 0; word >>= 1)
        bit_num++;

and the load

      obj = ira_object_id_map[bit_num + i->base_conflict_id];

Maybe we can use ctz_hwi here (hopefully we optimize this loop with -O2).
This function is mostly called from allocnos_conflict_p, which is called
from update_conflict_hard_regno_costs.
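
To illustrate the ctz idea, here is a minimal standalone sketch, not the
actual IRA code. It assumes a nonzero 64-bit word and uses the GCC/Clang
builtin __builtin_ctzll as a stand-in for ctz_hwi; the function names
skip_zero_bits_loop and skip_zero_bits_ctz are made up for this example.

```c
#include <assert.h>

/* Hypothetical helper mirroring the loop quoted above: shift one bit at
   a time until the lowest set bit is reached.  WORD must be nonzero.
   Returns the number of zero bits skipped.  */
static unsigned
skip_zero_bits_loop (unsigned long long word)
{
  unsigned skipped = 0;

  assert (word != 0);
  for (; (word & 1) == 0; word >>= 1)
    skipped++;
  return skipped;
}

/* Hypothetical replacement: count the trailing zero bits in one step.
   __builtin_ctzll is undefined for zero, hence the same precondition.  */
static unsigned
skip_zero_bits_ctz (unsigned long long word)
{
  assert (word != 0);
  return (unsigned) __builtin_ctzll (word);
}
```

In the iterator this would presumably amount to adding the count to
bit_num and shifting word right by the same amount, assuming word is
known to be nonzero at that point, as the original loop already requires.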
In particular we have 677544 calls to get_next_update_cost () and
1824339 calls to allocnos_conflict_p () there from just 34480 calls
to update_conflict_hard_regno_costs.  queue_update_cost is called
1365947 times.
Maybe we can improve things or at least cut things off with decreasing
precision somehow? Vlad?