[Bug target/104686] [12 Regression] Huge compile-time regression building SPEC 2017 538.imagick_r with -march=skylake

rguenth at gcc dot gnu.org via Gcc-bugs Fri, 25 Feb 2022 05:00:49 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104686


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu.org

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #8)
> (In reply to Martin Liška from comment #7)
> > (In reply to Richard Biener from comment #6)
> > > Both revisions affect vectorizer cost modeling only.  With
> > > -fno-vect-cost-model it compiles faster for me but still a slow 30s and
> > > 91% in RA.
> > 
> > There are numbers with -fno-vect-cost-model:
> > 
> > Bisecting latest revisions
> >   a9e2ebe839d56416(24 Feb 2022 22:16)(ol...@adacore.com): [took: 36.06 s] result: OK
> >   250f234988b62316(20 Apr 2021 09:51)(stefa...@linux.ibm.com): [took: 18.35 s] result: OK
> > 
> > I'm going to find out where the change happened.
> 
> Which started with r12-2463-ga6291d88d5b6c17d.

I think you want to keep r12-7293 in the tree - the above introduced a huge
regression that was already fixed.  So this testcase was probably always
slow to compile and spending all time in RA?

When using callgrind on the reduced testcase and an -O0 compiler
I see most time spent in ira_object_conflict_iter_cond, in particular
the loop

      /* Skip bits that are zero.  */
      for (; (word & 1) == 0; word >>= 1)
        bit_num++;

and the load

      obj = ira_object_id_map[bit_num + i->base_conflict_id];

maybe we can use ctz_hwi here (hopefully we optimize this loop with -O2).
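
As a rough illustration of the ctz idea, here is a minimal standalone
sketch, not the actual IRA code.  It assumes a 64-bit conflict word and
uses the GCC/Clang builtin __builtin_ctzll as a stand-in for ctz_hwi;
the helper names skip_zero_bits_loop and skip_zero_bits_ctz are
hypothetical.  Both helpers require a nonzero word, as the quoted loop
does.

```c
#include <assert.h>
#include <stdint.h>

/* Bit-by-bit skip, mirroring the loop quoted above.  Returns the number
   of low zero bits skipped and shifts *word so that bit 0 is set.
   Hypothetical helper, for illustration only.  */
static unsigned
skip_zero_bits_loop (uint64_t *word)
{
  unsigned skipped = 0;
  assert (*word != 0);	/* Otherwise the loop would never terminate.  */
  for (; (*word & 1) == 0; *word >>= 1)
    skipped++;
  return skipped;
}

/* Same result using a count-trailing-zeros builtin (assumed stand-in
   for ctz_hwi): one ctz and one shift instead of one iteration per
   zero bit.  */
static unsigned
skip_zero_bits_ctz (uint64_t *word)
{
  assert (*word != 0);	/* __builtin_ctzll (0) is undefined.  */
  unsigned skipped = (unsigned) __builtin_ctzll (*word);
  *word >>= skipped;	/* skipped < 64 here, so the shift is defined.  */
  return skipped;
}
```

In the iterator the returned count would be added to bit_num.  Whether
this helps in practice depends on how sparse the conflict words are and
on whether an optimizing build already turns the loop into a ctz.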

This function is mostly called from allocnos_conflict_p which is called
from update_conflict_hard_regno_costs.

In particular we have 677544 calls to get_next_update_cost () and
1824339 calls to allocnos_conflict_p () there from just 34480 calls
to update_conflict_hard_regno_costs.  queue_update_cost is called
1365947 times.

Maybe we can improve things or at least cut things off with decreasing
precision somehow?  Vlad?
