https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116645

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-09-09
            Summary|[13/14/15 regression] Huge  |[13/14/15 regression] Huge
                   |performance loss after      |performance loss after
                   |13.2.0 compiler upgrade     |13.2.0 compiler upgrade;
                   |                            |reload CSE regs has
                   |                            |scalability issues

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I measure 41s with GCC 13.2, 57% spent in reload CSE regs, vs. 14s with GCC
11.4, 7% spent in reload CSE regs.

The compile-time with -O2 behaves similarly with GCC 13.2 but increases to 36s
with GCC 11.4, also showing the 57% reload CSE regs figure.  So I'd probably
blame inliner heuristic changes for the observed difference but the problem
exposed looks latent.

With -O1, the suggested option for large auto-generated code when you
experience compile-time or memory-usage issues, GCC 11.4 takes 20s, GCC 13.2
similar (again both with 53% in reload CSE regs).  Also confirmed with GCC 14.2
and a somewhat old trunk (r15-2794).

So confirmed.

I don't think bisection will reveal anything interesting.  Somebody needs
to sit down and look at why postreload takes so long for this testcase.
A profile for GCC 14.2 shows

Samples: 89K of event 'cycles:Pu', Event count (approx.): 114469365450          
Overhead       Samples  Command  Shared Object         Symbol                   
  37.58%         33298  cc1plus  cc1plus               [.] _ZN10hash_tableI13cselib_hasherLb0E11xcallocatorE19find_slot_with_hashERKPNS0_3keyEj13insert_option
  14.25%         12619  cc1plus  cc1plus               [.] _Z22rtx_equal_for_cselib_1P7rtx_defS0_12machine_modei
   4.27%          3776  cc1plus  cc1plus               [.] _Z14bitmap_set_bitP11bitmap_headi

so I'd say it's a bad hash (again).
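
To illustrate why a bad hash would show up as time in find_slot_with_hash,
here is a minimal sketch of a toy open-addressing table with linear probing.
It is not GCC's hash_table or cselib code; ProbeCountingSet, count_probes,
mixing_hash and clustered_hash are hypothetical names made up for this
example.  It only counts how many slots each insert has to inspect.  In a real
table each inspected occupied slot would typically also cost an equality
check, which would plausibly fit rtx_equal_for_cselib_1 being second in the
profile.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Sentinel for an unused slot (so keys must never equal this value).
constexpr std::uint32_t kEmpty = 0xFFFFFFFFu;

// Toy open-addressing set with linear probing.  NOT GCC's hash_table;
// it exists only to count inspected slots.
class ProbeCountingSet {
public:
    using HashFn = std::function<std::uint32_t(std::uint32_t)>;

    ProbeCountingSet(std::size_t capacity, HashFn hash)
        : slots_(capacity, kEmpty), hash_(std::move(hash)) {}

    // Insert key if absent, counting every slot looked at on the way.
    void insert(std::uint32_t key) {
        assert(key != kEmpty && used_ < slots_.size());
        std::size_t i = hash_(key) % slots_.size();
        while (true) {
            ++probes_;
            if (slots_[i] == kEmpty) {
                slots_[i] = key;
                ++used_;
                return;
            }
            if (slots_[i] == key)
                return;                      // already present
            i = (i + 1) % slots_.size();     // collision: try the next slot
        }
    }

    std::uint64_t probes() const { return probes_; }

private:
    std::vector<std::uint32_t> slots_;
    HashFn hash_;
    std::size_t used_ = 0;
    std::uint64_t probes_ = 0;
};

// Multiplicative hash that spreads consecutive keys over the table.
std::uint32_t mixing_hash(std::uint32_t key) { return key * 2654435761u; }

// Deliberately poor hash: only 8 distinct values, so keys pile up.
std::uint32_t clustered_hash(std::uint32_t key) { return key & 0x7u; }

// Insert keys 0..n_keys-1 and return the total number of slots inspected.
std::uint64_t count_probes(std::size_t capacity, std::uint32_t n_keys,
                           ProbeCountingSet::HashFn hash) {
    ProbeCountingSet set(capacity, std::move(hash));
    for (std::uint32_t k = 0; k < n_keys; ++k)
        set.insert(k);
    return set.probes();
}
```

Calling count_probes(4096, 1024, mixing_hash) and
count_probes(4096, 1024, clustered_hash) inserts the same 1024 keys into the
same size table.  With the mixing hash every key finds an empty slot on the
first try; with the clustered hash each insert walks over most of the already
occupied slots, so the total work grows roughly quadratically with the number
of keys.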
