https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
      Known to fail|                            |14.0
           Keywords|                            |ra
     Ever confirmed|0                           |1
                 CC|                            |vmakarov at gcc dot gnu.org
   Last reconfirmed|2024-03-26 00:00:00         |2024-03-27
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
At -O0 on x86_64-linux with release checking I see:
tree SSA rewrite          :  76.99 ( 31%)   0.09 (  5%)  77.11 ( 31%)     96M (  9%)
integrated RA             :  92.31 ( 37%)   0.15 (  8%)  92.49 ( 37%)    105M ( 10%)
LRA create live ranges    :  54.01 ( 22%)   0.00 (  0%)  54.02 ( 22%)    885k (  0%)
TOTAL                     : 246.34          1.88        248.43          1039M
246.34user 2.02system 4:08.92elapsed 99%CPU (0avgtext+0avgdata 3287072maxresident)k
70416inputs+0outputs (110major+1229628minor)pagefaults 0swaps
The time in tree SSA rewrite is interesting; it is probably bitmap slowness and
cache dependent.
With -O1:
tree PTA                  :  85.65 ( 14%)   0.21 (  3%)  85.89 ( 14%)    348M (  2%)
tree SSA rewrite          :  76.05 ( 13%)   0.10 (  1%)  76.14 ( 12%)     96M (  1%)
tree SSA incremental      : 181.52 ( 30%)   0.03 (  0%) 181.50 ( 30%)  10031k (  0%)
expand vars               :  66.72 ( 11%)   0.00 (  0%)  66.74 ( 11%)   6132k (  0%)
expand                    :  64.33 ( 11%)   0.02 (  0%)  64.39 ( 11%)    172M (  1%)
TOTAL                     : 603.55          7.72        611.61         19327M
603.55user 7.83system 10:11.78elapsed 99%CPU (0avgtext+0avgdata 19809792maxresident)k
21520inputs+0outputs (48major+5102514minor)pagefaults 0swaps
Definitely an "interesting" testcase.
The profile for -O0 shows IDF compute (that's SSA rewrite, a usual suspect)
and other functions that might be interesting for the RA part:
Samples: 1M of event 'cycles:u', Event count (approx.): 1332096582355
Overhead Samples Command Shared Object Symbol
24.78% 243663 cc1plus cc1plus [.] compute_idf
11.29% 115134 cc1plus cc1plus [.] make_hard_regno_dead
10.29% 104126 cc1plus cc1plus [.] process_bb_node_lives
5.29% 53680 cc1plus cc1plus [.] mark_pseudo_regno_live
4.95% 50051 cc1plus cc1plus [.] mark_ref_dead
3.95% 40075 cc1plus cc1plus [.] update_allocno_pressure
2.73% 27977 cc1plus cc1plus [.] lra_create_live_ranges_
2.48% 25136 cc1plus cc1plus [.] inc_register_pressure
2.37% 24268 cc1plus cc1plus [.] update_pseudo_point
2.23% 21976 cc1plus cc1plus [.] mergesort<sort_ctx>
2.19% 22208 cc1plus cc1plus [.] make_object_dead
2.09% 21316 cc1plus cc1plus [.] sparseset_clear_bit
1.99% 20181 cc1plus cc1plus [.] bitmap_set_bit
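
For readers unfamiliar with the "IDF compute" step at the top of this profile:
the iterated dominance frontier (IDF) of a set of defining blocks is the set of
blocks where SSA construction may need a PHI node. The following is only a
minimal textbook-style worklist sketch in Python, not GCC's compute_idf, which
works on bitmaps. The function name, the dictionary representation of
dominance frontiers and the example CFG are all assumptions made for
illustration.

```python
def iterated_dominance_frontier(dom_frontier, def_blocks):
    """Return the iterated dominance frontier of def_blocks.

    dom_frontier: dict mapping a block to the set of blocks in its
                  dominance frontier (hypothetical representation).
    def_blocks:   iterable of blocks that contain a definition.
    """
    idf = set()
    worklist = list(def_blocks)
    while worklist:
        block = worklist.pop()
        for frontier_block in dom_frontier.get(block, ()):
            # A block newly added to the IDF acts like a new definition
            # site (it gets a PHI), so its own frontier is processed too.
            if frontier_block not in idf:
                idf.add(frontier_block)
                worklist.append(frontier_block)
    return idf


# Hypothetical CFG: 0->1, 0->5, 1->2, 1->3, 2->4, 3->4, 4->6, 5->6.
# Dominance frontiers worked out by hand for that graph.
EXAMPLE_DF = {
    0: set(),
    1: {6},
    2: {4},
    3: {4},
    4: {6},
    5: {6},
    6: set(),
}

if __name__ == "__main__":
    # A definition in block 2 needs PHIs at the join blocks 4 and 6.
    print(sorted(iterated_dominance_frontier(EXAMPLE_DF, {2})))
```

The work done by the inner loop grows with the size of the frontiers visited,
and SSA construction repeats a computation of this kind for many definition
sets, so a function with very many blocks can plausibly make this step show up
in a profile the way it does here.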
I'll note this was all tested on trunk; GCC 11 might behave even worse, and
quite a few deep recursion issues have been fixed in newer releases.