For a -O0 compile of tramp3d-v4.cpp, reload and its calls to CONSTRAINT_LEN are near the top of the profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
  3.74      1.34     1.34   265677     0.00     0.00  record_reg_classes
  2.96      2.40     1.06  1145706     0.00     0.00  gt_ggc_mx_lang_tree_node
  2.62      3.34     0.94 10097098     0.00     0.00  ggc_alloc_stat
  2.51      4.24     0.90  2369119     0.00     0.00  walk_tree
  2.51      5.14     0.90   323915     0.00     0.00  find_reloads
  2.48      6.03     0.89 10110867     0.00     0.00  expand_template_argument_pack
  2.29      6.85     0.82 33351577     0.00     0.00  lookup_constraint
  2.20      7.64     0.79 16245996     0.00     0.00  ggc_set_mark
  1.42      8.15     0.51  6300484     0.00     0.00  mark_set_1
  1.42      8.66     0.51   526692     0.00     0.00  retrieve_specialization
  1.14      9.07     0.41 32030410     0.00     0.00  insn_constraint_len

[42]     5.0    0.90    0.88   323915              find_reloads [42]
                0.45    0.00 18162825/33351577     lookup_constraint [68]
                0.23    0.00 17752937/32030410     insn_constraint_len [106]

At least insn_constraint_len could be inlined:

size_t
insn_constraint_len (enum constraint_num c)
{
  switch (c)
    {
    case CONSTRAINT_Y2: return 2;
    case CONSTRAINT_Yi: return 2;
    case CONSTRAINT_Ym: return 2;
    default: break;
    }
  return 1;
}

and both insn_constraint_len and lookup_constraint can be marked pure.

For i686 at least, an optimized CONSTRAINT_LEN can be done with

#define CONSTRAINT_LEN(c_,s_) ((c_) == 'Y' ? 2 : 1)

-- 
           Summary: CONSTRAINT_LEN is slow on i?86, x86_64
           Product: gcc
           Version: 4.3.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: rguenth at gcc dot gnu dot org
GCC target triplet: i?86-*-* x86_64-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31420
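
The three suggestions in the report (inlining insn_constraint_len, marking both helpers pure, and a character-only CONSTRAINT_LEN) could look roughly like the sketch below. It is an illustration only, not the code GCC generates: the enum is a hypothetical cut-down stand-in for the generated constraint enum, and lookup_constraint_sketch and CONSTRAINT_LEN_FAST are made-up names. The macro additionally assumes, as the report does for i686, that every constraint starting with 'Y' is two characters long.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the generated constraint enum. */
enum constraint_num
{
  CONSTRAINT__UNKNOWN = 0,
  CONSTRAINT_R,
  CONSTRAINT_Y2,
  CONSTRAINT_Yi,
  CONSTRAINT_Ym
};

/* Inlinable length lookup.  "pure" tells GCC the function has no side
   effects, so repeated calls with the same argument may be merged.  */
static inline size_t insn_constraint_len (enum constraint_num c)
  __attribute__ ((pure));

static inline size_t
insn_constraint_len (enum constraint_num c)
{
  switch (c)
    {
    case CONSTRAINT_Y2:
    case CONSTRAINT_Yi:
    case CONSTRAINT_Ym:
      return 2;
    default:
      return 1;
    }
}

/* Made-up miniature of lookup_constraint: it only reads the string it
   is given, so it also qualifies as pure.  */
enum constraint_num lookup_constraint_sketch (const char *str)
  __attribute__ ((pure));

enum constraint_num
lookup_constraint_sketch (const char *str)
{
  if (str[0] == 'Y')
    switch (str[1])
      {
      case '2': return CONSTRAINT_Y2;
      case 'i': return CONSTRAINT_Yi;
      case 'm': return CONSTRAINT_Ym;
      default:  return CONSTRAINT__UNKNOWN;
      }
  if (str[0] == 'r')
    return CONSTRAINT_R;
  return CONSTRAINT__UNKNOWN;
}

/* Character-only variant: no lookup at all.  Valid only under the
   assumption that all 'Y' constraints have length 2; s_ is unused.  */
#define CONSTRAINT_LEN_FAST(c_, s_) ((c_) == 'Y' ? 2 : 1)
```

With the macro variant, the inner loops of find_reloads would need neither the lookup_constraint call nor the insn_constraint_len call to advance over a constraint string. The cost is that the macro silently becomes wrong if a port adds a 'Y' constraint of a different length.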