------- Comment #12 from bonzini at gnu dot org 2009-02-05 10:26 -------

FRE is not a problem because almost all of the time (93%) is spent computing
ANTIC; of this, half is phi_translate and the other half is bitmap_set
operations.
I get a relatively good (15%) improvement from this:

Index: tree-ssa-sccvn.c
===================================================================
--- tree-ssa-sccvn.c    (revision 143938)
+++ tree-ssa-sccvn.c    (working copy)
@@ -398,9 +398,14 @@ vn_reference_op_eq (const void *p1, cons
 static hashval_t
 vn_reference_op_compute_hash (const vn_reference_op_t vro1)
 {
-  return iterative_hash_expr (vro1->op0, vro1->opcode)
-    + iterative_hash_expr (vro1->op1, vro1->opcode)
-    + iterative_hash_expr (vro1->op2, vro1->opcode);
+  hashval_t result = 0;
+  if (vro1->op0)
+    result += iterative_hash_expr (vro1->op0, vro1->opcode);
+  if (vro1->op1)
+    result += iterative_hash_expr (vro1->op1, vro1->opcode);
+  if (vro1->op2)
+    result += iterative_hash_expr (vro1->op2, vro1->opcode);
+  return result;
 }
 
 /* Return the hashcode for a given reference operation P1.  */

and another 8% from this:

Index: tree-ssa-pre.c
===================================================================
--- tree-ssa-pre.c      (revision 143938)
+++ tree-ssa-pre.c      (working copy)
@@ -216,11 +216,11 @@ pre_expr_hash (const void *p1)
     case CONSTANT:
       return vn_hash_constant_with_type (PRE_EXPR_CONSTANT (e));
     case NAME:
-      return iterative_hash_expr (PRE_EXPR_NAME (e), 0);
+      return iterative_hash_hashval_t (SSA_NAME_VERSION (PRE_EXPR_NAME (e)), 0);
     case NARY:
-      return vn_nary_op_compute_hash (PRE_EXPR_NARY (e));
+      return PRE_EXPR_NARY (e)->hashcode;
     case REFERENCE:
-      return vn_reference_compute_hash (PRE_EXPR_REFERENCE (e));
+      return PRE_EXPR_REFERENCE (e)->hashcode;
     default:
       abort ();
     }

(Tested with "make check RUNTESTFLAGS=tree-ssa.exp=*[pf]re*").

At least these two kick hashing almost out of the profile and bring PRE down
from 50% to 40% of the compilation time. They also speed up the bitmap_sets a
bit, since get_or_alloc_expression_id was also doing hashing.

The remaining main offenders are phi_translate_set and phi_translate_1. Apart
from some bitmap_sets, their profile is quite flat, so I guess there is no more
room for micro-optimization.
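
For anyone skimming the patches, here is a rough standalone sketch of the two ideas: skip absent operands when hashing, and return a hash code stored in the node instead of recomputing it on every lookup. This is not GCC code. The `expr` struct, `mix`, `expr_compute_hash`, `expr_init`, `expr_hash` and the `hash_calls` counter are all hypothetical names made up for the illustration, and the mixing function is an arbitrary stand-in for iterative_hash_expr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an expression node; not a GCC type.  */
typedef struct expr
{
  int opcode;
  const struct expr *op0, *op1, *op2;   /* any operand may be NULL */
  uint32_t hashcode;                    /* computed once, at creation */
} expr;

/* Counts how often the full (expensive) hash is computed.  */
static unsigned long hash_calls;

/* Arbitrary mixing step, standing in for a real iterative hash.  */
static uint32_t
mix (uint32_t h, uint32_t v)
{
  h ^= v + 0x9e3779b9u + (h << 6) + (h >> 2);
  return h;
}

/* Full hash.  Absent operands are skipped rather than hashed,
   mirroring the shape of the first patch.  */
static uint32_t
expr_compute_hash (const expr *e)
{
  uint32_t result = mix (0, (uint32_t) e->opcode);
  hash_calls++;
  if (e->op0)
    result += mix (e->op0->hashcode, (uint32_t) e->opcode);
  if (e->op1)
    result += mix (e->op1->hashcode, (uint32_t) e->opcode);
  if (e->op2)
    result += mix (e->op2->hashcode, (uint32_t) e->opcode);
  return result;
}

/* Fill in a node and cache its hash code.  */
static void
expr_init (expr *e, int opcode, const expr *op0, const expr *op1,
           const expr *op2)
{
  assert (e != NULL);
  e->opcode = opcode;
  e->op0 = op0;
  e->op1 = op1;
  e->op2 = op2;
  e->hashcode = expr_compute_hash (e);
}

/* Hash-table callback: just returns the cached value, mirroring the
   shape of the second patch.  */
static uint32_t
expr_hash (const void *p)
{
  return ((const expr *) p)->hashcode;
}
```

With this layout, repeated hash-table lookups on the same node cost one field load each; the trade-off is that the cached value has to be kept valid if a node is ever modified after creation.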

I'll bootstrap/regtest the above.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35639