https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89115
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
callgrind computes

  lra_inheritance -> inherit_in_ebb -> htab_find_slot -> ... -> rtx_equal_p

as the most time-consuming part.  That's from insert_invariant.  Likely the
hash function for this particular testcase is bad (there are no hash statistics
printed for this hashtable).  We call htab_find_slot only 462 000 times but
invariant_eq_p 2 150 000 000 times.

Likely we have many (mem (plus (symbol-ref) CONST_INT)) with different
constants but lra_rtx_hash does

    case SCRATCH:
    case CONST_DOUBLE:
    case CONST_INT:
    case CONST_VECTOR:
      return val;

which means it ignores the actual constant value (for whatever reason)?
Doing a simple

Index: gcc/lra.c
===================================================================
--- gcc/lra.c   (revision 268383)
+++ gcc/lra.c   (working copy)
@@ -1719,10 +1719,12 @@ lra_rtx_hash (rtx x)
 
     case SCRATCH:
     case CONST_DOUBLE:
-    case CONST_INT:
     case CONST_VECTOR:
       return val;
 
+    case CONST_INT:
+      return val + UINTVAL (x);
+
     default:
       break;
     }

improves compile time to

> /usr/bin/time /abuild/rguenther/obj/gcc/cc1 -quiet tagCircle49h12.i -O
18.82user 0.90system 0:19.73elapsed 100%CPU (0avgtext+0avgdata 3789340maxresident)k
0inputs+9808outputs (0major+933676minor)pagefaults 0swaps

For sub-fmts of CONST_INT the hash function already performs this operation.

 dead store elim1        :  10.20 ( 54%)   0.77 ( 36%)  10.96 ( 52%) 3048002 kB ( 89%)
 LRA reload inheritance  :   0.08 (  0%)   0.00 (  0%)   0.08 (  0%)       0 kB (  0%)

I'm going to test something like the above.  Does nothing to the memory use
though.
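
To illustrate why ignoring the constant is so costly (roughly 4 650 equality checks per lookup in the numbers above), here is a minimal standalone sketch.  It is not GCC code: struct key, the chained table, NBUCKETS and all function names are hypothetical stand-ins, and libiberty's htab uses a different collision strategy than the simple chaining assumed here.  The sketch only counts how many equality comparisons are needed to insert keys that share a symbol and differ in their constant offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for (mem (plus (symbol-ref) CONST_INT)):
   one symbol id plus a constant offset.  This is not GCC's rtx.  */
struct key { unsigned long symbol; unsigned long offset; };
struct node { struct key k; struct node *next; };

#define NBUCKETS 4096  /* assumed table size, chosen for the example */

typedef unsigned long (*hash_fn) (const struct key *);

/* Models the unpatched behaviour: the constant is ignored.  */
unsigned long
hash_ignoring_constant (const struct key *k)
{
  return k->symbol;
}

/* Models the patched behaviour: the constant is mixed into the hash.  */
unsigned long
hash_with_constant (const struct key *k)
{
  return k->symbol + k->offset;
}

/* Insert N keys that differ only in their offset and return the number
   of equality comparisons performed while searching the chains.  */
unsigned long
count_comparisons (hash_fn hash, unsigned long n)
{
  struct node **buckets = calloc (NBUCKETS, sizeof *buckets);
  unsigned long comparisons = 0;

  assert (buckets != NULL);
  for (unsigned long i = 0; i < n; i++)
    {
      struct key k = { 1, i };
      struct node **slot = &buckets[hash (&k) % NBUCKETS];
      struct node *p;

      /* Walk the chain, counting one comparison per visited entry.  */
      for (p = *slot; p != NULL; p = p->next)
        {
          comparisons++;
          if (p->k.symbol == k.symbol && p->k.offset == k.offset)
            break;
        }

      /* Not found: prepend a new entry to the chain.  */
      if (p == NULL)
        {
          p = malloc (sizeof *p);
          assert (p != NULL);
          p->k = k;
          p->next = *slot;
          *slot = p;
        }
    }

  /* Release every chain before returning.  */
  for (size_t b = 0; b < NBUCKETS; b++)
    while (buckets[b] != NULL)
      {
        struct node *next = buckets[b]->next;
        free (buckets[b]);
        buckets[b] = next;
      }
  free (buckets);
  return comparisons;
}
```

With 1000 keys, count_comparisons (hash_ignoring_constant, 1000) performs 499500 comparisons (every key lands in the same bucket, so the cost is n(n-1)/2), while count_comparisons (hash_with_constant, 1000) performs none, because each key gets its own bucket in this model.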