For the testcase in PR114855 at -O1, add_store_equivs shows up as the main sink for bitmap_set_bit because it uses a bitmap to mark all seen insns by UID, to make sure the forward walk in memref_used_between_p will find the insn in question.  Given we do have a CFG here the function's operation is questionable, and since memref_used_between_p together with the walk of all insns is obviously quadratic in the worst case, the whole thing should be re-done ... but, for the testcase, using a sbitmap of size get_max_uid () + 1 gets bitmap_set_bit off the profile and improves IRA time from 15.58s (8%) to 3.46s (2%).
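
For reference, the shape of the code in question is roughly the following (a simplified sketch, not the literal ira.cc sources; init_insn and dest are placeholder names for the equivalencing insn and the store destination, and the candidate detection is elided):

  /* Walk all insns, marking each one seen by UID.  For each candidate
     store we then re-walk the insns between the REG's initializing
     insn and the store via memref_used_between_p, which is where the
     worst-case quadratic behavior comes from -- the bitmap only
     guarantees init_insn was seen before insn, it does nothing for
     the cost of that walk.  */
  for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
    {
      bitmap_set_bit (seen_insns, INSN_UID (insn));
      /* ... identify a candidate (init_insn, store insn) pair ... */
      if (bitmap_bit_p (seen_insns, INSN_UID (init_insn))
          && !memref_used_between_p (dest, init_insn, insn))
        /* ... record the REG_EQUIV ... */;
    }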
Now, given the above quadraticness I wonder whether we should instead gate add_store_equivs on optimize > 1 or flag_expensive_optimizations.

Jeff, you added the bitmap in r6-7529-g14d7d4be52585b.  I have no idea how get_insns () works at this point and in which CFG mode we are, but a simplification might be to simply verify both insns are in the same BB and hope that get_insns () walks the insns in order there, so we could elide the bitmap completely (with some loss of cases, but the function comment suggests it is supposed to catch single-BB cases only anyway?!).  A rough, untested sketch of that idea is appended after the patch.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

OK if that succeeds?

Thanks,
Richard.

	PR rtl-optimization/114855
	* ira.cc (add_store_equivs): Use sbitmap for tracking
	visited insns.
---
 gcc/ira.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ira.cc b/gcc/ira.cc
index 156541df4e6..3936456c4ed 100644
--- a/gcc/ira.cc
+++ b/gcc/ira.cc
@@ -3838,7 +3838,8 @@ update_equiv_regs (void)
 static void
 add_store_equivs (void)
 {
-  auto_bitmap seen_insns;
+  auto_sbitmap seen_insns (get_max_uid () + 1);
+  bitmap_clear (seen_insns);
 
   for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
     {
-- 
2.43.0
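
P.S.  To make the suggested simplification concrete, something along the following lines is what I have in mind (an untested sketch only, re-using the placeholder names from above; it assumes BLOCK_FOR_INSN is valid at this point and that get_insns () walks the insns of a BB in order, so init_insn is reached before insn -- which is exactly the part I am not sure about):

  for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn))
    {
      /* ... identify a candidate (init_insn, store insn) pair ... */
      /* Same-BB check instead of the seen_insns bitmap.  */
      if (BLOCK_FOR_INSN (init_insn) == BLOCK_FOR_INSN (insn)
          && !memref_used_between_p (dest, init_insn, insn))
        /* ... record the REG_EQUIV ... */;
    }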