When cse_insn prunes src{,_folded,_eqv_here,_related} with the equivalence set in the *_same_value chain it also searches for an equivalence to the destination of the instruction with
/* This is the same as the destination of the insns, we want to prefer it. Copy it to src_related. The code below will then give it a negative cost. */ if (GET_CODE (dest) == code && rtx_equal_p (p->exp, dest)) src_related = p->exp; this picks up the last such equivalence and in particular any later duplicate will be pruned by the preceeding else if (src_related && GET_CODE (src_related) == code && rtx_equal_p (src_related, p->exp)) src_related = 0; first. This wastes cycles doing extra rtx_equal_p checks. The following instead searches for the first destination equivalence separately in this loop and delays using src_related for it until we are about to process that, avoiding another redundant rtx_equal_p check. I've came here because of a testcase with very large equivalence lists and compile-time of cse_insn. The patch below doesn't speed it up significantly since there's no equivalence on the destination. In theory this opens the possibility to track dest_related separately, avoiding the implicit pruning of any previous value in src_related. As is the change should be a no-op for code generation. Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1. * cse.cc (cse_insn): Track an equivalence to the destination separately and delay using src_related for it. --- gcc/cse.cc | 51 +++++++++++++++++++++++++++------------------------ 1 file changed, 27 insertions(+), 24 deletions(-) diff --git a/gcc/cse.cc b/gcc/cse.cc index 8fbda4ecc86..543cb1fe36f 100644 --- a/gcc/cse.cc +++ b/gcc/cse.cc @@ -4614,6 +4614,7 @@ cse_insn (rtx_insn *insn) rtx src_eqv_here; rtx src_const = 0; rtx src_related = 0; + rtx dest_related = 0; bool src_related_is_const_anchor = false; struct table_elt *src_const_elt = 0; int src_cost = MAX_COST; @@ -5085,10 +5086,11 @@ cse_insn (rtx_insn *insn) src_related = 0; /* This is the same as the destination of the insns, we want - to prefer it. Copy it to src_related. The code below will - then give it a negative cost. */ - if (GET_CODE (dest) == code && rtx_equal_p (p->exp, dest)) - src_related = p->exp; + to prefer it. The code below will then give it a negative + cost. */ + if (!dest_related + && GET_CODE (dest) == code && rtx_equal_p (p->exp, dest)) + dest_related = p->exp; } /* Find the cheapest valid equivalent, trying all the available @@ -5130,27 +5132,28 @@ cse_insn (rtx_insn *insn) } } - if (src_related) + if (dest_related) { - if (rtx_equal_p (src_related, dest)) - src_related_cost = src_related_regcost = -1; - else - { - src_related_cost = COST (src_related, mode); - src_related_regcost = approx_reg_cost (src_related); - - /* If a const-anchor is used to synthesize a constant that - normally requires multiple instructions then slightly prefer - it over the original sequence. These instructions are likely - to become redundant now. We can't compare against the cost - of src_eqv_here because, on MIPS for example, multi-insn - constants have zero cost; they are assumed to be hoisted from - loops. */ - if (src_related_is_const_anchor - && src_related_cost == src_cost - && src_eqv_here) - src_related_cost--; - } + src_related_cost = src_related_regcost = -1; + /* Handle it as src_related. */ + src_related = dest_related; + } + else if (src_related) + { + src_related_cost = COST (src_related, mode); + src_related_regcost = approx_reg_cost (src_related); + + /* If a const-anchor is used to synthesize a constant that + normally requires multiple instructions then slightly prefer + it over the original sequence. These instructions are likely + to become redundant now. We can't compare against the cost + of src_eqv_here because, on MIPS for example, multi-insn + constants have zero cost; they are assumed to be hoisted from + loops. */ + if (src_related_is_const_anchor + && src_related_cost == src_cost + && src_eqv_here) + src_related_cost--; } /* If this was an indirect jump insn, a known label will really be -- 2.35.3