The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119285

The patch was successfully bootstrapped and tested on x86_64 and aarch64.

I've checked the patch on SPEC2017 lbm_s on Zen4 and i5-13600k and no longer see any performance or code size change.

I also ran the whole of SPEC2017 with GCC both without and with the patches for PR114991 and PR119285, and did not find any visible performance change on the two x86_64 machines.

commit 8e0e17677afc1a93aa31b8b83849848b7bb52b9b
Author: Vladimir N. Makarov <vmaka...@redhat.com>
Date:   Mon Mar 17 15:21:46 2025 -0400

    [PR119285][IRA]: Use an additional way of checking reg equiv invariant substitution correctness
    
    The patch for PR114991 resulted in a 5% decrease in SPEC2017 lbm
    performance on Zen2 and Zen4.  For one RTL insn of lbm, LRA with the
    PR114991 patch cannot confirm that the equivalence insertion will
    create a valid RTL insn.  As a result, the pseudo equiv was assumed
    costly and the pseudo was assigned to a hard reg (caller-saved, as
    the pseudo lives through calls), and some other pseudos did not get
    hard regs as they did before the PR114991 patch.  The insn in
    question is `pseudo1 = pseudo2 + pseudo3` where pseudo2 has equiv
    `hard_reg + const`.  The old code recognized the insn after equiv
    substitution as LEA.  The new code failed.  This patch uses two
    ways of checking equiv substitution correctness: the old one and
    the new one (the latter mostly for memory addresses, where the old
    code fails to confirm the substitution correctness).  So the patch
    fixes the lbm performance degradation and makes GCC generate the
    same code as it did before the PR114991 patch.
    
    gcc/ChangeLog:
    
            PR rtl-optimization/119285
            * ira-costs.cc (equiv_can_be_consumed_p): Use 2 ways for
            recognizing a valid insn after equiv insertion.

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index b568c7d0326..70cba942a7b 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1794,29 +1794,28 @@ validate_autoinc_and_mem_addr_p (rtx x)
 static bool
 equiv_can_be_consumed_p (int regno, rtx to, rtx_insn *insn, bool invariant_p)
 {
-  if (invariant_p)
+  validate_replace_src_group (regno_reg_rtx[regno], to, insn);
+  /* We can change register to equivalent memory in autoinc rtl.  Some code
+     including verify_changes assumes that autoinc contains only a register.
+     So check this first.  */
+  bool res = validate_autoinc_and_mem_addr_p (PATTERN (insn));
+  if (res)
+    res = verify_changes (0);
+  cancel_changes (0);
+  if (!res && invariant_p)
     {
-      /* We use more expensive code for the invariant because we need to
+      /* Here we use more expensive code for the invariant because we need to
 	 simplify the result insn as the invariant can be arithmetic rtx
-	 inserted into another arithmetic rtx.  */
+	 inserted into another arithmetic rtx, e.g. into memory address.  */
       rtx pat = PATTERN (insn);
       int code = INSN_CODE (insn);
       PATTERN (insn) = copy_rtx (pat);
       PATTERN (insn)
 	= simplify_replace_rtx (PATTERN (insn), regno_reg_rtx[regno], to);
-      bool res = !insn_invalid_p (insn, false);
+      res = !insn_invalid_p (insn, false);
       PATTERN (insn) = pat;
       INSN_CODE (insn) = code;
-      return res;
     }
-  validate_replace_src_group (regno_reg_rtx[regno], to, insn);
-  /* We can change register to equivalent memory in autoinc rtl.  Some code
-     including verify_changes assumes that autoinc contains only a register.
-     So check this first.  */
-  bool res = validate_autoinc_and_mem_addr_p (PATTERN (insn));
-  if (res)
-    res = verify_changes (0);
-  cancel_changes (0);
   return res;
 }
 

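The control flow that the patch gives equiv_can_be_consumed_p can be sketched outside of GCC as follows.  This is only an illustration, not GCC code: the name equiv_can_be_consumed_sketch and the two callbacks are hypothetical stand-ins for the direct check (validate_replace_src_group plus verify_changes) and for the more expensive simplifying check (simplify_replace_rtx plus insn_invalid_p).

```cpp
#include <functional>

// Hypothetical stand-in for "is the insn still valid after the
// substitution?".  In GCC the real checks operate on RTL insns.
using EquivCheck = std::function<bool ()>;

// Sketch of the patched control flow: try the direct replacement check
// first; only if it fails and the equivalence is an invariant, fall
// back to the more expensive check that simplifies the result.
bool
equiv_can_be_consumed_sketch (const EquivCheck &direct_check,
                              const EquivCheck &simplifying_check,
                              bool invariant_p)
{
  bool res = direct_check ();
  if (!res && invariant_p)
    res = simplifying_check ();
  return res;
}
```

In this sketch an invariant equivalence is accepted when either check succeeds, while a non-invariant equivalence depends on the direct check alone, which matches the `if (!res && invariant_p)` condition in the diff above.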