https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101523

--- Comment #53 from Richard Biener <rguenth at gcc dot gnu.org> ---
So just to recap, with reverting the change and instead doing

diff --git a/gcc/combine.cc b/gcc/combine.cc
index a4479f8d836..ff25752cac4 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -4186,6 +4186,10 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1,
rtx_insn *i0,
       adjust_for_new_dest (i3);
     }

+  bool i2_unchanged = false;
+  if (rtx_equal_p (newi2pat, PATTERN (i2)))
+    i2_unchanged = true;
+
   /* We now know that we can do this combination.  Merge the insns and
      update the status of registers and LOG_LINKS.  */

@@ -4752,6 +4756,9 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1,
rtx_insn *i0,
   combine_successes++;
   undo_commit ();

+  if (i2_unchanged)
+    return i3;
+
   rtx_insn *ret = newi2pat ? i2 : i3;
   if (added_links_insn && DF_INSN_LUID (added_links_insn) < DF_INSN_LUID
(ret))
     ret = added_links_insn;

combine time is down from 79s (93%) to 3.5s (37%), quite a bit more than
with the currently installed patch which has combine down to 0.02s (0%).
But notably peak memory use is down from 9GB to 400MB (installed patch 340MB).

That was with a cross from x86_64-linux and a release checking build.

This change should avoid any code generation changes, I do think if the
pattern doesn't change what distribute_notes/links does should be a no-op
even to I2 so we can ignore added_{links,notes}_insn (not ignoring them
only provides a 50% speedup).

I like the 0% combine result of the installed patch but the regressions
observed probably mean this needs to be defered to stage1.

Reply via email to