https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121439

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2025-08-07
             Status|UNCONFIRMED                 |NEW

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
For machine generated code we suggest to use -O1

For me GCC 15 at -O0 takes ~3GB memory and 35s, the interesing part is -O1,
there I also see combine taking ages.

Note combine is inherently quadratic in particular with large BBs and it
does not have any limiting implemented.  In particular LOG_LINK distance
isn't limited (and we do xyz-inbetween walks) and the retry extra walks
of insns makes it even cubic.

Something like the following tackles the latter, but I suspect for this
testcase it's also the former.  See 2nd hunk below.

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4dbc1f6a4a4..40bb93755fb 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -1239,6 +1239,18 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
        label_tick_ebb_start = label_tick;
       last_bb = this_basic_block;

+      rtx_insn *tem = BB_HEAD (this_basic_block);
+      if (!NONDEBUG_INSN_P (tem))
+       tem = next_nondebug_insn (tem);
+      int bb_first_luid = DF_INSN_LUID (tem);
+      tem = BB_END (this_basic_block);
+      if (!NONDEBUG_INSN_P (tem))
+       tem = prev_nondebug_insn (tem);
+      int bb_last_luid = DF_INSN_LUID (tem);
+      int extra_walk_budget
+       = bb_first_luid < bb_last_luid ? bb_last_luid - bb_first_luid : 0;
+      extra_walk_budget += 256;
+
       rtl_profile_for_bb (this_basic_block);
       for (insn = BB_HEAD (this_basic_block);
           insn != NEXT_INSN (BB_END (this_basic_block));
@@ -1432,6 +1444,17 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
            record_dead_and_set_regs (insn);

 retry:
+         if (next)
+           {
+             if (DF_INSN_LUID (next) < DF_INSN_LUID (insn))
+               {
+                 if (DF_INSN_LUID (insn) - DF_INSN_LUID (next)
+                     < extra_walk_budget)
+                   next = NULL;
+                 else
+                   extra_walk_budget -= DF_INSN_LUID (insn) - DF_INSN_LUID
(next);
+               }
+           }
          ;
        }
     }


diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4dbc1f6a4a4..c9e5c91c973 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -1079,6 +1079,10 @@ create_log_links (void)
                  && asm_noperands (PATTERN (use_insn)) >= 0)
                continue;

+             /* Don't add far away links.  */
+             if (DF_INSN_LUID (use_insn) - DF_INSN_LUID (insn) > 256)
+               continue;
+
              /* Don't add duplicate links between instructions.  */
              struct insn_link *links;
              FOR_EACH_LOG_LINK (links, use_insn)


that get's combine in check but then afer a few minutes LRA blows up
with >25GB memory use so I had to kill it.

Reply via email to