Hi!

With the --param=min-nondebug-insn-uid= parameter one can force really
large INSN_UIDs even in moderately large functions.
Obviously the compiler will break badly if INSN_UID (which is a host int)
wraps around, but dfa_insn_code_enlarge actually breaks already
when INSN_UID reaches INT_MAX / 2 + 1, because it tries to resize the
vector to twice the max INSN_UID and overflows (UB on the compiler side,
followed by an attempt to allocate 0xfffffffe00000001-ish bytes).
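
For illustration only, the arithmetic above can be sketched in plain C.
The helper names doubling_fits and wrapped_request_bytes are hypothetical
and not part of GCC; the sketch assumes a 32-bit host int, a 64-bit
size_t and sizeof (int) == 4, and it models the wrapped value with wider
types instead of actually triggering the overflow.

```c
#include <stdint.h>

/* Hypothetical helper: nonzero if 2 * uid still fits in a 32-bit int.  */
static int
doubling_fits (int32_t uid)
{
  return 2 * (int64_t) uid <= INT32_MAX;
}

/* Hypothetical helper: the byte count a 64-bit host would end up
   requesting if the doubled length wrapped around to a negative 32-bit
   value and was then scaled by sizeof (int) == 4 as a size_t.  */
static uint64_t
wrapped_request_bytes (int32_t uid)
{
  int64_t wide = 2 * (int64_t) uid;
  /* Model two's complement wraparound without invoking UB.  */
  int64_t wrapped = wide > INT32_MAX ? wide - (INT64_C (1) << 32) : wide;
  return (uint64_t) wrapped * 4;
}
```

Under those assumptions doubling_fits (INT32_MAX / 2) is 1,
doubling_fits (INT32_MAX / 2 + 1) is 0, and
wrapped_request_bytes (INT32_MAX / 2 + 1) is 0xfffffffe00000000, which is
in line with the "-ish" figure quoted above.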

The following patch fixes it by avoiding the overflow and capping the
vector length at INT_MAX (that means even the maximum INSN_UID of INT_MAX
will not work properly, but just reaching that is highly undesirable
anyway).  Alternatively dfa_insn_codes_length could be e.g. HOST_WIDE_INT
and we could just use HOST_WIDE_INT_C (2) * uid, but then we'd allocate
parts of the vector for something that we really won't use anyway.
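
As a rough standalone sketch of what MIN (INT_MAX, 2U * uid) evaluates
to, assuming uid is non-negative and unsigned int is wide enough to hold
2 * INT_MAX (capped_double is a hypothetical name, not something the
patch adds):

```c
#include <limits.h>

/* Hypothetical stand-in for the generated statement
   length = MIN (INT_MAX, 2U * uid).  Assumes uid >= 0.  */
static int
capped_double (int uid)
{
  /* Doubling in unsigned arithmetic cannot trigger signed overflow.  */
  unsigned int doubled = 2U * (unsigned int) uid;
  return doubled < (unsigned int) INT_MAX ? (int) doubled : INT_MAX;
}
```

With this, capped_double (INT_MAX / 2) is INT_MAX - 1 and any larger uid
yields INT_MAX.  For uid == INT_MAX the resulting length equals uid, so
index uid is still out of range, which is the limitation mentioned
above.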

Of course, it is a really bad idea to use
--param=min-nondebug-insn-uid=1073741824
or even huge numbers several times smaller than that, because the compiler
often allocates arrays sized by the maximum INSN_UID, so e.g. compiling an
empty function with a cross compiler to mips64-linux and that argument
needed around 38GiB of RAM.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2026-03-11  Jakub Jelinek  <[email protected]>

        PR target/124436
        * genautomata.cc (output_dfa_insn_code_func): Use
        MIN (INT_MAX, 2U * uid) instead of 2 * uid in dfa_insn_code_enlarge.

--- gcc/genautomata.cc.jj       2026-01-02 09:56:10.173336414 +0100
+++ gcc/genautomata.cc  2026-03-10 21:16:53.837972428 +0100
@@ -8143,7 +8143,7 @@ static void\n\
 dfa_insn_code_enlarge (int uid)\n\
 {\n\
   int i = %s;\n\
-  %s = 2 * uid;\n\
+  %s = MIN (INT_MAX, 2U * uid);\n\
   %s = XRESIZEVEC (int, %s,\n\
                  %s);\n\
   for (; i < %s; i++)\n\

        Jakub
