[Bug rtl-optimization/88751] New: Performance regression reload vs lra

krebbel at gcc dot gnu.org Tue, 08 Jan 2019 00:49:06 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88751


            Bug ID: 88751
           Summary: Performance regression reload vs lra
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

OpenJ9 shows a large performance drop after its build was updated from GCC
4.8.5 to GCC 7.3.0.

- The performance regression disappears after compiling the byte code
interpreter loop with -mno-lra.
https://github.com/eclipse/openj9/blob/master/runtime/vm/BytecodeInterpreter.hpp

- The problem comes from the frequently accessed _pc and _sp variables being
assigned to stack slots instead of registers. With GCC 4.8 both variables end
up in hard regs.

- The problem can be seen on x86 as well as on S/390.

- In LRA the root cause of the problem is a threshold which prevents LRA from
running the full register coloring step (ira.c):

  /* If there are too many pseudos and/or basic blocks (e.g. 10K
     pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
     use simplified and faster algorithms in LRA.  */
  lra_simple_p
    = (ira_use_lra_p
       && max_reg_num () >= (1 << 26) / last_basic_block_for_fn (cfun));

  For the huge run() function in the byte code interpreter the numbers are:

  (gdb) p max_reg_num()
  $6 = 27089
  (gdb) p last_basic_block_for_fn(cfun)
  $7 = 4799

  Forcing GCC to run the full coloring pass gets the _pc and _sp variables
assigned to hard regs again.


As a quick workaround we might want to turn this threshold into a parameter.

Long term, it would be good to either make the heuristic estimate whether
full coloring would be beneficial, or improve the fallback coloring to
cover such important cases.
