https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729

--- Comment #22 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Vineet Gupta <vine...@gcc.gnu.org>:

https://gcc.gnu.org/g:7bef3482f27ce13ba7e6c4f43943f28a49e63a40

commit r15-5925-g7bef3482f27ce13ba7e6c4f43943f28a49e63a40
Author: Vineet Gupta <vine...@rivosinc.com>
Date:   Wed Dec 4 10:42:37 2024 -0800

    sched1: parameterize pressure scheduling spilling aggressiveness [PR/114729]

    sched1 computes ECC (Excess Change Cost) for each insn, which represents
    the register pressure attributed to the insn.
    Currently the pressure-sensitive scheduling algorithm deliberately ignores
    negative ECC values (pressure reduction), treating them as 0 (neutral),
    which leads to more spills. This stems from the assumption that the
    compiler has a reasonably accurate processor pipeline scheduling model
    and thus tries to aggressively fill pipeline bubbles with spill slots.

    This, however, might not be true: the model might not be available for
    certain uarches, or might not even be applicable, especially for modern
    out-of-order cores.

    The existing heuristic induces a spill frenzy on RISC-V, noticeably so on
    SPEC2017 507.Cactu. If insn scheduling is disabled completely
    (w/ -fno-schedule-insns), the total dynamic icount for this workload is
    roughly halved, from ~2.5 trillion insns to ~1.3 trillion.

    This patch adds --param=cycle-accurate-model={0,1} to gate the spill
    behavior.

     - The default (1) preserves existing spill behavior.

     - Targets/uarches sensitive to spilling can override the param to (0)
       so that negative ECC values are honored. The RISC-V backend does so.
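
    The gating described above can be pictured with a minimal sketch. This
    is not the code in haifa-sched.cc: adjust_ecc and its
    cycle_accurate_model argument are hypothetical names used only for
    illustration, and the sketch assumes the param simply decides whether a
    negative ECC is raised to 0.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the behavior gated by
// --param=cycle-accurate-model; not GCC's actual function or variable.
//   cycle_accurate_model == 1: negative ECC (pressure reduction) becomes 0.
//   cycle_accurate_model == 0: negative ECC is passed through unchanged.
constexpr int adjust_ecc(int base_ecc, int cycle_accurate_model)
{
  if (cycle_accurate_model)
    return std::max(base_ecc, 0);
  return base_ecc;
}

// Small self-check covering both settings of the param.
inline void adjust_ecc_examples()
{
  assert(adjust_ecc(-3, 1) == 0);   // default: reduction is ignored
  assert(adjust_ecc(-3, 0) == -3);  // param 0: reduction is credited
  assert(adjust_ecc(5, 1) == 5);    // pressure increases are unaffected
  assert(adjust_ecc(5, 0) == 5);
}
```

    Under this assumption, only insns that reduce pressure are treated
    differently by the two settings; insns that increase pressure get the
    same value either way.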

    The actual perf numbers are very promising.

    (1) On RISC-V BPI-F3 in-order CPU, -Ofast -march=rv64gcv_zba_zbb_zbs:

      Before:
      ------
      Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

          4,917,712.97 msec task-clock:u                     #    1.000 CPUs utilized
                 5,314      context-switches:u               #    1.081 /sec
                     3      cpu-migrations:u                 #    0.001 /sec
               204,784      page-faults:u                    #   41.642 /sec
     7,868,291,222,513      cycles:u                         #    1.600 GHz
     2,615,069,866,153      instructions:u                   #    0.33  insn per cycle
        10,799,381,890      branches:u                       #    2.196 M/sec
            15,714,572      branch-misses:u                  #    0.15% of all branches

      After:
      -----
      Performance counter stats for './cactusBSSN_r_base.rivos spec_ref.par':

          4,552,979.58 msec task-clock:u                     #    0.998 CPUs utilized
               205,020      context-switches:u               #   45.030 /sec
                     2      cpu-migrations:u                 #    0.000 /sec
               204,221      page-faults:u                    #   44.854 /sec
     7,285,176,204,764      cycles:u        (7.4% faster)    #    1.600 GHz
     2,145,284,345,397      instructions:u (17.96% fewer)    #    0.29  insn per cycle
        10,799,382,011      branches:u                       #    2.372 M/sec
            16,235,628      branch-misses:u                  #    0.15% of all branches

    (2) Wilco reported 20% perf gains on aarch64 Neoverse V2 runs.
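
    The percentages quoted for the BPI-F3 run in (1) can be re-derived from
    the raw counters. The helpers below are illustrative only
    (percent_reduction and insns_per_cycle are names chosen for this
    sketch); the constants are copied from the perf output above.

```cpp
#include <cassert>
#include <cmath>

// Relative reduction, in percent, going from `before` to `after`.
inline double percent_reduction(double before, double after)
{
  return (before - after) / before * 100.0;
}

// Instructions retired per cycle.
inline double insns_per_cycle(double insns, double cycles)
{
  return insns / cycles;
}

// Counter values from the perf stat output; all are below 2^53, so they
// are represented exactly as doubles.
constexpr double cycles_before = 7868291222513.0;
constexpr double cycles_after  = 7285176204764.0;
constexpr double insns_before  = 2615069866153.0;
constexpr double insns_after   = 2145284345397.0;

// Checks the figures annotated in the "After" block.
inline void check_perf_figures()
{
  assert(std::fabs(percent_reduction(cycles_before, cycles_after) - 7.4) < 0.05);
  assert(std::fabs(percent_reduction(insns_before, insns_after) - 17.96) < 0.01);
  assert(std::fabs(insns_per_cycle(insns_before, cycles_before) - 0.33) < 0.005);
  assert(std::fabs(insns_per_cycle(insns_after, cycles_after) - 0.29) < 0.005);
}
```

    Note that insn per cycle drops from 0.33 to 0.29 even though the run
    gets faster: the instruction count falls by more than the cycle count.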

    gcc/ChangeLog:
            PR target/114729
            * params.opt (--param=cycle-accurate-model=): New opt.
            * doc/invoke.texi (cycle-accurate-model): Document.
            * haifa-sched.cc (model_excess_group_cost): Return negative
            delta if param_cycle_accurate_model is 0.
            (model_excess_cost): Ceil negative baseECC to 0 only if
            param_cycle_accurate_model is 1.
            Dump the actual ECC value.
            * config/riscv/riscv.cc (riscv_option_override): Set param
            to 0.

    gcc/testsuite/ChangeLog:
            PR target/114729
            * gcc.target/riscv/riscv.exp: Enable new tests to build.
            * gcc.target/riscv/sched1-spills/spill1.cpp: Add new test.

    Signed-off-by: Vineet Gupta <vine...@rivosinc.com>
