So this is where my FRM excursions really began, trying to elide the extraneous FRM save/restores when none was needed.
After a call_insn, mode needs to be switched from DYN_CALL to DYN. Failing to do so could defer the DYN_CALL to DYN, creating interim transitions leading to extraneous FRM save/retore. The current back checking of call_insn was too coarse-grained and flawed just like flawed forward check which was removed earlier in the series. The API prev_nonnote_nondebug_insn_bb () implies current insn and the call_insn to be in same BB which need not always be true. The problem is not with the API, but the use thereof. Fix this by tracking call_insn more explicitly in TARGET_MODE_NEEDED. - On seeing a call_insn, make a note. - On subsequent insns, if note seen, do the state switch and clear the note. The number of FRM read/writes across SPEC2017 -Ofast -mrv64gcv improves as well. Before After ------------- --------------- frrm fsrmi fsrm frrm fsrmi frrm perlbench_r 17 0 1 17 0 1 cpugcc_r 11 0 0 11 0 0 bwaves_r 16 0 1 16 0 1 mcf_r 11 0 0 11 0 0 cactusBSSN_r 19 0 1 19 0 1 namd_r 14 0 1 14 0 1 parest_r 24 0 1 24 0 1 povray_r 26 1 6 26 1 6 lbm_r 6 0 0 6 0 0 omnetpp_r 17 0 1 17 0 1 wrf_r 1268 13 1603 613 13 82 cpuxalan_r 17 0 1 17 0 1 ldecod_r 11 0 0 11 0 0 x264_r 11 0 0 11 0 0 blender_r 61 12 42 39 12 16 cam4_r 45 13 20 40 13 17 deepsjeng_r 11 0 0 11 0 0 imagick_r 132 16 25 33 16 18 leela_r 12 0 0 12 0 0 nab_r 13 0 1 13 0 1 exchange2_r 16 0 1 16 0 1 fotonik3d_r 19 0 1 19 0 1 roms_r 21 0 1 21 0 1 xz_r 6 0 0 6 0 0 - ----------------- -------------- 1804 55 1707 1023 55 150 This fixes PR119164 (and also PR119832 w/o need for TARGET_MODE_CONFLUENCE). It also fixes PR120203 where current codegen seemed wrong for float-point-dynamic-frm-74.c It was missing a FRM save after 2nd call normalize_vl_2 (). | frrm a5 | fsrmi 1 | | vfadd.vv v1,v8,v9 | fsrm a5 | beq a1,zero,.L2 | | call normalize_vl_1 | frrm a5 | | .L3: | fsrmi 3 | vfadd.vv v8,v8,v9 | fsrm a5 | jr ra | | .L2: | call normalize_vl_2 | frrm a5 <-- missing | j .L3 PR target/119164 PR target/120203 gcc/ChangeLog: * config/riscv/riscv.cc (CFUN_IN_CALL): New macro. (struct mode_switching_info): Add new field. (riscv_frm_adjust_mode_after_call): Remove. (riscv_frm_mode_needed): Track call_insn. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Bump expected FRRM by 1. * gcc.target/riscv/rvv/base/pr119164.c: New test. Signed-off-by: Vineet Gupta <vine...@rivosinc.com> --- gcc/config/riscv/riscv.cc | 42 +++++++------------ .../rvv/base/float-point-dynamic-frm-74.c | 2 +- .../gcc.target/riscv/rvv/base/pr119164.c | 22 ++++++++++ 3 files changed, 39 insertions(+), 27 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 37f3ace49a8b..cd959241e023 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -107,6 +107,8 @@ along with GCC; see the file COPYING3. If not see /* True the mode switching has static frm, or false. */ #define STATIC_FRM_P(c) ((c)->machine->mode_sw_info.static_frm_p) +#define CFUN_IN_CALL(c) ((c)->machine->mode_sw_info.cfun_call) + /* True if we can use the instructions in the XTheadInt extension to handle interrupts, or false. */ #define TH_INT_INTERRUPT(c) \ @@ -176,10 +178,13 @@ struct GTY(()) mode_switching_info { mode instruction in the function or not. */ bool static_frm_p; + bool cfun_call; + mode_switching_info () { dynamic_frm = NULL_RTX; static_frm_p = false; + cfun_call = false; } }; @@ -12148,20 +12153,6 @@ riscv_emit_mode_set (int entity, int mode, int prev_mode, } } -/* Adjust the FRM_NONE insn after a call to FRM_DYN for the - underlying emit. */ - -static int -riscv_frm_adjust_mode_after_call (rtx_insn *cur_insn, int mode) -{ - rtx_insn *insn = prev_nonnote_nondebug_insn_bb (cur_insn); - - if (insn && CALL_P (insn)) - return riscv_vector::FRM_DYN; - - return mode; -} - /* Return mode that frm must be switched into prior to the execution of insn. */ @@ -12173,26 +12164,25 @@ riscv_frm_mode_needed (rtx_insn *cur_insn, int code) /* The dynamic frm will be initialized only onece during cfun. */ DYNAMIC_FRM_RTL (cfun) = gen_reg_rtx (SImode); emit_insn_at_entry (gen_frrmsi (DYNAMIC_FRM_RTL (cfun))); + CFUN_IN_CALL (cfun) = false; } if (CALL_P (cur_insn)) + { + CFUN_IN_CALL (cfun) = true; return riscv_vector::FRM_DYN_CALL; + } int mode = code >= 0 ? get_attr_frm_mode (cur_insn) : riscv_vector::FRM_NONE; if (mode == riscv_vector::FRM_NONE) - /* After meet a call, we need to backup the frm because it may be - updated during the call. Here, for each insn, we will check if - the previous insn is a call or not. When previous insn is call, - there will be 2 cases for the emit mode set. - - 1. Current insn is not MODE_NONE, then the mode switch framework - will do the mode switch from MODE_CALL to MODE_NONE natively. - 2. Current insn is MODE_NONE, we need to adjust the MODE_NONE to - the MODE_DYN, and leave the mode switch itself to perform - the emit mode set. - */ - mode = riscv_frm_adjust_mode_after_call (cur_insn, mode); + { + if (CFUN_IN_CALL (cfun)) + { + CFUN_IN_CALL (cfun) = false; + return riscv_vector::FRM_DYN; + } + } else if (riscv_static_frm_mode_p (mode)) STATIC_FRM_P (cfun) = true; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c index c8a580038ec9..15284828044d 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c @@ -33,7 +33,7 @@ test_float_point_dynamic_frm (vfloat32m1_t op1, vfloat32m1_t op2, } /* { dg-final { scan-assembler-times {vfadd\.v[vf]\s+v[0-9]+,\s*v[0-9]+,\s*[fav]+[0-9]+} 2 } } */ -/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {frrm\s+[axs][0-9]+} 3 } } */ /* { dg-final { scan-assembler-times {fsrm\s+[axs][0-9]+} 2 } } */ /* { dg-final { scan-assembler-times {fsrmi\s+[01234]} 2 } } */ /* { dg-final { scan-assembler-not {fsrmi\s+[axs][0-9]+,\s*[01234]} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c new file mode 100644 index 000000000000..a39a7f177f05 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr119164.c @@ -0,0 +1,22 @@ +/* Reduced from SPEC2017 blender: node_texture_util.c. + The conditional function call was tripping mode switching state machine */ + +/* { dg-do compile } */ +/* { dg-options " -Ofast -march=rv64gcv_zvl256b -ftree-vectorize -mrvv-vector-bits=zvl" } */ + +void *a; +float *b; +short c; +void d(); +void e() { + if (a) + d(); + if (c) { + b[0] = b[0] * 0.5f + 0.5f; + b[1] = b[1] * 0.5f + 0.5f; + } +} + +/* { dg-final { scan-assembler-not {frrm\s+[axs][0-9]+} } } */ +/* { dg-final { scan-assembler-not {fsrmi\s+[01234]} } } */ +/* { dg-final { scan-assembler-not {fsrm\s+[axs][0-9]+} } } */ -- 2.43.0