On 5/9/25 2:27 PM, Vineet Gupta wrote:
This showed up when debugging the testcase for PR119164.
RISC-V FRM mode-switching state machine has special handling for transitions
to and from a call_insn as FRM needs to saved/restored around calls (any
call is considered potentially FRM clobbering). Consider the following
canonical example where insns 2, 4, 6 come are due to user code, while
the rest of frm save/restore insns 1, 3, 5, 7 need to be generated for the
ABI semantics.
test_float_point_frm_static:
1: frrm a5 <--
2: fsrmi 2
3: fsrm a5 <--
4: call normalize_vl
5: frrm a5 <--
6: fsrmi 3
7: fsrm a5 <--
Not the focus of this series, but isn't the frrm@5 unnecessary? If
there is no use of FRM after a call, but before the next set of FRM,
then the restore isn't needed. This is a pretty standard optimization
in caller-saves style code generation. Though maybe it's really an
artifact of not showing the uses of FRM in your little example?
Current implementation of RISC-V TARGET_MODE_NEEDED has special handling
if the call_insn is last insn of BB, to ensure FRM save/reads are emitted
on all the edges. However it doesn't work as intended and is borderline
bogus for following reasons:
- It fails to detect call_insn as last of BB (PR119164 test) if the
next BB starts with a code label (say due to call being conditional).
Granted this is a deficiency of API next_nonnote_nondebug_insn_bb ()
which incorrectly returns next BB code_label as opposed to returning
NULL (and this behavior is kind of relied upon by much of gcc).
This causes missed/delayed state transition to DYN.
- If code is tightened to actually detect above such as:
- rtx_insn *insn = next_nonnote_nondebug_insn_bb (cur_insn);
- if (!insn)
+ if (BB_END (BLOCK_FOR_INSN (cur_insn)) == cur_insn)
I'm guessing the concern here was CUR_INSN could be a call and the last
real insn in the block, but there could be notes and debug statements
between the call and the actual end of the block?
Would it be safer as:
rtx_insn *insn = next_nonnote_nondebug_insn_bb (cur_insn);
if (!insn || BLOCK_FOR_INSN (cur_insn) != BLOCK_FOR_INSN (insn)
But it's all academic since this code really just needs to go away.
edge insertion happens but ends up splitting the BB which generic
mode-sw doesn't expect and ends up hittng an ICE.
- TARGET_MODE_NEEDED hook typically don't modify the CFG.
Yea, we probably don't want TARGET_MODE_NEEDED to be modifying the CFG.
- For abnormal edges, insert_insn_end_basic_block () is called, which
by design on encountering call_insn as last in BB, inserts new insn
BEFORE the call, not after.
Right. In theory we're not supposed to be asking for insertion on
abnormals anyway. It would be interesting to know if that's still
happening or if it's legacy from the late 90s when we first started
lighting up LCM algorithms and made all the common mistakes WRT abnormals.
So this is just all wrong and ripe for removal. Moreover there seems to
be no testsuite coverage for this code path at all. Results don't change
at all if this is removed.
The total number of FRM read/writes emitted (static count) across all
benchmarks of a SPEC2017 -Ofast -march=rv64gcv build decrease slightly
so its a net win even if minimal but the real gain is reduced complexity
and maintenance.
Before Patch
---------------- ---------------
frrm fsrmi fsrm frrm fsrmi frrm
perlbench_r 42 0 4 42 0 4
cpugcc_r 167 0 17 167 0 17
bwaves_r 16 0 1 16 0 1
mcf_r 11 0 0 11 0 0
cactusBSSN_r 79 0 27 76 0 27
namd_r 119 0 63 119 0 63
parest_r 218 0 114 168 0 114 <--
povray_r 123 1 17 123 1 17
lbm_r 6 0 0 6 0 0
omnetpp_r 17 0 1 17 0 1
wrf_r 2287 13 1956 2287 13 1956
cpuxalan_r 17 0 1 17 0 1
ldecod_r 11 0 0 11 0 0
x264_r 14 0 1 14 0 1
blender_r 724 12 182 724 12 182
cam4_r 324 13 169 324 13 169
deepsjeng_r 11 0 0 11 0 0
imagick_r 265 16 34 265 16 34
leela_r 12 0 0 12 0 0
nab_r 13 0 1 13 0 1
exchange2_r 16 0 1 16 0 1
fotonik3d_r 20 0 11 20 0 11
roms_r 33 0 23 33 0 23
xz_r 6 0 0 6 0 0
---------------- ---------------
4551 55 2623 4498 55 2623
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_frm_emit_after_bb_end): Delete.
(riscv_frm_mode_needed): Remove call riscv_frm_emit_after_bb_end.
OK once all the pre/post requisites are installed.
jeff