https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120245
Bug ID: 120245
Summary: RISC-V: Avoid FRM read/writes in non FRM related code paths
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: vineetg at gcc dot gnu.org
Reporter: vineetg at gcc dot gnu.org
CC: jeffreyalaw at gmail dot com, pan2.li at intel dot com, rdapp at gcc dot gnu.org
Target Milestone: ---

For a function which needs FRM save/restore (say due to a static rounding mode setting), the RISC-V FRM mode-switching backend generates FRM reads/writes in all the code paths, even those which don't deal with RM at all.

e.g. gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-ceil-0.c

-O3 -march=rv64gcv_zvfh -mabi=lp64d -ffast-math -ftree-vectorize

    void
    test__Float16___builtin_ceilf16 (_Float16 *out, _Float16 *in, unsigned count)
    {
      for (unsigned i = 0; i < count; i++)
        out[i] = __builtin_ceilf16 (in[i]);
    }

    test__Float16___builtin_ceilf16:
    .LFB2:
            frrm    a3                      # backup RM (backup reg)
            beq     a2,zero,.L14
            bleu    a5,a4,.L3
            fsrmi   3                       # static RM
    .L4:
            vfcvt.x.f.v     v3,v2,v0.t      # Uses static RM
            vfcvt.f.x.v     v1,v3,v0.t      # Uses static RM
            vfsgnj.vv       v1,v1,v2
            bne     a2,zero,.L4
            fsrm    a3                      # Restore OK
            ld      ra,24(sp)
            ld      s0,16(sp)
            ld      s1,8(sp)
            addi    sp,sp,32
            jr      ra
    .L3:
    .L6:
            call    ceilf16
            frrm    a3
            bne     s0,s2,.L6
            fsrm    a3                      # Restore needless
            jr      ra
    .L14:
            fsrm    a3                      # Restore needless
            ret

The FRM state machine keeps a "sticky" note of static mode ever having been set and, if so, it generates the save/restore around function calls even if that code path doesn't need it.

FWIW the sticky static mode is needed for different reasons (see gcc/testsuite/gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c).

Anyhow, the idea is to delay the initial FRM save to right before the actual static RM setting (which makes the backup reg live only on the edges that need it) and, at the time of restore, check whether the edge chosen for insertion has the backup reg live; only then insert the restore.
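
To illustrate the proposed placement rule, here is a minimal standalone sketch in C. It is not GCC code and does not use any GCC internals: the block numbering, the edge list, and the names `sets_static_rm`, `is_exit` and `place_frm_restores` are all hypothetical, chosen to mirror the control flow of the assembly above. It assumes the backup `frrm` is emitted right before the `fsrmi`, propagates a "backup reg has been written" fact forward over the CFG, and asks for a restore only at exits where that fact holds. It is a "may" analysis and ignores the FRM re-read after calls, both of which a real implementation would have to handle.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical block numbering for the CFG in the report:
   0 entry, 1 vector preheader (fsrmi 3), 2 vector loop (.L4),
   3 vector exit, 4 scalar loop (.L3/.L6, call ceilf16),
   5 scalar exit, 6 early exit (.L14).  */
enum { NBLOCKS = 7, NEDGES = 8 };

struct edge { int src, dst; };

static const struct edge edges[NEDGES] = {
    {0, 1}, {0, 4}, {0, 6},   /* entry branches three ways */
    {1, 2}, {2, 2}, {2, 3},   /* vector path, with loop back edge */
    {4, 4}, {4, 5},           /* scalar path, with loop back edge */
};

/* Assumption: the delayed backup (frrm) sits right before the static
   RM write, so only block 1 writes the backup reg.  */
static const bool sets_static_rm[NBLOCKS] =
    { false, true, false, false, false, false, false };

static const bool is_exit[NBLOCKS] =
    { false, false, false, true, false, true, true };

/* Marks in restore_at[] the exit blocks that need an fsrm and returns
   how many there are.  */
int place_frm_restores(bool restore_at[NBLOCKS])
{
    bool backup_set[NBLOCKS];
    for (int b = 0; b < NBLOCKS; b++)
        backup_set[b] = sets_static_rm[b];

    /* Propagate "backup reg written on some path" along edges until
       nothing changes (a simple forward fixpoint).  */
    bool changed = true;
    while (changed) {
        changed = false;
        for (int e = 0; e < NEDGES; e++) {
            int src = edges[e].src, dst = edges[e].dst;
            assert(src >= 0 && src < NBLOCKS && dst >= 0 && dst < NBLOCKS);
            if (backup_set[src] && !backup_set[dst]) {
                backup_set[dst] = true;
                changed = true;
            }
        }
    }

    /* Restore only where the backup reg actually holds a saved FRM.  */
    int count = 0;
    for (int b = 0; b < NBLOCKS; b++) {
        restore_at[b] = is_exit[b] && backup_set[b];
        count += restore_at[b];
    }
    return count;
}
```

On this CFG the sketch requests a restore only at the vector exit (block 3), leaving the scalar exit and the `.L14` early exit free of FRM accesses, which matches the "Restore needless" annotations above.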