https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120141
Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-06-04

--- Comment #1 from Jeffrey A. Law <law at gcc dot gnu.org> ---
In general one should not assume that the compiler is going to clean up
redundancies when using the vector intrinsic interface. They're really meant
to be a nearly direct mapping from a C functional interface to an assembly
instruction.

Essentially intrinsics bypass the entire gimple optimizer pipeline and
generate code that is exceedingly difficult for the RTL optimizer pipeline to
clean up after the fact.

So I'm torn about whether or not to even try to tackle this problem. We'd
essentially be writing custom optimizers for the intrinsics. I don't mind if
we can trivially catch them, say at initial RTL expansion time, but I also
don't want to introduce a lot of code for this stuff...

To that end....

So in gimple the program is just this:

  vl_1 = __riscv_vsetvlmax_e16m1 ();
  a_4 = __riscv_vadd_vv_u16m1 (x_2(D), y_3(D), vl_1);
  _5 = __riscv_vsrl_vx_u16m1 (a_4, 0, vl_1); [tail call]

So there really isn't any reasonable way to optimize away the dumb code in
gimple. As we expand into RTL we have:

[ ... ]
(insn 9 8 13 2 (set (reg:RVVM1HI 136 [ <retval> ])
        (if_then_else:RVVM1HI (unspec:RVVMF16BI [
                    (const_vector:RVVMF16BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg/v:DI 134 [ vl ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (lshiftrt:RVVM1HI (reg/v:RVVM1HI 135 [ a ])
                (const_int 0 [0]))
            (unspec:RVVM1HI [
                    (reg:DI 0 zero)
                ] UNSPEC_VUNDEF))) "j.c":6:12 -1
     (nil))

Which is almost certainly too complex for the various generic bits of RTL
optimizers to clean up.

If we're going to try to deal with this, a reasonable place might be the
expander in riscv-vector-builtins.cc.
We know the icode, and from that we can potentially look for arguments that
would make the operation a nop (or, in this case, a simple reg->reg copy) and
adjust the code accordingly.
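
To make the idea concrete, here is a minimal, self-contained sketch in C of
the kind of check being described. It is only a toy model: the enum
toy_icode, the function toy_is_identity and all of its parameters are
hypothetical names invented for illustration and are not part of
riscv-vector-builtins.cc or any other GCC interface. It assumes an operation
can be treated as a plain copy only when the scalar operand is a known
constant that is the identity for that operation, the mask is all ones, and
the merge operand is undefined, which is the situation in the RTL dump above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for instruction codes.  These are NOT GCC's
   real insn codes; they only model a few vector-scalar operations.  */
enum toy_icode
{
  TOY_VADD_VX,
  TOY_VSRL_VX,
  TOY_VSLL_VX,
  TOY_VAND_VX,
  TOY_VMUL_VX
};

/* Return true when the operation could be replaced by a copy of its
   vector source.  Assumptions of this sketch:
   - SCALAR is only meaningful when SCALAR_IS_CONST is true;
   - a copy is only safe when every element is active (ALL_ONES_MASK)
     and the merge operand is undefined (MERGE_IS_UNDEF);
   - shifts are only recognized for a literal zero amount, which is
     deliberately conservative.  */
bool
toy_is_identity (enum toy_icode op, bool scalar_is_const, int64_t scalar,
                 bool all_ones_mask, bool merge_is_undef)
{
  if (!scalar_is_const || !all_ones_mask || !merge_is_undef)
    return false;

  switch (op)
    {
    case TOY_VADD_VX:
    case TOY_VSRL_VX:
    case TOY_VSLL_VX:
      /* x + 0, x >> 0 and x << 0 all leave x unchanged.  */
      return scalar == 0;
    case TOY_VAND_VX:
      /* x & ~0 leaves x unchanged.  */
      return scalar == -1;
    case TOY_VMUL_VX:
      /* x * 1 leaves x unchanged.  */
      return scalar == 1;
    default:
      return false;
    }
}
```

In this model the case from this report corresponds to
toy_is_identity (TOY_VSRL_VX, true, 0, true, true), which returns true, so an
expander following this approach could emit a register copy instead of the
shift. Anything the check does not recognize falls through to the normal
expansion, which keeps the amount of special-case code small.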