https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120141

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2025-06-04

--- Comment #1 from Jeffrey A. Law <law at gcc dot gnu.org> ---
In general one should not assume that the compiler is going to clean up
redundancies when using the vector intrinsic interface.  They're really meant
to be a nearly direct mapping from a C functional interface to an assembly
instruction.

Essentially intrinsics bypass the entire gimple optimizer pipeline and generate
code that is exceedingly difficult for the RTL optimizer pipeline to clean up
after the fact.

So I'm torn about whether or not to even try to tackle this problem.  We'd
essentially be writing custom optimizers for the intrinsics.   I don't mind if
we can trivially catch them, say at initial RTL expansion time, but I also
don't want to introduce a lot of code for this stuff...  To that end...

So in gimple the program is just this:

  vl_1 = __riscv_vsetvlmax_e16m1 ();
  a_4 = __riscv_vadd_vv_u16m1 (x_2(D), y_3(D), vl_1);
  _5 = __riscv_vsrl_vx_u16m1 (a_4, 0, vl_1); [tail call]

The intrinsics are just opaque calls at this level, so there really isn't any
reasonable way to optimize away the redundant shift-by-zero in gimple.

As we expand into RTL we have:

[ ... ]
(insn 9 8 13 2 (set (reg:RVVM1HI 136 [ <retval> ])
        (if_then_else:RVVM1HI (unspec:RVVMF16BI [
                    (const_vector:RVVMF16BI repeat [
                            (const_int 1 [0x1])
                        ])
                    (reg/v:DI 134 [ vl ])
                    (const_int 2 [0x2]) repeated x2
                    (const_int 0 [0])
                    (reg:SI 66 vl)
                    (reg:SI 67 vtype)
                ] UNSPEC_VPREDICATE)
            (lshiftrt:RVVM1HI (reg/v:RVVM1HI 135 [ a ])
                (const_int 0 [0]))
            (unspec:RVVM1HI [
                    (reg:DI 0 zero)
                ] UNSPEC_VUNDEF))) "j.c":6:12 -1
     (nil))

Which is almost certainly too complex for the various generic bits of RTL
optimizers to clean up.

If we're going to try to deal with this, a reasonable place might be the
expander in riscv-vector-builtins.cc.  We know the icode and from that we can
potentially look at arguments which would make the operation a nop (or in this
case a simple reg->reg copy) and adjust the code accordingly.
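
To make the idea concrete, here is a toy, standalone C++ model of that
expansion-time check.  It is only a sketch and is not GCC code: ToyOp, ToyCall,
ToyAction, identity_scalar and choose_action are all hypothetical names, and
the condition used (constant identity scalar, all-ones mask, undefined merge
operand, as in the RTL above) is an assumption about when a plain copy would be
acceptable.  The real expander would key off the icode and its operands, and
the real legality conditions may be more subtle.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for illustration only; none of these exist in GCC.
enum class ToyOp { Add, Sub, ShiftLeft, ShiftRightLogical, Mul };
enum class ToyAction { EmitInstruction, EmitCopy };

struct ToyCall {
  ToyOp op;
  std::optional<std::int64_t> scalar;  // set only when the scalar operand is a known constant
  bool all_ones_mask;                  // the operation is effectively unmasked
  bool undefined_merge;                // the merge operand is undefined (cf. UNSPEC_VUNDEF)
};

// Scalar value that turns "vector OP scalar" into the identity, if there is one.
constexpr std::optional<std::int64_t> identity_scalar(ToyOp op) {
  switch (op) {
    case ToyOp::Add:
    case ToyOp::Sub:
    case ToyOp::ShiftLeft:
    case ToyOp::ShiftRightLogical:
      return 0;
    case ToyOp::Mul:
      return 1;
  }
  return std::nullopt;
}

// Decide at "expansion time" whether a simple copy can replace the instruction.
// Assumption: a copy is only chosen for an unmasked operation with an undefined
// merge operand and a constant scalar equal to the identity value.
constexpr ToyAction choose_action(const ToyCall& call) {
  const std::optional<std::int64_t> identity = identity_scalar(call.op);
  const bool scalar_is_identity = call.scalar.has_value() &&
                                  identity.has_value() &&
                                  *call.scalar == *identity;
  if (scalar_is_identity && call.all_ones_mask && call.undefined_merge)
    return ToyAction::EmitCopy;
  return ToyAction::EmitInstruction;
}

// Small usage check mirroring the case in this report: a logical right shift by 0.
inline void self_check() {
  const ToyCall shift_by_zero{ToyOp::ShiftRightLogical, 0, true, true};
  assert(choose_action(shift_by_zero) == ToyAction::EmitCopy);

  const ToyCall shift_by_three{ToyOp::ShiftRightLogical, 3, true, true};
  assert(choose_action(shift_by_three) == ToyAction::EmitInstruction);
}
```

The point of doing this at expansion time is that the check is a cheap test on
already-known operands, rather than a new pass that has to pattern match the
predicated RTL afterwards.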
