https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125191

            Bug ID: 125191
           Summary: lra introduces redundant vector reg copy with
                    paradoxical subreg
           Product: gcc
           Version: 17.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rdapp at gcc dot gnu.org
                CC: garthlei at gcc dot gnu.org
  Target Milestone: ---
            Target: riscv

The following source

typedef char vnx16qi __attribute__ ((vector_size (16)));

void permute1 (vnx16qi values1, vnx16qi values2, vnx16qi *out)
{
  vnx16qi v = __builtin_shufflevector (values1, values2, 0, 2, 4, 6, 8, 10, 12,
                                       14, 16, 18, 20, 22, 24, 26, 28, 30);
  *(vnx16qi *) out = v;
}

(with the riscv backend modified to emit narrowing shifts instead of compress
insns for this permute case)

results in

        ...
        vnsrl.wi        v1,v1,0
        vnsrl.wi        v2,v2,0
        vsetivli        zero,16,e8,m1,ta,ma
        vmv1r.v v3,v1                        # redundant
        vslideup.vi     v3,v2,8
        vse8.v  v3,0(a4)

It should just be

        vslideup.vi     v1,v2,8

with the store then reading v1.
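
To make the intended data flow concrete, here is a minimal scalar C model of
the sequence.  It is only a sketch: the helper names (permute_ref,
narrow_shift0, slideup8, permute_model) are made up for illustration, and
zero-initialising the upper halves of v1/v2 is an assumption for determinism
(under the paradoxical subreg those bytes are simply undefined).  It suggests
that sliding v2 up into v1 in place already yields the shuffle result, so no
copy into a third register is needed.

```c
#include <stdint.h>
#include <string.h>

/* Scalar reference for the shuffle: indices 0, 2, ..., 30 pick every
   even-numbered byte of the 32-byte concatenation values1:values2.  */
static void permute_ref (const uint8_t a[16], const uint8_t b[16],
                         uint8_t out[16])
{
  for (int i = 0; i < 8; i++)
    {
      out[i] = a[2 * i];
      out[i + 8] = b[2 * i];
    }
}

/* Model of vnsrl.wi vd,vs,0: view the 16 bytes as eight 16-bit elements
   (little-endian, as on RISC-V) and keep the low byte of each.  */
static void narrow_shift0 (const uint8_t src[16], uint8_t dst[8])
{
  for (int i = 0; i < 8; i++)
    {
      uint16_t wide = (uint16_t) (src[2 * i] | (src[2 * i + 1] << 8));
      dst[i] = (uint8_t) (wide >> 0);
    }
}

/* Model of vslideup.vi vd,vs,8 with vl=16: elements 0..7 of vd are left
   untouched, elements 8..15 receive vs[0..7].  */
static void slideup8 (uint8_t vd[16], const uint8_t vs[16])
{
  for (int i = 8; i < 16; i++)
    vd[i] = vs[i - 8];
}

/* The sequence without the vmv1r.v: the slide-up writes v1 directly.  */
static void permute_model (const uint8_t a[16], const uint8_t b[16],
                           uint8_t out[16])
{
  /* Assumption: upper halves zeroed only to keep the model deterministic.  */
  uint8_t v1[16] = { 0 }, v2[16] = { 0 };

  narrow_shift0 (a, v1);   /* vnsrl.wi v1,v1,0 */
  narrow_shift0 (b, v2);   /* vnsrl.wi v2,v2,0 */
  slideup8 (v1, v2);       /* vslideup.vi v1,v2,8 */
  memcpy (out, v1, 16);    /* vse8.v v1,0(a4) */
}
```

Feeding both functions the same two 16-byte inputs and comparing the outputs
with memcmp is a quick way to check the model against the reference.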

The insn is

(insn 24 34 25 2 (set (reg:V16QI 144 [ v_3 ])
        (unspec:V16QI [
                (unspec:V16BI [
                        (const_vector:V16BI [
                                (const_int 1 [0x1]) repeated x16
                            ])
                        (const_int 16 [0x10])
                        (const_int 2 [0x2]) repeated x3
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (subreg:V16QI (reg:V8QI 146) 0)
                (subreg:V16QI (reg:V8QI 147) 0)
                (const_int 8 [0x8])
            ] UNSPEC_VSLIDEUP)) "bla.c":6:11 22727 {pred_slideupv16qi}

(note the paradoxical subreg on each source operand: a V8QI register accessed
as V16QI)

ira already assigns the same hard register (97, i.e. v1) to r144 and r146:

      Popping a1(r144,l0)  --         assign reg 97
      Popping a2(r146,l0)  --         assign reg 97

lra seems OK with that at first:

         Considering alt=3 of insn 24:   (0) &vr  (1) Wc1  (2) 0  (3) vr  (4) rK  (5) rvl  (6) i  (7) i  (8) i
            0 Early clobber: reject++
          overall=1,losers=0,rld_nregs=0
      Choosing alt 3 in insn 24:  (0) &vr  (1) Wc1  (2) 0  (3) vr  (4) rK  (5) rvl  (6) i  (7) i  (8) i {pred_slideupv16qi}

but then:

********** Assignment #1: **********

    Spill r144 after risky transformations

getting us to:

(insn 44 23 24 2 (set (reg:V16QI 99 v3 [orig:144 v_3 ] [144])
        (reg:V16QI 97 v1 [146])) "bla.c":6:11 3328 {*movv16qi}
     (nil))
(insn 24 44 25 2 (set (reg:V16QI 99 v3 [orig:144 v_3 ] [144])
        (unspec:V16QI [
                (unspec:V16BI [
                        (const_vector:V16BI [
                                (const_int 1 [0x1]) repeated x16
                            ])
                        (const_int 16 [0x10])
                        (const_int 2 [0x2]) repeated x3
                        (reg:SI 66 vl)
                        (reg:SI 67 vtype)
                    ] UNSPEC_VPREDICATE)
                (reg:V16QI 99 v3 [orig:144 v_3 ] [144])
                (reg:V16QI 98 v2 [147])
                (const_int 8 [0x8])
            ] UNSPEC_VSLIDEUP)) "bla.c":6:11 22727 {pred_slideupv16qi}
     (expr_list:REG_EQUIV (mem:V16QI (reg/f:DI 14 a4 [orig:156 out ] [156]) [0 *out_5(D)+0 S16 A128])
        (nil)))

I suppose the paradoxical subreg is the culprit.  I haven't done further
analysis and wanted to document the current state.
