https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116611

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I now see

  _99 = .SELECT_VL (ivtmp_97, POLY_INT_CST [4, 4]);
  ivtmp_44 = _99 * 32;
  vect_array.16 = .MASK_LEN_LOAD_LANES (vectp_in.14_43, 32B, { -1, ... }, _99, 0);
  vect__2.17_71 = vect_array.16[0];
  vect__2.18_72 = vect_array.16[1];
  vect__2.19_73 = vect_array.16[2];
  vect__2.20_74 = vect_array.16[3];
  vect__2.21_75 = vect_array.16[4];
  vect__2.22_76 = vect_array.16[5];
  vect__2.23_77 = vect_array.16[6];
  vect__2.24_78 = vect_array.16[7];
  vect_array.27[0] = vect__2.17_71;
  vect_array.27[1] = vect__2.18_72;
  vect_array.27[2] = vect__2.19_73;
  vect_array.27[3] = vect__2.20_74;
  vect_array.27[4] = vect__2.21_75;
  vect_array.27[5] = vect__2.22_76;
  vect_array.27[6] = vect__2.23_77;
  vect_array.27[7] = vect__2.24_78;
  .MASK_LEN_STORE_LANES (vectp_out.25_80, 32B, { -1, ... }, _99, 0, vect_array.27);
  ivtmp_93 = _99 * 4;
  .MASK_LEN_STORE (vectp_ia.28_94, 32B, { -1, ... }, _99, 0, vect__2.19_73);

which I think is perfect and what I expected.  It doesn't show the
previous issue anymore.
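
For orientation, a loop of roughly the following shape would be expected
to vectorize into GIMPLE like the dump above.  This is a hypothetical
reconstruction inferred from the 8-lane load/store and the extra store
of lane 2, not the testcase from the PR; the function name, parameter
names and element type are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the PR's testcase: copy groups of eight
   32-bit elements from IN to OUT and additionally store element 2 of
   each group to IA.  */
void
copy_groups (int32_t *restrict out, int32_t *restrict ia,
             const int32_t *restrict in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Eight interleaved lanes: candidates for load-lanes/store-lanes.  */
      for (int k = 0; k < 8; k++)
        out[8 * i + k] = in[8 * i + k];

      /* Second use of lane 2, corresponding to the separate
         .MASK_LEN_STORE of vect__2.19_73 in the dump.  */
      ia[i] = in[8 * i + 2];
    }
}
```

Per scalar iteration the in/out pointers advance by 8 * 4 = 32 bytes and
ia by 4 bytes, which matches the `_99 * 32` and `_99 * 4` increments.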

There seems to be some confusion with RA though:

.L8:
        vsetvli a5,a3,e8,mf4,ta,ma
        vlseg8e32.v     v8,(a2)
        slli    a0,a5,5
        slli    a7,a5,2
        sub     a3,a3,a5
        add     a2,a2,a0
        vmv1r.v v16,v8
        vmv1r.v v17,v9
        vmv1r.v v18,v10
        vmv1r.v v19,v11
        vmv1r.v v20,v12
        vmv1r.v v21,v13
        vmv1r.v v22,v14
        vmv1r.v v23,v15
        vsseg8e32.v     v16,(a1)
        add     a1,a1,a0
        vse32.v v10,0(a4)
        add     a4,a4,a7
        bne     a3,zero,.L8

Why do we copy the register group v8-v15 to v16-v23?  Is this because of
the v10 use?  If so, a single move of v10 to v18 should have sufficed.
