https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794

--- Comment #7 from Robin Dapp <rdapp at gcc dot gnu.org> ---
  vectp.4_188 = x_50(D);
  vect__1.5_189 = MEM <vector(8) int> [(int *)vectp.4_188];
  mask__2.6_190 = { 1, 1, 1, 1, 1, 1, 1, 1 } == vect__1.5_189;
  mask_patt_156.7_191 = VIEW_CONVERT_EXPR<vector(8) <signed-boolean:1>>(mask__2.6_190);
  _1 = *x_50(D);
  _2 = _1 == 1;
  vectp.9_192 = y_51(D);
  vect__3.10_193 = MEM <vector(8) short int> [(short int *)vectp.9_192];
  mask__4.11_194 = { 2, 2, 2, 2, 2, 2, 2, 2 } == vect__3.10_193;
  mask_patt_157.12_195 = mask_patt_156.7_191 & mask__4.11_194;
  vect_patt_158.13_196 = VEC_COND_EXPR <mask_patt_157.12_195, { 1, 1, 1, 1, 1, 1, 1, 1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
  vect_patt_159.14_197 = (vector(8) int) vect_patt_158.13_196;


This yields the following assembly:
        vsetivli        zero,8,e32,m2,ta,ma
        vle32.v v2,0(a0)
        vmv.v.i v4,1
        vle16.v v1,0(a1)
        vmseq.vv        v0,v2,v4
        vsetvli zero,zero,e16,m1,ta,ma
        vmseq.vi        v1,v1,2
        vsetvli zero,zero,e32,m2,ta,ma
        vmv.v.i v2,0
        vmand.mm        v0,v0,v1
        vmerge.vvm      v2,v2,v4,v0
        vse32.v v2,0(a0)

Apart from CSE'ing v4, this looks pretty good to me.  My connection is really
poor at the moment, so I cannot quickly compare what aarch64 does for this
example.
