On Tue, 14 Dec 2021, Prathamesh Kulkarni wrote:

> On Tue, 7 Dec 2021 at 19:08, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
> >
> > Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> > > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > > <richard.sandif...@arm.com> wrote:
> > >>
> > >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> > >> > Hi Richard,
> > >> > I have attached a WIP untested patch for PR96463.
> > >> > IIUC, the PR suggests to transform
> > >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> > >> > into:
> > >> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> > >> > if v is vector of 4 elements, and each element is 32 bits on little
> > >> > endian target ?
> > >> >
> > >> > I am sorry if this sounds like a silly question, but I am not sure how
> > >> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> > >> > simply used NOP_EXPR (which I expected to fail), and gave type error
> > >> > during gimple verification:
> > >>
> > >> It should be possible in principle to have a VEC_PERM_EXPR in which
> > >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> > >>
> > >> E.g., the dup in the PR would be something like this:
> > >>
> > >> foo (int32x4_t a)
> > >> {
> > >>   svint32_t _2;
> > >>
> > >>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
> > >>   return _2;
> > >> }
> > >>
> > >> where the final operand can be built using:
> > >>
> > >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> > >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts,
> > >>                         1);
> > >>   for (int i = 0; i < source_nelts; ++i)
> > >>     sel.quick_push (i);
> > >>
> > >> I'm not sure how well-tested that combination is though. It might need
> > >> changes to target-independent code.
> > > Hi Richard,
> > > Thanks for the suggestions.
> > > I tried the above approach in attached patch, but it still results in
> > > ICE due to type mismatch:
> > >
> > > pr96463.c: In function ‘foo’:
> > > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> > >     8 | }
> > >       | ^
> > > svint32_t
> > > int32x4_t
> > > int32x4_t
> > > svint32_t
> > > _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> > > during GIMPLE pass: ccp
> > > dump file: pr96463.c.032t.ccp1
> > > pr96463.c:8:1: internal compiler error: verify_gimple failed
> > >
> > > Should we perhaps add another tree code, that "extends" a fixed-width
> > > vector into it's VLA equivalent ?
> >
> > No, I think this is just an extreme example of the combination not being
> > well-tested. :-) Obviously it's worse than I thought.
> >
> > I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> > Richi, WDYT?
> Hi Richi, ping ?

We check

    case VEC_PERM_EXPR:
      if (!useless_type_conversion_p (lhs_type, rhs1_type)
          || !useless_type_conversion_p (lhs_type, rhs2_type))
        {
          error ("type mismatch in %qs", code_name);

and LHS is svint32_t while x_4 is int32x4_t (the permutation type can
indeed be different - it needs to be integer for example).

The test is indeed unnecessarily strict if there are two vector types
that are not compatible but have the same element mode and the same
number of elements.  I guess we could check sth like

      if (!useless_type_conversion_p (TREE_TYPE (lhs_type),
                                      TREE_TYPE (rhs1_type))
          || !types_compatible_p (rhs1_type, rhs2_type))

instead - we later check TYPE_VECTOR_SUBPARTS so they match.  But
note that vec_perm_optab has a single mode only so I'm not sure
what mode we should pick at RTL expansion time for your quoted
case so I'm a bit nervous here.  Richard?

Richard.
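
To make the difference between the two checks concrete, here is a minimal toy model in standalone C++. It is only a sketch under stated assumptions: `ToyVectorType`, `toy_compatible`, `strict_check`, `relaxed_check` and `run_example` are hypothetical names invented for illustration, not GCC's tree API. Type compatibility is reduced to comparing type names, and the later TYPE_VECTOR_SUBPARTS check is left out entirely.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a vector type. This is NOT GCC's tree representation;
// every name in this block is hypothetical and exists only for illustration.
struct ToyVectorType {
  std::string name;     // e.g. "svint32_t" or "int32x4_t"
  std::string element;  // element type, e.g. "int32"
};

// Simplified stand-in for a type-compatibility query: two toy vector
// types count as compatible only if they are the same named type.
inline bool toy_compatible(const ToyVectorType &a, const ToyVectorType &b) {
  return a.name == b.name;
}

// Models the existing check: the result type must be compatible with
// both input vector types.
inline bool strict_check(const ToyVectorType &lhs, const ToyVectorType &rhs1,
                         const ToyVectorType &rhs2) {
  return toy_compatible(lhs, rhs1) && toy_compatible(lhs, rhs2);
}

// Models the suggested relaxation: only the element types of the result
// and the first input must agree, and the two inputs must be compatible
// with each other. The element-count check is omitted in this sketch.
inline bool relaxed_check(const ToyVectorType &lhs, const ToyVectorType &rhs1,
                          const ToyVectorType &rhs2) {
  return lhs.element == rhs1.element && toy_compatible(rhs1, rhs2);
}

// The case from the thread: an svint32_t result built from two
// int32x4_t inputs.
inline void run_example() {
  const ToyVectorType sve{"svint32_t", "int32"};
  const ToyVectorType neon{"int32x4_t", "int32"};
  assert(!strict_check(sve, neon, neon));  // rejected, as in the ICE above
  assert(relaxed_check(sve, neon, neon));  // accepted by the relaxed form
}
```

In this model the relaxed form still rejects inputs whose element type differs from the result's, and inputs that are not compatible with each other. It says nothing about which mode to pick at RTL expansion time, which is the open question above.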