Perhaps I'm missing something (I'm not too familiar with SVE semantics), but is there a reason that the solution for PR96473 uses a VEC_PERM_EXPR and not just a VEC_DUPLICATE_EXPR? The folding of svld1rq (svptrue_..., ...) doesn't seem to require either the blending or the permutation functionality of a VEC_PERM_EXPR. Instead, it seems to be misusing (the modified) VEC_PERM_EXPR as a form of VIEW_CONVERT_EXPR that allows us to convert/mismatch the type of the operands to the type of the result.
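
A minimal sketch of the element-wise behaviour in question, written as a plain C model rather than GCC code. It assumes the fold's selector is the repeating pattern 0, 1, 2, 3, 0, 1, 2, 3, ... applied to a single 4-element source; the name replicate_quad_model and the nelts parameter (standing in for the runtime SVE vector length in 32-bit elements) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: the result of the folded svld1rq
   viewed as a selection from one 4-element source vector.  NELTS is an
   assumed stand-in for the runtime vector length in 32-bit elements.  */
static void
replicate_quad_model (const int32_t src[4], int32_t *dst, size_t nelts)
{
  for (size_t i = 0; i < nelts; i++)
    /* Assumed selector pattern 0, 1, 2, 3, 0, 1, 2, 3, ...: every
       result element is read from the same operand, so no blending of
       two inputs takes place.  */
    dst[i] = src[i % 4];
}
```

In this model the result is longer than the source, which is the operand/result type mismatch mentioned above.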

Conceptually, (as in Richard's original motivation for the PR),

svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }

can be optimized to (something like)

svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); } // or dup z0.q, z0.q[0] equivalent

hence it makes sense for fold to transform the gimple form of the first, into the gimple form of the second(?)

Just curious.
Roger
--

> -----Original Message-----
> From: Richard Sandiford <richard.sandif...@arm.com>
> Sent: 06 February 2023 12:22
> To: Richard Biener <richard.guent...@gmail.com>
> Cc: Roger Sayle <ro...@nextmovesoftware.com>; GCC Patches <gcc-patc...@gcc.gnu.org>
> Subject: Re: [DOC PATCH] Document the VEC_PERM_EXPR tree code (and minor clean-ups).
>
> Richard Biener <richard.guent...@gmail.com> writes:
> > On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> >>
> >>
> >> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> >> generic.texi. For ease of review, it is provided below as a pair of
> >> diffs. The first contains just the new text added to describe
> >> VEC_PERM_EXPR, the second tidies up this part of the documentation by
> >> sorting the tree codes into alphabetical order, and providing
> >> consistent section naming/capitalization, so changing this section
> >> from "Vectors" to "Vector Expressions" (matching the nearby "Unary
> >> and Binary Expressions").
> >>
> >> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> >> The reviewer(s) can decide whether to approve just the new content,
> >> or the content+clean-up. Ok for mainline?
> >
> > +@item VEC_PERM_EXPR
> > +This node represents a vector permute/blend operation. The three
> > +operands must be vectors of the same number of elements. The first
> > +and second operands must be vectors of the same type as the entire
> > +expression,
> >
> > this was recently relaxed for the case of constant permutes in which
> > case the first and second operands only have to have the same element
> > type as the result. See tree-cfg.cc:verify_gimple_assign_ternary.
> >
> > The following description will become a bit more awkward here and for
> > rhs1/rhs2 with different number of elements the modulo interpretation
> > doesn't hold - I believe we require in-bounds elements for constant
> > permutes. Richard can probably clarify things here.
>
> I thought that the modulo behaviour still applies when the node has a constant
> selector, it's just that the in-range form is the canonical one.
>
> With variable-length vectors, I think it's in principle possible to have a stepped
> constant selector whose start elements are in-range but whose final elements
> aren't (and instead wrap around when applied).
> E.g. the selector could zip the last quarter of the inputs followed by the first
> quarter.
>
> Thanks,
> Richard
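
The modulo interpretation discussed in the quoted text can be made concrete with a small reference model. This is only a sketch in plain C, not GCC code: it assumes fixed-length operands that both have N elements, and the names vec_perm_model and N are hypothetical. Each selector element is reduced modulo 2*N and then indexes the concatenation of the two inputs, so an out-of-range selector wraps around to the same result as its in-range (canonical) form.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model, not GCC code: element-wise behaviour of
   VEC_PERM_EXPR <v0, v1, sel> for fixed-length vectors of N elements,
   assuming both inputs have the same number of elements.  */
enum { N = 4 };

static void
vec_perm_model (const int32_t v0[N], const int32_t v1[N],
                const uint32_t sel[N], int32_t out[N])
{
  for (size_t i = 0; i < N; i++)
    {
      /* Selector values are interpreted modulo 2*N, i.e. as an index
         into the concatenation of v0 and v1.  */
      uint32_t idx = sel[i] % (2 * N);
      out[i] = idx < N ? v0[idx] : v1[idx - N];
    }
}
```

For example, with N = 4 the selectors { 0, 4, 1, 5 } and { 8, 12, 9, 13 } pick the same elements in this model, since the second is the first plus 2*N. The model does not cover the relaxed case where the inputs and the result have different numbers of elements.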