Hi!

On Mon, Jun 01, 2020 at 09:15:06AM -0700, Carl Love wrote:
> The following patch adds support for the vec_blendv and vec_permx
> builtins.

Pretty interesting insns ;-)

>     * config/rs6000/altivec.h: Add define for vec_blendv and
>     vec_permx.
>     * config/rs6000/altivec.md: Add unspec UNSPEC_XXBLEND,
>     UNSPEC_XXPERMX.

Similar to the other patches.

>     New define_mode VM3.

(VM3): New mode iterator.  (Etc.)

>     * config/rs6000/rs6000-c.c:
>     (altivecaltivec_resolve_overloaded_builtin):

Duplicated "altivec".

> +  /* Need vec char with each element = 255, is there a better way?  */

Yes, just splat -1 (in any mode, vsplti*), or byte 0xff (xxspltib).
There probably are some utility functions to create those?

> +  /* Reverse value of byte element index eidx by subracting bits [3:7] of
> +     each operand[3] element from 31.  Swap order of operands so indexing
> +     will be correct.  Reverse the 32-byte section identifier match by
> +     subracting bits [0:2] of elemet from 7.  */

Hrm, you could xor every index with 0x1f?  That can often simplify with
whatever sets the mask...

> +  emit_insn (gen_altivec_vsubsbs (tmp, vreg, operands[3]));

... but this cannot, it is an unspec (saturating subtract).

> +      else if (icode == CODE_FOR_xxpermx)
> +        {
> +          /* Only allow 8-bit unsigned literals.  */
> +          STRIP_NOPS (arg3);
> +          if (TREE_CODE (arg3) != INTEGER_CST
> +              || TREE_INT_CST_LOW (arg3) & ~0xff)
> +            {
> +              error ("argument 4 must be an 8-bit unsigned literal");
> +              return CONST0_RTX (tmode);
> +            }
> +        }

I think it should be a 3-bit constant, instead?

> +Vector Blend Variable
> +Blend the first and second argument vectors according to the sign bits of the
> +corresponding elements of the third argument vector.

Maybe it should say it is related to vsel/xxsel, but per bigger element?

> +@findex vec_vlendv

(typo).

> +Vector Permute Extendedextracth

Stray "extracth"?

> +Perform a partial permute of the first two arguments, which form a 32-byte
> +section of an emulated vector up to 256 bytes wide, using the partial permute
> +control vector in the third argument.  The fourth argument (constrained to
> +values of 0-7) identifies which 32-byte section of the emulated vector is
> +contained in the first two arguments.
> +@findex vec_permx

Maybe say that the elements not corresponding to that section are set to 0?

> +++ b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
> +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */
> +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */

These do not work like this?  Some v or xx splti*, and vsrdbi?

Thanks!


Segher