Hi!
On Mon, Jun 01, 2020 at 09:15:06AM -0700, Carl Love wrote:
> The following patch adds support for the vec_blendv and vec_permx
> builtins.
Pretty interesting insns ;-)
> * config/rs6000/altivec.h: Add define for vec_blendv and
> vec_permx.
> * config/rs6000/altivec.md: Add unspec UNSPEC_XXBLEND,
> UNSPEC_XXPERMX.
Similar to the other patches.
> New define_mode VM3.
(VM3): New mode iterator.
(Etc.)
> * config/rs6000/rs6000-c.c:
> (altivecaltivec_resolve_overloaded_builtin):
Duplicated "altivec".
> + /* Need vec char with each element = 255, is there a better way? */
Yes, just splat -1 (in any mode, vsplti*), or byte 0xff (xxspltib).
There probably are some utility functions to create those?
> + /* Reverse value of byte element index eidx by subracting bits [3:7] of
> + each operand[3] element from 31. Swap order of operands so indexing
> + will be correct. Reverse the 32-byte section identifier match by
> + subracting bits [0:2] of elemet from 7. */
Hrm, you could xor every index with 0x1f? That can often simplify with
whatever sets the mask...
> + emit_insn (gen_altivec_vsubsbs (tmp, vreg, operands[3]));
... but this cannot, it is an unspec (saturating subtract).
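To spell out the xor suggestion: for a 5-bit index, subtracting from 31 and flipping the five low bits give the same result, and the same holds for 7 minus a 3-bit field. A minimal sketch in plain C (the function names are made up for illustration and are not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reverse a 5-bit element index by subtraction.  */
static uint8_t
rev_idx_sub (uint8_t eidx)
{
  return (uint8_t) (31 - eidx);
}

/* Hypothetical helper: the same reversal, by flipping the low five bits.  */
static uint8_t
rev_idx_xor (uint8_t eidx)
{
  return (uint8_t) (eidx ^ 0x1f);
}

/* Check that both forms agree for every 5-bit index, and that
   7 - s equals s ^ 7 for every 3-bit section number.  */
static void
check_rev_idx (void)
{
  for (unsigned i = 0; i < 32; i++)
    assert (rev_idx_sub ((uint8_t) i) == rev_idx_xor ((uint8_t) i));
  for (unsigned s = 0; s < 8; s++)
    assert (7 - s == (s ^ 7));
}
```

Since an xor is ordinary RTL rather than an unspec, the optimisers get a chance to fold it into whatever computes the control vector.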
> + else if (icode == CODE_FOR_xxpermx)
> + {
> + /* Only allow 8-bit unsigned literals. */
> + STRIP_NOPS (arg3);
> + if (TREE_CODE (arg3) != INTEGER_CST
> + || TREE_INT_CST_LOW (arg3) & ~0xff)
> + {
> + error ("argument 4 must be an 8-bit unsigned literal");
> + return CONST0_RTX (tmode);
> + }
> + }
I think it should be a 3-bit constant, instead?
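The documentation further down constrains this argument to 0-7, so the mask in the check would presumably be ~0x7 rather than ~0xff, with the error text changed to match. The range test itself, sketched as standalone C (fits_unsigned_field is a hypothetical name, not an existing GCC function):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if VAL fits in an unsigned field of BITS bits.
   BITS must be smaller than the width of unsigned long.  */
static bool
fits_unsigned_field (unsigned long val, unsigned bits)
{
  unsigned long mask = (1UL << bits) - 1;
  return (val & ~mask) == 0;
}

/* 0..7 pass a 3-bit check; 8 and up only pass the looser 8-bit check.  */
static void
check_fits_unsigned_field (void)
{
  assert (fits_unsigned_field (7, 3));
  assert (!fits_unsigned_field (8, 3));
  assert (fits_unsigned_field (8, 8));
  assert (!fits_unsigned_field (256, 8));
}
```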
> +Vector Blend Variable
> +Blend the first and second argument vectors according to the sign bits of the
> +corresponding elements of the third argument vector.
Maybe it should say it is related to vsel/xxsel, but per bigger element?
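A scalar model of the difference, for byte elements only. This is a sketch and not the builtin's definition: it assumes a set bit (for the select) or a set sign bit (for the blend) picks the second argument, and the function names are invented.

```c
#include <assert.h>
#include <stdint.h>

#define NBYTES 16

/* vsel/xxsel style: every result bit is chosen individually,
   from B where the mask bit is 1 and from A where it is 0.  */
static void
model_sel (uint8_t *r, const uint8_t *a, const uint8_t *b, const uint8_t *m)
{
  for (int i = 0; i < NBYTES; i++)
    r[i] = (uint8_t) ((a[i] & ~m[i]) | (b[i] & m[i]));
}

/* Blend style: the whole element is chosen by the sign bit of the
   mask element alone (assumed here: sign bit set selects B).  */
static void
model_blendv_u8 (uint8_t *r, const uint8_t *a, const uint8_t *b,
                 const uint8_t *m)
{
  for (int i = 0; i < NBYTES; i++)
    r[i] = (m[i] & 0x80) ? b[i] : a[i];
}

/* The two differ as soon as a mask element is not all-zeros or all-ones.  */
static void
check_sel_vs_blend (void)
{
  uint8_t a[NBYTES], b[NBYTES], m[NBYTES] = { 0x80, 0x7f }, r[NBYTES];
  for (int i = 0; i < NBYTES; i++)
    {
      a[i] = 0x11;
      b[i] = 0x22;
    }
  model_blendv_u8 (r, a, b, m);
  assert (r[0] == 0x22 && r[1] == 0x11);
  model_sel (r, a, b, m);
  assert (r[0] == 0x11 && r[1] == 0x22);
}
```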
> +@findex vec_vlendv
(typo).
> +Vector Permute Extendedextracth
Stray "extracth"?
> +Perform a partial permute of the first two arguments, which form a 32-byte
> +section of an emulated vector up to 256 bytes wide, using the partial permute
> +control vector in the third argument. The fourth argument (constrained to
> +values of 0-7) identifies which 32-byte section of the emulated vector is
> +contained in the first two arguments.
> +@findex vec_permx
Maybe say that the elements not corresponding to that section are set
to 0?
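Roughly, as a scalar model (a sketch under assumptions: byte elements are numbered in big-endian order, the top three bits of each control byte are the section number and the low five bits the index, matching the bits [0:2] and [3:7] in the patch comment; model_permx is an invented name):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the partial permute: A and B together are the 32-byte
   section numbered SECT.  Each control byte names a section and an
   index into it; bytes naming another section produce 0.  */
static void
model_permx (uint8_t r[16], const uint8_t a[16], const uint8_t b[16],
             const uint8_t c[16], unsigned sect)
{
  for (int i = 0; i < 16; i++)
    {
      unsigned section = c[i] >> 5;   /* bits [0:2] */
      unsigned eidx = c[i] & 0x1f;    /* bits [3:7] */
      if (section == sect)
        r[i] = eidx < 16 ? a[eidx] : b[eidx - 16];
      else
        r[i] = 0;
    }
}

static void
check_permx (void)
{
  /* Remaining control bytes are 0: section 0, index 0.  */
  uint8_t a[16], b[16], c[16] = { 0x00, 0x1f, 0x23 }, r[16];
  for (int i = 0; i < 16; i++)
    {
      a[i] = (uint8_t) (100 + i);
      b[i] = (uint8_t) (200 + i);
    }
  model_permx (r, a, b, c, 0);
  assert (r[0] == 100 && r[1] == 215 && r[2] == 0);
  model_permx (r, a, b, c, 1);
  assert (r[0] == 0 && r[1] == 0 && r[2] == 103);
}
```

With this behaviour, a wider permute can be assembled by running one such step per section and OR-ing the partial results, which is why the zeroing is worth documenting.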
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-blend-runnable.c
> +/* { dg-final { scan-assembler-times {\msplati\M} 6 } } */
> +/* { dg-final { scan-assembler-times {\msrdbi\M} 6 } } */
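These do not work like this?  \m and \M anchor at word boundaries, so
these only match insns named exactly "splati" and "srdbi".  Did you mean
some v or xx splti*, and vsrdbi?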
Thanks!
Segher