On Fri, Jan 07, 2022 at 02:15:22PM -0500, David Edelsohn wrote:
> > Power10 ISA added `xxblendv*` instructions which are realized in the
> > `vec_blendv` intrinsic.
> >
> > Use `vec_blendv` for `_mm_blendv_epi8`, `_mm_blendv_ps`, and
> > `_mm_blendv_pd` compatibility intrinsics, when `_ARCH_PWR10`.
> >
> > Also, copy a test from i386 for testing `_mm_blendv_ps`.
> > This should have come with commit ed04cf6d73e233c74c4e55c27f1cbd89ae4710e8,
> > but was inadvertently omitted.
> >
> > 2021-10-20  Paul A. Clarke  <p...@us.ibm.com>
> >
> > gcc
> > * config/rs6000/smmintrin.h (_mm_blendv_epi8): Use vec_blendv
> > when _ARCH_PWR10.
> > (_mm_blendv_ps): Likewise.
> > (_mm_blendv_pd): Likewise.
> >
> > gcc/testsuite
> > * gcc.target/powerpc/sse4_1-blendvps.c: Copy from gcc.target/i386,
> > adjust dg directives to suit.
> > ---
> > Tested on Power10 powerpc64le-linux (compiled with and without
> > `-mcpu=power10`).
> >
> > OK for trunk?
> 
> This is okay modulo
> 
> > + return (__m128i) vec_blendv ((__v16qu) __A, (__v16qu) __B, (__v16qu) __mask);
> 
> Should the above be __v16qi like x86?

That does arguably match the types involved (epi8) better.

Shall I change the original implementation as well (4 lines later)?

>   return (__m128i) vec_sel ((__v16qi) __A, (__v16qi) __B, __lmask);
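
For reference, here is a small scalar sketch of the per-byte selection being
discussed. It assumes the x86 `_mm_blendv_epi8` semantics (take the byte from
the second operand when the most significant bit of the mask byte is set,
otherwise from the first). The helper names are hypothetical and this is not
the code in smmintrin.h; it only illustrates that viewing the bytes as signed
or unsigned gives the same 16 result bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model, unsigned view: select b[i] when the most
   significant bit of mask[i] is set, else a[i].  */
static void
blendv_epi8_model (const uint8_t *a, const uint8_t *b,
		   const uint8_t *mask, uint8_t *out)
{
  for (int i = 0; i < 16; i++)
    out[i] = (mask[i] & 0x80) ? b[i] : a[i];
}

/* Same selection, signed view: for int8_t a set most significant bit
   is the same condition as a negative value.  */
static void
blendv_epi8_model_signed (const int8_t *a, const int8_t *b,
			  const int8_t *mask, int8_t *out)
{
  for (int i = 0; i < 16; i++)
    out[i] = (mask[i] < 0) ? b[i] : a[i];
}

/* Return 1 when both views produce identical bytes for one sample
   input (odd lanes select from b, even lanes from a).  */
int
blendv_models_agree (void)
{
  uint8_t a[16], b[16], m[16], ru[16];
  int8_t sa[16], sb[16], sm[16], rs[16];

  for (int i = 0; i < 16; i++)
    {
      a[i] = (uint8_t) i;
      b[i] = (uint8_t) (0xF0 + i);
      m[i] = (i & 1) ? 0x80 : 0x7F;
    }

  /* Reinterpret the same bit patterns as signed bytes.  */
  memcpy (sa, a, sizeof sa);
  memcpy (sb, b, sizeof sb);
  memcpy (sm, m, sizeof sm);

  blendv_epi8_model (a, b, m, ru);
  blendv_epi8_model_signed (sa, sb, sm, rs);

  assert (ru[0] == 0x00);	/* mask 0x7F: taken from a.  */
  assert (ru[1] == 0xF1);	/* mask 0x80: taken from b.  */

  return memcmp (ru, rs, sizeof ru) == 0;
}
```

If that model is right, the choice between `__v16qu` and `__v16qi` is about
matching the epi8 types rather than changing the selected bytes.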

PC
