On Fri, Jan 7, 2022 at 3:57 PM Paul A. Clarke wrote:
>
> On Fri, Jan 07, 2022 at 02:40:51PM -0500, David Edelsohn via Gcc-patches
> wrote:
> > +#ifdef __LITTLE_ENDIAN__
> > + /* Sum across four integers with two integer results. */
> > + asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "
On Fri, Jan 07, 2022 at 02:40:51PM -0500, David Edelsohn via Gcc-patches wrote:
> +#ifdef __LITTLE_ENDIAN__
> + /* Sum across four integers with two integer results. */
> + asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
> + /* Note: vec_sum2s could be used here, but on litt
+#ifdef __LITTLE_ENDIAN__
+ /* Sum across four integers with two integer results. */
+ asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
+ /* Note: vec_sum2s could be used here, but on little-endian, vector
+ shifts are added that are not needed for this use-case.
+ A
Hi!
On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke wrote:
> Power9 ISA added the `vabsdub` instruction, which is realized in the
> `vec_absd` intrinsic.
>
> Use `vec_absd` for the `_mm_sad_epu8` compatibility intrinsic when
> `_ARCH_PWR9` is defined.
>
> Also, the realization of `vec_sum2s` on little-en
On Mon, Nov 08, 2021 at 11:43:26AM -0600, Paul A. Clarke via Gcc-patches wrote:
> Gentle ping...
Gentle re-ping.
> On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke via Gcc-patches
> wrote:
> > Power9 ISA added the `vabsdub` instruction, which is realized in the
> > `vec_absd` intrinsic.
> >
Gentle ping...
On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke via Gcc-patches wrote:
> Power9 ISA added the `vabsdub` instruction, which is realized in the
> `vec_absd` intrinsic.
>
> Use `vec_absd` for the `_mm_sad_epu8` compatibility intrinsic when
> `_ARCH_PWR9` is defined.
>
> Also, the realization
Power9 ISA added the `vabsdub` instruction, which is realized in the
`vec_absd` intrinsic.

Use `vec_absd` for the `_mm_sad_epu8` compatibility intrinsic when
`_ARCH_PWR9` is defined.

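For readers unfamiliar with the x86 side, here is a portable scalar sketch of what `_mm_sad_epu8` computes: for each 64-bit half of the 128-bit inputs, the sum of absolute differences of the eight unsigned bytes. This is not the code from the patch; the helper name `sad_epu8_ref` and the array-based interface are assumptions made for illustration only. The per-byte absolute difference is the step that `vec_absd` covers on Power9.

```c
#include <stdint.h>

/* Hypothetical scalar reference, not part of the patch or of GCC.
   a and b stand in for the two 128-bit inputs as 16 unsigned bytes;
   out[0] and out[1] receive the sums for the low and high 8 bytes.  */
static void
sad_epu8_ref (const uint8_t a[16], const uint8_t b[16], uint64_t out[2])
{
  for (int half = 0; half < 2; half++)
    {
      uint64_t sum = 0;
      for (int i = 0; i < 8; i++)
        {
          /* Absolute difference of two unsigned bytes.  */
          int d = (int) a[half * 8 + i] - (int) b[half * 8 + i];
          sum += (uint64_t) (d < 0 ? -d : d);
        }
      out[half] = sum;
    }
}
```

Each sum is at most 8 * 255 = 2040, so it always fits in 16 bits; the remaining bits of each 64-bit result are zero.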
Also, the realization of `vec_sum2s` on little-endian includes
two shifts in order to position the input and output to match
the sem