Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2022-01-07 Thread David Edelsohn via Gcc-patches
On Fri, Jan 7, 2022 at 3:57 PM Paul A. Clarke wrote:
>
> On Fri, Jan 07, 2022 at 02:40:51PM -0500, David Edelsohn via Gcc-patches
> wrote:
> > +#ifdef __LITTLE_ENDIAN__
> > +  /* Sum across four integers with two integer results.  */
> > +  asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "

Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2022-01-07 Thread Paul A. Clarke via Gcc-patches
On Fri, Jan 07, 2022 at 02:40:51PM -0500, David Edelsohn via Gcc-patches wrote:
> +#ifdef __LITTLE_ENDIAN__
> +  /* Sum across four integers with two integer results.  */
> +  asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
> +  /* Note: vec_sum2s could be used here, but on litt

Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2022-01-07 Thread David Edelsohn via Gcc-patches
+#ifdef __LITTLE_ENDIAN__
+  /* Sum across four integers with two integer results.  */
+  asm ("vsum2sws %0,%1,%2" : "=v" (result) : "v" (vsum), "v" (zero));
+  /* Note: vec_sum2s could be used here, but on little-endian, vector
+     shifts are added that are not needed for this use-case.
+     A

Re: [PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-11-19 Thread Segher Boessenkool
Hi!
On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke wrote:
> Power9 ISA added `vabsdub` instruction which is realized in the
> `vec_absd` intrinsic.
>
> Use `vec_absd` for `_mm_sad_epu8` compatibility intrinsic, when
> `_ARCH_PWR9`.
>
> Also, the realization of `vec_sum2s` on little-en

Re: [PING^2 PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-11-18 Thread Paul A. Clarke via Gcc-patches
On Mon, Nov 08, 2021 at 11:43:26AM -0600, Paul A. Clarke via Gcc-patches wrote:
> Gentle ping...
Gentle re-ping.
> On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke via Gcc-patches
> wrote:
> > Power9 ISA added `vabsdub` instruction which is realized in the
> > `vec_absd` intrinsic.
> >

[PING PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-11-08 Thread Paul A. Clarke via Gcc-patches
Gentle ping...
On Fri, Oct 22, 2021 at 12:28:49PM -0500, Paul A. Clarke via Gcc-patches wrote:
> Power9 ISA added `vabsdub` instruction which is realized in the
> `vec_absd` intrinsic.
>
> Use `vec_absd` for `_mm_sad_epu8` compatibility intrinsic, when
> `_ARCH_PWR9`.
>
> Also, the realization

[PATCH] rs6000: Add optimizations for _mm_sad_epu8

2021-10-22 Thread Paul A. Clarke via Gcc-patches
Power9 ISA added `vabsdub` instruction which is realized in the `vec_absd` intrinsic. Use `vec_absd` for `_mm_sad_epu8` compatibility intrinsic, when `_ARCH_PWR9`. Also, the realization of `vec_sum2s` on little-endian includes two shifts in order to position the input and output to match the sem
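
For readers unfamiliar with the x86 intrinsic being emulated, here is a minimal scalar sketch of what `_mm_sad_epu8` computes: for each 8-byte half of the two 16-byte inputs, the sum of absolute differences of the unsigned bytes, returned as one 64-bit value per half. This is not code from the patch. The function name `sad_epu8_ref` and the plain-array interface are hypothetical, chosen only for illustration. The patch itself works on vector registers, using `vec_absd` for the per-byte absolute difference step when `_ARCH_PWR9` is defined.

```c
#include <stdint.h>

/* Hypothetical scalar reference for _mm_sad_epu8 (illustration only,
   not taken from the patch).  Bytes 0..7 of each input feed out[0],
   bytes 8..15 feed out[1].  Each sum is at most 8 * 255 = 2040, so it
   fits in the low 16 bits of its 64-bit result.  */
static void
sad_epu8_ref (const uint8_t a[16], const uint8_t b[16], uint64_t out[2])
{
  for (int half = 0; half < 2; half++)
    {
      uint64_t sum = 0;
      for (int i = 0; i < 8; i++)
        {
          /* Widen to int before subtracting so the difference can be
             negative; this is the per-byte step that vabsdub performs
             across the whole vector.  */
          int d = (int) a[8 * half + i] - (int) b[8 * half + i];
          sum += (uint64_t) (d < 0 ? -d : d);
        }
      out[half] = sum;
    }
}
```

A scalar model like this can serve as an oracle when comparing the output of a vectorized implementation on both big-endian and little-endian targets, since the byte-to-result mapping here does not depend on endianness.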