Re: [Pixman] [PATCH 1/4] vmx: optimize scaled_nearest_scanline_vmx_8888_8888_OVER

Pekka Paalanen Mon, 07 Sep 2015 04:04:34 -0700

On Sun,  6 Sep 2015 18:27:08 +0300
Oded Gabbay <[email protected]> wrote:


> This patch optimizes scaled_nearest_scanline_vmx_8888_8888_OVER and all
> the functions it calls (combine1, combine4 and
> core_combine_over_u_pixel_vmx).
> 
> The optimization is done by removing use of expand_alpha_1x128 and
> expand_alpha_2x128 in favor of splat_alpha and MUL/ADD macros from
> pixman_combine32.h.
> 
> Running "lowlevel-blt-bench -n over_8888_8888" on POWER8, 8 cores,
> 3.4GHz, RHEL 7.2 ppc64le gave the following results:
> 
> reference memcpy speed = 24847.3MB/s (6211.8MP/s for 32bpp fills)
> 
>                 Before          After           Change
>               --------------------------------------------
> L1              182.05          210.22         +15.47%
> L2              180.6           208.92         +15.68%
> M               180.52          208.22         +15.34%
> HT              130.17          178.97         +37.49%
> VT              145.82          184.22         +26.33%
> R               104.51          129.38         +23.80%
> RT              48.3            61.54          +27.41%
> Kops/s          430             504            +17.21%
> 
> Signed-off-by: Oded Gabbay <[email protected]>
> ---
>  pixman/pixman-vmx.c | 80 ++++++++++++-----------------------------------------
>  1 file changed, 18 insertions(+), 62 deletions(-)
> 
> diff --git a/pixman/pixman-vmx.c b/pixman/pixman-vmx.c
> index a9bd024..d9fc5d6 100644
> --- a/pixman/pixman-vmx.c
> +++ b/pixman/pixman-vmx.c

> @@ -646,19 +643,10 @@ static force_inline uint32_t
>  combine1 (const uint32_t *ps, const uint32_t *pm)
>  {
>      uint32_t s = *ps;
> +    uint32_t a = ALPHA_8(*pm);

pm is dereferenced here before it is checked for NULL; the load needs to
happen only inside the "if (pm)" branch below.

>  
>      if (pm)
> -    {
> -     vector unsigned int ms, mm;
> -
> -     mm = unpack_32_1x128 (*pm);
> -     mm = expand_alpha_1x128 (mm);
> -
> -     ms = unpack_32_1x128 (s);
> -     ms = pix_multiply (ms, mm);
> -
> -     s = pack_1x128_32 (ms);
> -    }
> +     UN8x4_MUL_UN8(s, a);
>  
>      return s;
>  }

Thanks,
pq


