Re: [Pixman] [PATCH] mmx: add src_8888_0565

Matt Turner Fri, 20 Apr 2012 20:29:01 -0700

On Fri, Apr 20, 2012 at 3:43 PM, Matt Turner <[email protected]> wrote:
> On Thu, Apr 19, 2012 at 5:40 PM, Matt Turner <[email protected]> wrote:
>> Uses the pmadd technique described in
>> http://software.intel.com/sites/landingpage/legacy/mmx/MMX_App_24-16_Bit_Conversion.pdf
>> +static force_inline __m64
>> +pack_4xpacked565 (__m64 a, __m64 b)
>> +{
>> +    __m64 rb0 = _mm_and_si64 (a, MC (packed_565_rb));
>> +    __m64 rb1 = _mm_and_si64 (b, MC (packed_565_rb));
>> +
>> +    __m64 t0 = _mm_madd_pi16 (rb0, MC (565_pack_multiplier));
>> +    __m64 t1 = _mm_madd_pi16 (rb1, MC (565_pack_multiplier));
>> +
>> +    __m64 g0 = _mm_and_si64 (a, MC (packed_565_g));
>> +    __m64 g1 = _mm_and_si64 (b, MC (packed_565_g));
>> +
>> +    t0 = _mm_or_si64 (t0, g0);
>> +    t1 = _mm_or_si64 (t1, g1);
>> +
>> +    t0 = shift(t0, -5);
>> +    t1 = shift(t1, -5 + 16);
>> +
>> +    return _mm_shuffle_pi16 (_mm_or_si64 (t0, t1), _MM_SHUFFLE (3, 1, 2, 0));
>> +}
>
> I think the return statement can be simplified with a _mm_packs_pi32,
> but I couldn't get it to work. If someone has a chance to take a look,
> I'd be very appreciative.
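
For anyone following along, here is a rough scalar model of what one 32-bit lane of the quoted pack_4xpacked565 computes. It is only a sketch: the helper names madd_lane and pack_565_scalar are made up for illustration, and the constant values are my assumption of what MC (packed_565_rb), MC (565_pack_multiplier) and MC (packed_565_g) expand to per lane, since the quoted hunk does not show them.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of pmaddwd (_mm_madd_pi16): multiply the
 * two signed 16-bit halves of x by the matching halves of m, then add the
 * two products. */
static uint32_t
madd_lane (uint32_t x, uint32_t m)
{
    int32_t lo = (int16_t) (x & 0xffff) * (int16_t) (m & 0xffff);
    int32_t hi = (int16_t) (x >> 16) * (int16_t) (m >> 16);

    return (uint32_t) (lo + hi);
}

/* Hypothetical helper: convert one x8r8g8b8 pixel to r5g6b5 using the same
 * steps as one lane of the patch. All three constants are assumed values. */
static uint16_t
pack_565_scalar (uint32_t p)
{
    /* Keep the top 5 bits of red (bits 19-23) and blue (bits 3-7). */
    uint32_t rb = p & 0x00f800f8;

    /* blue * 4 moves blue to bits 5-9; red * 0x2000 moves red to bits
     * 16-20 once the word offset of 16 is counted. */
    uint32_t t = madd_lane (rb, 0x20000004);

    /* Keep the top 6 bits of green, already at bits 10-15. */
    uint32_t g = p & 0x0000fc00;

    /* r5 g6 b5 now sit at bits 5-20; shift them down to bits 0-15. */
    return (uint16_t) ((t | g) >> 5);
}
```

In the patch the second register is shifted left by 11 instead (-5 + 16), so its two results land in the high word of each lane and the final OR interleaves the four pixels, which is why the shuffle is needed to put them back in order.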


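I realized while talking with Søren on IRC that the code in the PDF
converts to 555, which is why packssdw works there: a 555 value fits
in 15 bits, so signed saturation never changes it. A 565 value can
have bit 15 set, and packssdw would clamp it to 0x7fff. We'd need
packusdw here, but it wasn't added until SSE 4.1.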
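
To make the saturation difference concrete, here is a small scalar sketch of a single lane of each pack instruction. The function names packs_lane and packus_lane are hypothetical; they only model the documented saturation behaviour and are not pixman code.

```c
#include <stdint.h>

/* One lane of packssdw (_mm_packs_pi32): signed 32-bit input, signed
 * 16-bit output with signed saturation. */
static uint16_t
packs_lane (int32_t v)
{
    if (v > INT16_MAX)
        v = INT16_MAX;
    if (v < INT16_MIN)
        v = INT16_MIN;

    return (uint16_t) (int16_t) v;
}

/* One lane of packusdw (SSE 4.1): signed 32-bit input, unsigned 16-bit
 * output with unsigned saturation. */
static uint16_t
packus_lane (int32_t v)
{
    if (v > UINT16_MAX)
        v = UINT16_MAX;
    if (v < 0)
        v = 0;

    return (uint16_t) v;
}
```

With these, a 555 value such as 0x7c00 passes through packs_lane unchanged, while a 565 value such as 0xf800 is clamped to 0x7fff; packus_lane keeps 0xf800 intact.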

It looks like the ffmpeg 888 -> 565 MMX code unpacks the input in a
way that avoids needing to repack it at the end, but I don't think
that is an improvement over an extra shuffle at the end. I'll play
with it some and see.
