On Tue, May 1, 2018 at 10:02 AM, Paul B Mahol <[email protected]> wrote:
> +cglobal overlay_row_22, 6, 8, 8, 0, d, da, s, a, w, al, r, x
[...]
> + movu m2, [aq+2*xq]
> + pand m2, m3
> + movu m6, [aq+2*xq]
> + pand m6, m7
> + psrlw m6, 8
> + paddw m2, m6
> + psrlw m2, 1
> + movu m6, [aq+2*xq]
> + pand m6, m3
> + paddw m2, m6
> + psrlw m2, 1
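
I believe this can be simplified to something like (untested):

    movu   m1, [aq+2*xq]
    pandn  m2, m3, m1      ; m2 = ~m3 & m1: high bytes kept in place (assuming m3 = 0x00ff per word)
    psllw  m1, 8           ; m1 = low bytes moved into the high half
    pavgw  m2, m1          ; (hi + lo) / 2, still scaled by 256
    pavgw  m2, m1          ; averaged with lo again: (hi + 3*lo) / 4, scaled by 256
    psrlw  m2, 8           ; drop the scale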
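
Since the replacement above is marked untested, a scalar model of one 16-bit lane is a cheap way to compare it with the patch's sequence. The sketch below is only that, a model: it assumes m3 holds 0x00ff and m7 holds 0xff00 in every word (inferred from the shifts, not shown in the quoted hunk), and the function names are made up for illustration. Both pavgw steps see an even sum here, so the round-up bit should never affect the result.

```python
def pavgw(x, y):
    """One lane of PAVGW: unsigned 16-bit average, rounded up."""
    return (x + y + 1) >> 1


def row22_original(word):
    """The patch's sequence for one word holding two adjacent alpha bytes."""
    lo = word & 0x00FF             # pand m2, m3   (assumed mask 0x00ff)
    hi = (word & 0xFF00) >> 8      # pand m6, m7 ; psrlw m6, 8   (assumed mask 0xff00)
    t = (lo + hi) >> 1             # paddw m2, m6 ; psrlw m2, 1
    return (t + lo) >> 1           # paddw m2, m6 ; psrlw m2, 1


def row22_proposed(word):
    """The suggested pandn/psllw/pavgw/pavgw/psrlw sequence for one word."""
    hi_in_place = ~0x00FF & word & 0xFFFF   # pandn m2, m3, m1
    lo_shifted = (word << 8) & 0xFFFF       # psllw m1, 8
    t = pavgw(hi_in_place, lo_shifted)      # pavgw m2, m1
    t = pavgw(t, lo_shifted)                # pavgw m2, m1
    return t >> 8                           # psrlw m2, 8


# Exhaustive over every possible pair of alpha bytes.
mismatches = [w for w in range(0x10000)
              if row22_original(w) != row22_proposed(w)]
print(len(mismatches))  # prints 0
```

Both versions reduce to floor((hi + 3*lo) / 4) per word, which is why the exhaustive loop finds no differences. This says nothing about register pressure or speed, only about the arithmetic.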
> +cglobal overlay_row_20, 6, 8, 8, 0, d, da, s, a, w, al, r, x
[...]
> + movu m2, [aq+2*xq]
> + pand m2, m3
> + movu m6, [aq+2*xq]
> + pand m6, m7
> + psrlw m6, 8
> + paddw m2, m6
> + movu m6, [daq+2*xq]
> + pand m6, m3
> + paddw m2, m6
> + movu m6, [daq+2*xq]
> + pand m6, m7
> + psrlw m6, 8
> + paddw m2, m6
> + psrlw m2, 2
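
And this to (untested):

    mova       m6, [pb_1]       ; every byte = 1
    ...
    movu       m2, [aq+2*xq]
    movu       m1, [daq+2*xq]
    pmaddubsw  m2, m6           ; per word: lo(a) + hi(a)  (SSSE3 instruction)
    pmaddubsw  m1, m6           ; per word: lo(da) + hi(da)
    paddw      m2, m1           ; sum of the four alpha samples
    psrlw      m2, 2            ; divide by 4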
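
The pmaddubsw version can be sanity-checked the same way with a scalar model of one 16-bit lane. This is a sketch under assumptions: m3 = 0x00ff and m7 = 0xff00 per word in the patch, pb_1 is all bytes set to 1, and the helper names are hypothetical. pmaddubsw treats the destination bytes as unsigned and the source bytes as signed, so the alpha data has to be the destination, as it is above.

```python
def pmaddubsw_lane(u0, u1, s0, s1):
    """One lane of PMADDUBSW: unsigned bytes (destination) times signed
    bytes (source), adjacent products added with signed saturation."""
    total = u0 * s0 + u1 * s1
    return max(-32768, min(32767, total))


def byte_pair_sum(word):
    """pmaddubsw against pb_1 for one word: low byte + high byte."""
    return pmaddubsw_lane(word & 0xFF, word >> 8, 1, 1)


def row20_original(a_word, da_word):
    """The patch's mask/shift/add sequence for one word of each alpha row."""
    total = (a_word & 0x00FF) + ((a_word & 0xFF00) >> 8)
    total += (da_word & 0x00FF) + ((da_word & 0xFF00) >> 8)
    return (total & 0xFFFF) >> 2            # paddw wraps at 16 bits; psrlw m2, 2


def row20_proposed(a_word, da_word):
    """The suggested pmaddubsw/pmaddubsw/paddw/psrlw sequence."""
    total = byte_pair_sum(a_word) + byte_pair_sum(da_word)
    return (total & 0xFFFF) >> 2


# Every possible word for the horizontal add, then a few corner cases
# for the full two-row computation.
lane_ok = all(byte_pair_sum(w) == (w & 0xFF) + (w >> 8)
              for w in range(0x10000))
corners = [0x0000, 0x00FF, 0xFF00, 0xFFFF, 0x1234, 0x80FE]
rows_ok = all(row20_original(a, d) == row20_proposed(a, d)
              for a in corners for d in corners)
print(lane_ok, rows_ok)
```

The largest per-word sum is 255 + 255 = 510, so the signed saturation in pmaddubsw never triggers with a multiplier of 1.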