On Wed, Aug 28, 2024 at 10:43 PM Ramiro Polla <[email protected]> wrote:
>
> The mmxext implementation is slower than the C version.
>
> rgb24toyv12_16_200_c:                 14812.6 ( 1.00x)
> rgb24toyv12_16_200_mmxext:            17400.4 ( 0.85x)
> rgb24toyv12_128_60_c:                 35616.9 ( 1.00x)
> rgb24toyv12_128_60_mmxext:            39610.4 ( 0.90x)
> rgb24toyv12_512_16_c:                 37209.4 ( 1.00x)
> rgb24toyv12_512_16_mmxext:            41136.2 ( 0.90x)
> rgb24toyv12_1920_4_c:                 34737.4 ( 1.00x)
> rgb24toyv12_1920_4_mmxext:            34818.9 ( 1.00x)
> rgb24toyv12_1920_4_negstride_c:       34855.2 ( 1.00x)
> rgb24toyv12_1920_4_negstride_mmxext:  34773.7 ( 1.00x)
> ---
>  libswscale/x86/rgb2rgb.c | 207 ---------------------------------------
>  1 file changed, 207 deletions(-)

It's actually still faster than the C version on x86_32. The attached new patch only disables it for x86_64 instead of removing it.
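
For anyone not familiar with how the guard works, here is a minimal standalone sketch of the pattern. It is not the swscale code: `convert_fn`, `convert_c`, `convert_mmxext`, `convert` and `init_mmxext` are hypothetical names, and the two macros are given local defaults here (a 64-bit build) only so the sketch compiles on its own. In the real tree they come from the build configuration.

```c
/* Hypothetical defaults standing in for the build-time configuration:
 * a 64-bit build that has 7 usable registers. */
#ifndef ARCH_X86_32
#define ARCH_X86_32 0
#endif
#ifndef HAVE_7REGS
#define HAVE_7REGS 1
#endif

/* Hypothetical converter type; it only reports which version was chosen. */
typedef const char *(*convert_fn)(void);

/* Portable fallback, always compiled. */
static const char *convert_c(void)
{
    return "c";
}

/* The optimized version is compiled only on 32-bit builds with 7 registers. */
#if ARCH_X86_32 && HAVE_7REGS
static const char *convert_mmxext(void)
{
    return "mmxext";
}
#endif

/* The dispatch pointer starts at the C version. */
static convert_fn convert = convert_c;

/* The init routine overrides the pointer only when the same guard holds,
 * so definition and assignment cannot get out of sync. */
static void init_mmxext(void)
{
#if ARCH_X86_32 && HAVE_7REGS
    convert = convert_mmxext;
#endif
}
```

With the defaults above, calling `init_mmxext()` leaves `convert` pointing at the C version; building the sketch with `-DARCH_X86_32=1` selects the mmxext stand-in instead. The guard has to appear on both the function definition and the assignment, which is why the patch touches two places.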

From 8619788f0cd34e600ea8743d770db94873be8693 Mon Sep 17 00:00:00 2001
From: Ramiro Polla <[email protected]>
Date: Wed, 28 Aug 2024 20:03:39 +0200
Subject: [PATCH v2 3/4] swscale/x86/rgb2rgb: disable rgb24toyv12_mmxext for
 x86_64

The mmxext implementation is slower than the C version in x86_64.

                                  m32                 m64
rgb24toyv12_16_200_c:         24942.7             14812.6
rgb24toyv12_16_200_mmxext:    17857.2 ( 1.40x)    17400.4 ( 0.85x)
rgb24toyv12_128_60_c:         56892.9             35616.9
rgb24toyv12_128_60_mmxext:    40730.9 ( 1.40x)    39610.4 ( 0.90x)
rgb24toyv12_512_16_c:         58402.7             37209.4
rgb24toyv12_512_16_mmxext:    44842.4 ( 1.30x)    41136.2 ( 0.90x)
rgb24toyv12_1920_4_c:         54827.4             34737.4
rgb24toyv12_1920_4_mmxext:    51169.9 ( 1.07x)    34818.9 ( 1.00x)
---
 libswscale/x86/rgb2rgb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libswscale/x86/rgb2rgb.c b/libswscale/x86/rgb2rgb.c
index 4d6ba9ff21..46a82c3f09 100644
--- a/libswscale/x86/rgb2rgb.c
+++ b/libswscale/x86/rgb2rgb.c
@@ -1480,7 +1480,7 @@ static inline void planar2x_mmxext(const uint8_t *src, uint8_t *dst, int srcWidt
  * others are ignored in the C version.
  * FIXME: Write HQ version.
  */
-#if HAVE_7REGS
+#if ARCH_X86_32 && HAVE_7REGS
 static inline void rgb24toyv12_mmxext(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                                        int width, int height,
                                        int lumStride, int chromStride, int srcStride,
@@ -2257,9 +2257,9 @@ static av_cold void rgb2rgb_init_mmxext(void)
     yuyvtoyuv422 = yuyvtoyuv422_mmxext;
 
     planar2x = planar2x_mmxext;
-#if HAVE_7REGS
+#if ARCH_X86_32 && HAVE_7REGS
     ff_rgb24toyv12 = rgb24toyv12_mmxext;
-#endif /* HAVE_7REGS */
+#endif /* ARCH_X86_32 && HAVE_7REGS */
 
     yuyvtoyuv420 = yuyvtoyuv420_mmxext;
     uyvytoyuv420 = uyvytoyuv420_mmxext;
-- 
2.30.2
