On Wed, Aug 28, 2024 at 10:43 PM Ramiro Polla <[email protected]> wrote:
>
> The mmxext implementation is slower than the C version.
>
> rgb24toyv12_16_200_c:                                14812.6 ( 1.00x)
> rgb24toyv12_16_200_mmxext:                           17400.4 ( 0.85x)
> rgb24toyv12_128_60_c:                                35616.9 ( 1.00x)
> rgb24toyv12_128_60_mmxext:                           39610.4 ( 0.90x)
> rgb24toyv12_512_16_c:                                37209.4 ( 1.00x)
> rgb24toyv12_512_16_mmxext:                           41136.2 ( 0.90x)
> rgb24toyv12_1920_4_c:                                34737.4 ( 1.00x)
> rgb24toyv12_1920_4_mmxext:                           34818.9 ( 1.00x)
> rgb24toyv12_1920_4_negstride_c:                      34855.2 ( 1.00x)
> rgb24toyv12_1920_4_negstride_mmxext:                 34773.7 ( 1.00x)
> ---
>  libswscale/x86/rgb2rgb.c | 207 ---------------------------------------
>  1 file changed, 207 deletions(-)

It is actually still faster on x86_32. The new patch (attached) only
disables it on x86_64 instead of removing it.
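
For anyone reading the numbers: the ratio in parentheses appears to be the C
time divided by the mmxext time, so a value below 1.00x means the mmxext
version is slower than C. A minimal sketch of that arithmetic follows; the
helper names `speedup` and `simd_is_faster` are hypothetical and not part of
checkasm.

```c
/* Hypothetical helper: ratio of the reference (C) time to the SIMD time.
 * A result above 1.0 means the SIMD version is faster than C. */
static double speedup(double c_time, double simd_time)
{
    return c_time / simd_time;
}

/* Hypothetical helper: nonzero when the SIMD version beats the C version. */
static int simd_is_faster(double c_time, double simd_time)
{
    return speedup(c_time, simd_time) > 1.0;
}
```

With the 16x200 numbers, `speedup(14812.6, 17400.4)` is about 0.85 on x86_64,
while `speedup(24942.7, 17857.2)` is about 1.40 on x86_32, which is why the
function is kept for 32-bit builds only.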

From 8619788f0cd34e600ea8743d770db94873be8693 Mon Sep 17 00:00:00 2001
From: Ramiro Polla <[email protected]>
Date: Wed, 28 Aug 2024 20:03:39 +0200
Subject: [PATCH v2 3/4] swscale/x86/rgb2rgb: disable rgb24toyv12_mmxext for
 x86_64

The mmxext implementation is slower than the C version on x86_64.

                                m32               m64
rgb24toyv12_16_200_c:       24942.7           14812.6
rgb24toyv12_16_200_mmxext:  17857.2 ( 1.40x)  17400.4 ( 0.85x)
rgb24toyv12_128_60_c:       56892.9           35616.9
rgb24toyv12_128_60_mmxext:  40730.9 ( 1.40x)  39610.4 ( 0.90x)
rgb24toyv12_512_16_c:       58402.7           37209.4
rgb24toyv12_512_16_mmxext:  44842.4 ( 1.30x)  41136.2 ( 0.90x)
rgb24toyv12_1920_4_c:       54827.4           34737.4
rgb24toyv12_1920_4_mmxext:  51169.9 ( 1.07x)  34818.9 ( 1.00x)
---
 libswscale/x86/rgb2rgb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libswscale/x86/rgb2rgb.c b/libswscale/x86/rgb2rgb.c
index 4d6ba9ff21..46a82c3f09 100644
--- a/libswscale/x86/rgb2rgb.c
+++ b/libswscale/x86/rgb2rgb.c
@@ -1480,7 +1480,7 @@ static inline void planar2x_mmxext(const uint8_t *src, uint8_t *dst, int srcWidt
  * others are ignored in the C version.
  * FIXME: Write HQ version.
  */
-#if HAVE_7REGS
+#if ARCH_X86_32 && HAVE_7REGS
 static inline void rgb24toyv12_mmxext(const uint8_t *src, uint8_t *ydst, uint8_t *udst, uint8_t *vdst,
                                        int width, int height,
                                        int lumStride, int chromStride, int srcStride,
@@ -2257,9 +2257,9 @@ static av_cold void rgb2rgb_init_mmxext(void)
     yuyvtoyuv422       = yuyvtoyuv422_mmxext;
 
     planar2x           = planar2x_mmxext;
-#if HAVE_7REGS
+#if ARCH_X86_32 && HAVE_7REGS
     ff_rgb24toyv12     = rgb24toyv12_mmxext;
-#endif /* HAVE_7REGS */
+#endif /* ARCH_X86_32 && HAVE_7REGS */
 
     yuyvtoyuv420       = yuyvtoyuv420_mmxext;
     uyvytoyuv420       = uyvytoyuv420_mmxext;
-- 
2.30.2
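
The effect of the changed guard can be sketched in isolation: when the
condition is false, both the function and the pointer assignment are compiled
out, so the C fallback stays installed. In the sketch below, `ARCH_X86_32` and
`HAVE_7REGS` are defined by hand (in FFmpeg they come from the generated build
configuration), and `conv_fn`, `conv_c`, `conv_mmxext`, `conv` and `init_conv`
are hypothetical stand-ins, not the real swscale functions.

```c
/* Assumption: set by hand for this sketch; a real build gets these
 * 0/1 values from its configuration header. */
#ifndef ARCH_X86_32
#define ARCH_X86_32 0
#endif
#ifndef HAVE_7REGS
#define HAVE_7REGS 1
#endif

/* Hypothetical converter type: returns the name of the chosen version. */
typedef const char *(*conv_fn)(void);

static const char *conv_c(void) { return "c"; }

#if ARCH_X86_32 && HAVE_7REGS
/* Only compiled for 32-bit x86 builds that have 7 usable registers. */
static const char *conv_mmxext(void) { return "mmxext"; }
#endif

/* The C version is the default until an init function overrides it. */
static conv_fn conv = conv_c;

static void init_conv(void)
{
#if ARCH_X86_32 && HAVE_7REGS
    conv = conv_mmxext;
#endif /* ARCH_X86_32 && HAVE_7REGS */
}
```

With the defaults above (a 64-bit build), calling `init_conv()` leaves `conv`
pointing at `conv_c`; compiling with `-DARCH_X86_32=1` switches it to
`conv_mmxext`.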

