Re: [FFmpeg-devel] [PATCH v3] avcodec/mathops: Optimize generic mid_pred function

YunQiang Su Wed, 15 Mar 2023 03:09:44 -0700

Michael Niedermayer <[email protected]> wrote on Wed, Mar 8, 2023 at 04:45:
>
> On Tue, Mar 07, 2023 at 05:08:27PM +0800, Junxian Zhu wrote:
> > From: Junxian Zhu <[email protected]>
> >
> > Rewrite the mid_pred function in the generic mathops.h to reduce branching
> > and improve performance. Because recent compilers can generate assembly for
> > this function that is as short as hand-written code, also remove the
> > MIPS-specific optimized inline-assembly mathops.h.
>
> You write that it improves performance.
> What speed effect does this have exactly?
> Thanks
>


I tested the performance using this code:
```
#include <stdio.h>
#include <time.h>
#include <stdlib.h>

#define FFMIN(a, b) ((a) > (b) ? (b) : (a))
#define FFMAX(a, b) ((a) > (b) ? (a) : (b))

int mid_pred(int a, int b, int c)
{
#if OLD
    if(a>b){
        if(c>b){
            if(c>a) b=a;
            else    b=c;
        }
    }else{
        if(b>c){
            if(c>a) b=c;
            else    b=a;
        }
    }
    return b;
#else
    int t0,t1,t2,t3;
    t0 = (a > b) ? b : a ;
    t1 = (a > b) ? a : b ;
    t2 = (t0 > c) ? t0 : c;
    t3 = (t1 > t2) ? t2 : t1;
    return t3;
#endif
}

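int main(void) {
    int a[1024], b[1024], c[1024], d[1024];
    int j; /* declared here so it is still in scope for the final printf */

    srand(time(NULL));
    for (int i = 0; i < 1024; i++) {
        a[i] = rand();
        b[i] = rand();
        c[i] = rand();
    }
    /* the rand() terms presumably keep the compiler from optimizing
       the benchmark loop away */
    for (j = 0; j < 1e7 + rand() % 2; j++)
        for (int i = 0; i < 1024; i++)
            d[i] = mid_pred(a[i], b[i], c[i]);

    printf("%d, %d\n", d[rand() % 1024], j);
}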
```
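
As a sanity check that the two variants above really return the same value, something like the following could be used. This is only a sketch: mid_pred_old, mid_pred_new and count_mismatches are names made up for this illustration (they are not in the patch), and the check is exhaustive only over a small range that you choose.

```c
/* Branchy variant, copied from the OLD path of the benchmark above. */
int mid_pred_old(int a, int b, int c)
{
    if (a > b) {
        if (c > b) {
            if (c > a) b = a;
            else       b = c;
        }
    } else {
        if (b > c) {
            if (c > a) b = c;
            else       b = a;
        }
    }
    return b;
}

/* Select-based variant: min(max(a, b), max(min(a, b), c)). */
int mid_pred_new(int a, int b, int c)
{
    int lo  = (a > b) ? b : a;     /* min(a, b) */
    int hi  = (a > b) ? a : b;     /* max(a, b) */
    int mid = (lo > c) ? lo : c;   /* max(min(a, b), c) */
    return (hi > mid) ? mid : hi;  /* min(hi, mid) is the median */
}

/* Hypothetical helper: counts the triples in [lo, hi]^3 for which
 * the two variants disagree. */
int count_mismatches(int lo, int hi)
{
    int bad = 0;
    for (int a = lo; a <= hi; a++)
        for (int b = lo; b <= hi; b++)
            for (int c = lo; c <= hi; c++)
                if (mid_pred_old(a, b, c) != mid_pred_new(a, b, c))
                    bad++;
    return bad;
}
```

Calling count_mismatches(-8, 8) from a small driver and checking that it returns 0 covers every ordering of the three inputs, including ties.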

Platform                                               Compiler        Iterations  Old code  New code
Apple M1 (MacOS 13.2)                                  (not stated)    1e7         2.1s      2.3s
Cavium ThunderX / arm64                                GCC 10.2.1 -O3  1e7         52.7s     37.8s
Loongson 3A4000 / mips64el                             GCC 10.2.1 -O3  1e7         90s       5s
Intel Xeon E7-4820 v4 @ 2.00GHz                        GCC 10.2.1 -O3  1e7         14.4s     15.4s
SF19A2890 / MIPS interAptiv                            GCC 10.2.1 -O3  1e7         314s      39.3s
Intel Xeon E7-4820 v4 @ 2.00GHz                        GCC 12.2.0 -O3  1e7         14.4s     8.8s
sifive,bullet0 / rv64imafdc                            GCC 12.2.0 -O3  1e6         11.9s     15.2s
Freescale i.MX53 / ARMv7 rev 5 (v7l)                   GCC 12.2.0 -O3  1e6         24.1s     15.7s
POWER8 (architected), altivec, big endian, ppc64       GCC 12.2.0 -O3  1e7         43.1s     50.8s
POWER8 (architected), altivec, little endian, ppc64el  GCC 12.2.0 -O3  1e7         7.8s      4.7s
PA8900 (Shortfin) PA-RISC                              GCC 12.2.0 -O3  1e6         39.9s     47.2s
IBM S390 (s390x)                                       GCC 12.2.0 -O3  1e7         82.2s     30.8s
Intel Itanium Processor 9320                           GCC 12.2.0 -O3  1e7         89.5s     78.1s
Cavium Octeon III V0.2 FPU V0.0 / mipsel               GCC 12.2.0 -O3  1e7         117.5s    118.5s
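
To compare the platforms at a glance, the timings above can be turned into old/new ratios, where a ratio above 1 means the new code ran faster. A rough sketch follows; struct result, speedup, count_faster and print_speedups are just names for this illustration, and the numbers are copied from the results above. Rows measured with 1e6 iterations are still comparable, since each ratio uses the old and new time from the same row.

```c
#include <stdio.h>
#include <stddef.h>

struct result {
    const char *platform;
    double old_s;   /* run time with the old code, in seconds */
    double new_s;   /* run time with the new code, in seconds */
};

/* Timings as reported in this mail. */
const struct result results[] = {
    { "Apple M1",                  2.1,   2.3 },
    { "Cavium ThunderX arm64",    52.7,  37.8 },
    { "Loongson 3A4000",          90.0,   5.0 },
    { "Xeon E7-4820 v4, GCC 10",  14.4,  15.4 },
    { "MIPS interAptiv",         314.0,  39.3 },
    { "Xeon E7-4820 v4, GCC 12",  14.4,   8.8 },
    { "sifive,bullet0 rv64",      11.9,  15.2 },
    { "Freescale i.MX53 ARMv7",   24.1,  15.7 },
    { "POWER8 ppc64 (BE)",        43.1,  50.8 },
    { "POWER8 ppc64el (LE)",       7.8,   4.7 },
    { "PA8900 PA-RISC",           39.9,  47.2 },
    { "IBM s390x",                82.2,  30.8 },
    { "Itanium 9320",             89.5,  78.1 },
    { "Cavium Octeon III mipsel", 117.5, 118.5 },
};
#define N_RESULTS (sizeof results / sizeof results[0])

/* Ratio old/new: above 1.0 means the new code is faster. */
double speedup(double old_s, double new_s)
{
    return old_s / new_s;
}

/* Number of rows where the new code took less time than the old code. */
int count_faster(const struct result *r, size_t n)
{
    int faster = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].new_s < r[i].old_s)
            faster++;
    return faster;
}

/* Prints one "platform  ratio" line per row. */
void print_speedups(const struct result *r, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%-26s %6.2fx\n", r[i].platform,
               speedup(r[i].old_s, r[i].new_s));
}
```

Calling print_speedups(results, N_RESULTS) from a small driver lists the ratios; count_faster(results, N_RESULTS) gives the number of configurations where the new code was faster.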




> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> It is dangerous to be right in matters on which the established authorities
> are wrong. -- Voltaire
> _______________________________________________
> ffmpeg-devel mailing list
> [email protected]
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> [email protected] with subject "unsubscribe".