May 9, 2023, 11:51 by [email protected]:

> We are submitting a set of patches that significantly improve H.264 decoding performance
> by utilizing RVV intrinsic code. The average speedup(FPS) achieved by these patches is more than 2x,
> as experimented on 720P videos running on an internal FPGA board.
>
> Patch1: add support for RVV intrinsic code in the configure file
> Patch2: optimize chroma motion compensation
> Patch3: optimize luma motion compensation
> Patch4: optimize dsp functions, such as IDCT, in-loop filtering, and weighed filtering
> Patch5: optimize intra prediction
>
> Arnie Chang (5):
>   configure: Add detection of RISC-V vector intrinsic support
>   lavc/h264chroma: Add vectorized implementation of chroma MC for RISC-V
>   lavc/h264qpel: Add vectorized implementation of luma MC for RISC-V
>   lavc/h264dsp: Add vectorized implementation of DSP functions for RISC-V
>   lavc/h264pred: Add vectorized implementation of intra prediction for RISC-V
>
Could you rewrite this in asm instead? I'd like RISC-V to follow the same policy we have for Arm: no intrinsics.
There's a long list of reasons we don't use intrinsics, which I won't get into. Just a few days ago, I discovered that our PPC intrinsics were performing quite badly due to compiler issues, in some cases 500x slower than C.

Also, we don't care about overall speedup. We have checkasm --bench to measure the per-function speedup over C.
