Re: [FFmpeg-devel] [PATCH] lavc/aarch64: Add neon implementation for sse8

Martin Storsjö Thu, 04 Aug 2022 01:04:38 -0700

On Mon, 25 Jul 2022, Hubert Mazur wrote:

> Provide optimized implementation of sse8 function for arm64.
>
> Performance comparison tests are shown below.
> - sse_1_c: 133.0
> - sse_1_neon: 36.7
>
> Benchmarks and tests run with checkasm tool on AWS Graviton 3.
>
> Signed-off-by: Hubert Mazur <[email protected]>
> ---
> libavcodec/aarch64/me_cmp_init_aarch64.c |  3 +
> libavcodec/aarch64/me_cmp_neon.S         | 72 ++++++++++++++++++++++++
> 2 files changed, 75 insertions(+)
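
For readers without the patch at hand, here is a rough scalar sketch of what an sse8-style comparison computes: the sum of squared differences over a block 8 pixels wide and h rows tall. The name sse8_ref and its signature are illustrative assumptions, not the code from the patch; FFmpeg's own me_cmp functions also take a context argument, omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference (not the patch's code): sum of squared
 * differences over a block 8 pixels wide and h rows tall. Both buffers
 * are assumed to share the same row stride, given in bytes. */
static int sse8_ref(const uint8_t *pix1, const uint8_t *pix2,
                    ptrdiff_t stride, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < 8; x++) {
            int d = pix1[x] - pix2[x];  /* difference in [-255, 255] */
            sum += d * d;
        }
        /* Advance both pointers to the next row. */
        pix1 += stride;
        pix2 += stride;
    }
    return sum;
}
```

A NEON version would typically process a whole 8-byte row per iteration instead of one pixel at a time, which is where a speedup like the one quoted above would come from.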


The same comments as for sse16 and sse4 apply here too.

// Martin
