On Tue, 3 Dec 2024, Zhao Zhili wrote:
From: Zhao Zhili <[email protected]>
Test on rpi 5 with gcc 12:
apply_bdof_8_8x16_c: 7315.2 ( 1.00x)
apply_bdof_8_8x16_neon: 1876.8 ( 3.90x)
apply_bdof_8_16x8_c: 7170.5 ( 1.00x)
apply_bdof_8_16x8_neon: 1752.8 ( 4.09x)
apply_bdof_8_16x16_c: 14695.2 ( 1.00x)
apply_bdof_8_16x16_neon: 3490.5 ( 4.21x)
apply_bdof_10_8x16_c: 7371.5 ( 1.00x)
apply_bdof_10_8x16_neon: 1863.8 ( 3.96x)
apply_bdof_10_16x8_c: 7172.0 ( 1.00x)
apply_bdof_10_16x8_neon: 1766.0 ( 4.06x)
apply_bdof_10_16x16_c: 14551.5 ( 1.00x)
apply_bdof_10_16x16_neon: 3576.0 ( 4.07x)
apply_bdof_12_8x16_c: 7236.5 ( 1.00x)
apply_bdof_12_8x16_neon: 1863.8 ( 3.88x)
apply_bdof_12_16x8_c: 7316.5 ( 1.00x)
apply_bdof_12_16x8_neon: 1758.8 ( 4.16x)
apply_bdof_12_16x16_c: 14691.2 ( 1.00x)
apply_bdof_12_16x16_neon: 3480.5 ( 4.22x)
---
libavcodec/aarch64/vvc/dsp_init.c | 21 ++
libavcodec/aarch64/vvc/inter.S | 399 +++++++++++++++++++++++++++
libavcodec/aarch64/vvc/of_template.c | 64 +++++
3 files changed, 484 insertions(+)
create mode 100644 libavcodec/aarch64/vvc/of_template.c
Thanks, this looks ok to me.
(I don't personally use .req to give registers local names like this patch
does, but it works fine with our tools, so if you prefer doing it that
way, that's totally ok.)
// Martin
_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".