Re: [FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1

2024-08-18 Thread Rémi Denis-Courmont
On 18 August 2024 23:13:21 GMT+03:00, Ramiro Polla wrote:
>                   A53               A76
> pix_norm1_c:     519.2             231.5
> pix_norm1_neon:  195.0 ( 2.66x)     44.2 ( 5.24x)
> pix_sum_c:       344.5             242.2
> pix_sum_neon:    119.0 ( 2.89x)     41.7 ( 5.81x)
> ---
> libavcodec/

Re: [FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1

2024-08-18 Thread Ramiro Polla
On Sun, Aug 18, 2024 at 10:43 PM Martin Storsjö wrote:
> On Sun, 18 Aug 2024, Ramiro Polla wrote:
>
> >                   A53               A76
> > pix_norm1_c:     519.2             231.5
> > pix_norm1_neon:  195.0 ( 2.66x)     44.2 ( 5.24x)
> > pix_sum_c:       344.5             242.2
> > pix_sum_neon:

Re: [FFmpeg-devel] [PATCH] Fix nullptr dereference with invalid encryption metadata.

2024-08-18 Thread Michael Niedermayer
On Fri, Aug 02, 2024 at 03:08:29PM -0700, Dale Curtis wrote: > Found by fuzzer. > > Bug: https://crbug.com/356720789 > Signed-off-by: Dale Curtis > --- > libavformat/mov.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > mov.c |8 ++-- > 1 file changed, 6 insertions(+)

Re: [FFmpeg-devel] [PATCH 3/7] avcodec/aarch64/mpegvideoencdsp: add dotprod implementation for pix_norm1

2024-08-18 Thread Martin Storsjö
On Sun, 18 Aug 2024, Ramiro Polla wrote:

                    A76
pix_norm1_c:       231.5
pix_norm1_neon:     44.2 ( 5.24x)
pix_norm1_dotprod:  20.7 (11.18x)
---
 libavcodec/aarch64/mpegvideoencdsp_init.c | 10
 libavcodec/aarch64/mpegvideoencdsp_neon.S | 28 +++
 2 f

Re: [FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1

2024-08-18 Thread Martin Storsjö
On Sun, 18 Aug 2024, Ramiro Polla wrote:

                  A53               A76
pix_norm1_c:     519.2             231.5
pix_norm1_neon:  195.0 ( 2.66x)     44.2 ( 5.24x)
pix_sum_c:       344.5             242.2
pix_sum_neon:    119.0 ( 2.89x)     41.7 ( 5.81x)
---

Hmm, those speedups on the A53 look q

Re: [FFmpeg-devel] [PATCH v2 5/5] swscale/aarch64/yuv2rgb: add neon yuv42{0, 2}p -> gbrp unscaled colorspace converters

2024-08-18 Thread Ramiro Polla
On Wed, Aug 14, 2024 at 2:41 PM Martin Storsjö wrote:
> On Tue, 6 Aug 2024, Ramiro Polla wrote:
> > checkasm --bench on a Raspberry Pi 5 Model B Rev 1.0:
> > yuv420p_gbrp_128_c:       1243.0
> > yuv420p_gbrp_128_neon:     453.5
> > yuv420p_gbrp_1920_c:     18165.5
> > yuv420p_gbrp_1920_neon:   6700.0
> > yuv422p_

Re: [FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1

2024-08-18 Thread Ramiro Polla
On Sun, Aug 18, 2024 at 10:13 PM Ramiro Polla wrote:
>
>                   A53               A76
> pix_norm1_c:     519.2             231.5
> pix_norm1_neon:  195.0 ( 2.66x)     44.2 ( 5.24x)
> pix_sum_c:       344.5             242.2
> pix_sum_neon:    119.0 ( 2.89x)     41.7 ( 5.81x)

This new patchset n

[FFmpeg-devel] [PATCH 7/7] avcodec/mpegvideoencdsp: speed up draw_edges_8_c by inlining it for all used edge widths

2024-08-18 Thread Ramiro Polla
This commit also restricts w to 4, 8, or 16.

Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz:
                            before     after
draw_edges_8_1724_4_c:     45074.5    7144.7 ( 6.31x)
draw_edges_8_1724_8_c:     41716.5    7216.0 ( 5.78x)
draw_edges_8_1724_16_c:    45282.7   16026.2 ( 2.83x)
draw_edge
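
The inlining idea in this patch can be illustrated with a small sketch. It is not the code from the patch: the names `pad_edges_lr_inline` and `pad_edges_lr` are hypothetical, and only the left/right padding is shown. The point is that once `w` is a compile-time constant at each call site, the compiler is free to specialize the worker for that width.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical worker: replicates the first and last pixel of every row
 * into the w bytes to the left and right of it. When w is a constant at
 * the call site, the memset() calls can become fixed-size stores. */
static inline void pad_edges_lr_inline(uint8_t *buf, ptrdiff_t wrap,
                                       int width, int height, int w)
{
    for (int y = 0; y < height; y++) {
        uint8_t *row = buf + y * wrap;
        memset(row - w, row[0], w);
        memset(row + width, row[width - 1], w);
    }
}

/* Hypothetical dispatcher: one specialized copy per supported edge width. */
static void pad_edges_lr(uint8_t *buf, ptrdiff_t wrap,
                         int width, int height, int w)
{
    switch (w) {
    case  4: pad_edges_lr_inline(buf, wrap, width, height,  4); break;
    case  8: pad_edges_lr_inline(buf, wrap, width, height,  8); break;
    case 16: pad_edges_lr_inline(buf, wrap, width, height, 16); break;
    default: break; /* other widths are not handled in this sketch */
    }
}
```

The trade-off is code size: each supported width gets its own copy of the loop, which is presumably why the set of widths is restricted.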

[FFmpeg-devel] [PATCH 6/7] avcodec/x86/mpegvideoencdsp: speed up draw_edges_mmx by using memcpy()

2024-08-18 Thread Ramiro Polla
The mmx memory copy code is not nearly as efficient as memcpy(), which would make draw_edges_mmx much slower than draw_edges_8_c.

Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz:
                            before     after
draw_edges_8_1724_4_mmx:    8697.2    8739.6 ( 1.00x)
draw_edges_8_1724_8_mmx:
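
For readers unfamiliar with the operation, here is a minimal plain-C sketch of edge drawing in which the top and bottom borders are filled with memcpy(). It is an illustration under assumptions, not the code from the patch: `draw_edges_sketch` is a hypothetical name, and the caller is assumed to have allocated `w` extra pixels on every side of the image.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* buf points at the top-left pixel of a width x height image whose rows
 * are wrap bytes apart; w is the border size on each side. */
static void draw_edges_sketch(uint8_t *buf, ptrdiff_t wrap,
                              int width, int height, int w)
{
    /* Left and right borders: replicate the outermost pixel of each row. */
    uint8_t *ptr = buf;
    for (int y = 0; y < height; y++) {
        memset(ptr - w, ptr[0], w);
        memset(ptr + width, ptr[width - 1], w);
        ptr += wrap;
    }

    /* Top and bottom borders: copy the already padded first and last rows,
     * which also fills the four corners. */
    uint8_t *first = buf - w;
    uint8_t *last  = buf + (height - 1) * wrap - w;
    for (int i = 1; i <= w; i++) {
        memcpy(first - i * wrap, first, width + 2 * w);
        memcpy(last  + i * wrap, last,  width + 2 * w);
    }
}
```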

[FFmpeg-devel] [PATCH 5/7] avcodec/x86/mpegvideoencdsp: fix comment for draw_edges_mmx

2024-08-18 Thread Ramiro Polla
Not only w == 8 and w == 16 are supported, but also w == 4. --- libavcodec/x86/mpegvideoencdsp_init.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/libavcodec/x86/mpegvideoencdsp_init.c b/libavcodec/x86/mpegvideoencdsp_init.c index ec174b15aa..503548e668 100644 --- a/libav

[FFmpeg-devel] [PATCH 4/7] checkasm/mpegvideoencdsp: add draw_edges

2024-08-18 Thread Ramiro Polla
--- tests/checkasm/mpegvideoencdsp.c | 53 1 file changed, 53 insertions(+) diff --git a/tests/checkasm/mpegvideoencdsp.c b/tests/checkasm/mpegvideoencdsp.c index 00d21ba0ba..21954a4a65 100644 --- a/tests/checkasm/mpegvideoencdsp.c +++ b/tests/checkasm/mpegvideoen

[FFmpeg-devel] [PATCH 3/7] avcodec/aarch64/mpegvideoencdsp: add dotprod implementation for pix_norm1

2024-08-18 Thread Ramiro Polla
                    A76
pix_norm1_c:       231.5
pix_norm1_neon:     44.2 ( 5.24x)
pix_norm1_dotprod:  20.7 (11.18x)
---
 libavcodec/aarch64/mpegvideoencdsp_init.c | 10
 libavcodec/aarch64/mpegvideoencdsp_neon.S | 28 +++
 2 files changed, 38 insertions(+)

diff --g

[FFmpeg-devel] [PATCH 2/7] avcodec/aarch64/mpegvideoencdsp: add neon implementations for pix_sum and pix_norm1

2024-08-18 Thread Ramiro Polla
                  A53               A76
pix_norm1_c:     519.2             231.5
pix_norm1_neon:  195.0 ( 2.66x)     44.2 ( 5.24x)
pix_sum_c:       344.5             242.2
pix_sum_neon:    119.0 ( 2.89x)     41.7 ( 5.81x)
---
 libavcodec/aarch64/Makefile | 2 +
 libavcodec/aarch64/mpegvideoenc
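
As background for the numbers above, the two functions being optimized can be sketched in plain C. This is a reference-style illustration under the assumption that both operate on a 16x16 block of 8-bit pixels; the names `pix_sum_ref` and `pix_norm1_ref` are hypothetical and this is not the NEON code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all pixels in a 16x16 block; line_size is the row stride in bytes. */
static int pix_sum_ref(const uint8_t *pix, ptrdiff_t line_size)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += pix[x];
        pix += line_size;
    }
    return sum;
}

/* Sum of squared pixels in a 16x16 block (at most 256 * 255 * 255,
 * which fits comfortably in a 32-bit int). */
static int pix_norm1_ref(const uint8_t *pix, ptrdiff_t line_size)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += pix[x] * pix[x];
        pix += line_size;
    }
    return sum;
}
```

Both loops are pure reductions over independent bytes, which is the kind of work that maps well onto SIMD instructions.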

[FFmpeg-devel] [PATCH 1/7] checkasm/mpegvideoencdsp: add pix_sum and pix_norm1

2024-08-18 Thread Ramiro Polla
---
 tests/checkasm/Makefile          |  1 +
 tests/checkasm/checkasm.c        |  3 ++
 tests/checkasm/checkasm.h        |  1 +
 tests/checkasm/mpegvideoencdsp.c | 77
 4 files changed, 82 insertions(+)
 create mode 100644 tests/checkasm/mpegvideoencdsp.c

diff --git

[FFmpeg-devel] [PATCH] src/index/news: drop redundant sentence

2024-08-18 Thread Lynne via ffmpeg-devel
The sentence was added as advice, given the maturity of the xHE-AAC implementation, but it did not deserve its own paragraph, nor did its subject deserve to be left as a referential footnote in place of an article.
---
 src/index | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/src/index b/src/index

Re: [FFmpeg-devel] [PATCH 1/3] avformat/iamf_parse: clear padding

2024-08-18 Thread Michael Niedermayer
On Wed, Aug 14, 2024 at 12:07:04PM -0300, James Almer wrote: > On 8/14/2024 11:34 AM, Michael Niedermayer wrote: > > Fixes: use of uninitialized value > > Fixes: > > 70929/clusterfuzz-testcase-minimized-ffmpeg_dem_IAMF_fuzzer-5931276639469568 > > > > Found-by: continuous fuzzing process > > http

Re: [FFmpeg-devel] [PATCH 5/5] avcodec/hevc/ps: use unsigned shift

2024-08-18 Thread Michael Niedermayer
On Fri, Aug 16, 2024 at 08:27:07PM -0300, James Almer wrote: > On 8/16/2024 8:15 PM, Michael Niedermayer wrote: > > Fixes: left shift of 1 by 31 places cannot be represented in type 'int' > > Fixes: > > 70726/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-6149928703819776 > > > > F
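
The sanitizer message quoted above ("left shift of 1 by 31 places cannot be represented in type 'int'") describes a general C pitfall that a short sketch can make concrete. The helper `bit_u32` below is hypothetical and is not the change made in the patch, whose diff is not shown in this preview.

```c
#include <stdint.h>

/* Hypothetical helper: returns a 32-bit value with only bit n set,
 * for n in the range 0..31. */
static uint32_t bit_u32(unsigned n)
{
    /* With a 32-bit int, `1 << 31` overflows the signed type, which is
     * undefined behaviour. Shifting an unsigned 1 is well defined. */
    return UINT32_C(1) << n;
}
```

For example, `bit_u32(31)` yields `0x80000000` without tripping the undefined-behaviour sanitizer.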

Re: [FFmpeg-devel] [PATCH 3/5] avcodec/cbs_h265_syntax_template:

2024-08-18 Thread Michael Niedermayer
On Fri, Aug 16, 2024 at 08:38:41PM -0300, James Almer wrote: > On 8/16/2024 8:15 PM, Michael Niedermayer wrote: > > Fixes: Assertion width > 0 && width <= 32 failed > > Fixes: > > 71012/clusterfuzz-testcase-minimized-ffmpeg_BSF_HEVC_METADATA_fuzzer-6073354744823808 > > > > Found-by: continuous fu

Re: [FFmpeg-devel] [PATCH 1/7] checkasm: add csv/tsv bench output

2024-08-18 Thread Ramiro Polla
On Tue, Aug 13, 2024 at 4:03 PM J. Dekker wrote: > When collecting performance information from checkasm it is common > to parse the output for use in graphs to compare vs different > architectures. > > Signed-off-by: J. Dekker > --- When I redirect stdout to a csv file, the first two lines are:

[FFmpeg-devel] [PATCH] Skip parsing of hwaccel mjpeg after decoding

2024-08-18 Thread Lluís Batlle i Rossell
Attached. Together with the previous patch "Less CPU use in hwaccel MJPEG decoding", the hwaccel mjpeg decoding uses about 90% less CPU than before.

From 5960e16ae7561c6c6ad982c90f4e6ea1d30df91b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Llu=C3=ADs=20Batlle=20i=20Rossell?=
Date: Sun, 18 Aug 2024 14:14:

[FFmpeg-devel] [PATCH 3/3] lavu/opt: add API for retrieving array-type option values

2024-08-18 Thread Anton Khirnov
Previously one could only convert the entire array to a string, not access individual elements.
---
 doc/APIchanges        |   3 ++
 libavutil/opt.c       | 119 +++---
 libavutil/opt.h       |  40 ++
 libavutil/tests/opt.c |  63 ++

[FFmpeg-devel] [PATCH 1/3] lavu/opt: document underlying C types for enum AVOptionType

2024-08-18 Thread Anton Khirnov
--- libavutil/opt.h | 78 +++-- 1 file changed, 75 insertions(+), 3 deletions(-) diff --git a/libavutil/opt.h b/libavutil/opt.h index 07e27a9208..23bc495158 100644 --- a/libavutil/opt.h +++ b/libavutil/opt.h @@ -240,26 +240,98 @@ * before the file is

[FFmpeg-devel] [PATCH 2/3] lavu/opt: forward av_opt_get_video_rate() to av_opt_get_q()

2024-08-18 Thread Anton Khirnov
The two functions are exactly the same. --- libavutil/opt.c | 13 + 1 file changed, 1 insertion(+), 12 deletions(-) diff --git a/libavutil/opt.c b/libavutil/opt.c index 32a9e059e3..2cfc2d9c5a 100644 --- a/libavutil/opt.c +++ b/libavutil/opt.c @@ -1277,18 +1277,7 @@ int av_opt_get_imag

Re: [FFmpeg-devel] [PATCH] lavc/vvc_mc: R-V V avg w_avg

2024-08-18 Thread flow gg
I wrote `ff_vvc_w_avg_8_rvv` by mimicking the h264 weight function. Based on the test results for 49 different resolutions, most of them were significantly slower. Only 2x32 and 2x64 had similar performance, without noticeable speed improvement. I'm not sure about the reason. Some differences ar