On Saturday, 4 May 2024, 21:02:25 EEST, Rémi Denis-Courmont wrote:
> On Saturday, 4 May 2024, 17:48:31 EEST, [email protected] wrote:
> > From: sunyuechi <[email protected]>
> >
> > C908:
> > vp8_put_bilin4_h_c: 373.5
> > vp8_put_bilin4_h_rvv_i32: 158.7
> > vp8_put_bilin8_h_c: 1437.7
> > vp8_put_bilin8_h_rvv_i32: 318.7
> > vp8_put_bilin16_h_c: 2845.7
> > vp8_put_bilin16_h_rvv_i32: 374.7
> > ---
> >
> >  libavcodec/riscv/vp8dsp_init.c | 11 +++++++
> >  libavcodec/riscv/vp8dsp_rvv.S  | 54 ++++++++++++++++++++++++++++++++++
> >  2 files changed, 65 insertions(+)
> >
> > diff --git a/libavcodec/riscv/vp8dsp_init.c b/libavcodec/riscv/vp8dsp_init.c
> > index c364de3dc9..32cb4893a4 100644
> > --- a/libavcodec/riscv/vp8dsp_init.c
> > +++ b/libavcodec/riscv/vp8dsp_init.c
> > @@ -34,6 +34,10 @@ VP8_EPEL(16, rvv);
> >  VP8_EPEL(8, rvv);
> >  VP8_EPEL(4, rvv);
> >
> > +VP8_BILIN(16, rvv);
> > +VP8_BILIN(8, rvv);
> > +VP8_BILIN(4, rvv);
> > +
> >  av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> >  {
> >  #if HAVE_RVV
> > @@ -47,6 +51,13 @@ av_cold void ff_vp78dsp_init_riscv(VP8DSPContext *c)
> >          c->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_rvv;
> >          c->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_rvv;
> >          c->put_vp8_bilinear_pixels_tab[2][0][0] = ff_put_vp8_pixels4_rvv;
> > +
> > +        c->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_rvv;
> > +        c->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_rvv;
> >      }
> >  #endif
> >  }
> >
> > diff --git a/libavcodec/riscv/vp8dsp_rvv.S b/libavcodec/riscv/vp8dsp_rvv.S
> > index 063ab7110c..c8d265e516 100644
> > --- a/libavcodec/riscv/vp8dsp_rvv.S
> > +++ b/libavcodec/riscv/vp8dsp_rvv.S
> > @@ -98,3 +98,57 @@ func ff_put_vp8_pixels4_rvv, zve32x
> >          vsetivli        zero, 4, e8, mf4, ta, ma
> >          put_vp8_pixels
> >  endfunc
> > +
> > +.macro bilin_h_load dst len
> > +.ifc \len,4
> > +        vsetivli        zero, 5, e8, mf2, ta, ma
> > +.elseif \len == 8
> > +        vsetivli        zero, 9, e8, m1, ta, ma
> > +.else
> > +        vsetivli        zero, 17, e8, m2, ta, ma
> > +.endif
>
> It might be worth defining a pseudo-instruction macro in asm.S that would
> statically compute the minimal LMUL from just the AVL and SEW. Then we don't
> need to repeat these if blocks time and again; we can just do:
>
>     vsetvlstatic \len + 1, e8
>
> or something like that.

On second thought, concealing the LMUL from the programmer is perhaps not the
smartest idea, since it heavily constrains register allocation.

-- 
雷米‧德尼-库尔蒙
http://www.remlab.net/
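
For illustration, the static LMUL selection discussed above could be modelled
in C roughly as follows. This is only a sketch under stated assumptions, not
FFmpeg code: min_lmul_log2, vreg_is_aligned and LMUL_NONE are hypothetical
names, and VLEN and ELEN are passed in explicitly (the hard-coded choices in
the quoted patch are consistent with VLEN = 128 and, for zve32x, ELEN = 32).
The second helper shows the register allocation constraint: with LMUL > 1, a
vector register group must start at a register number that is a multiple of
LMUL.

```c
#include <assert.h>

/* Hypothetical sentinel: no LMUL in mf8..m8 satisfies the request. */
enum { LMUL_NONE = 99 };

/*
 * Smallest log2(LMUL), from -3 (mf8) to 3 (m8), such that one register group
 * holds `avl` elements of `sew` bits. `vlen` and `elen` are in bits and are
 * assumptions supplied by the caller, not values queried from hardware.
 */
static int min_lmul_log2(unsigned avl, unsigned sew, unsigned vlen, unsigned elen)
{
    assert(sew >= 8 && sew <= elen);

    for (int l = -3; l <= 3; l++) {
        unsigned long long lmul8 = 1ull << (l + 3); /* LMUL scaled by 8 */

        /* Fractional LMUL is only required to support SEW <= LMUL * ELEN. */
        if (8ull * sew > elen * lmul8)
            continue;
        /* Capacity check: VLEN * LMUL >= AVL * SEW (both sides scaled by 8). */
        if (vlen * lmul8 >= 8ull * avl * sew)
            return l;
    }
    return LMUL_NONE;
}

/*
 * Register groups with LMUL > 1 must start at a multiple of LMUL, so the
 * chosen LMUL dictates which register numbers the caller may use.
 */
static int vreg_is_aligned(unsigned vreg, int lmul_log2)
{
    if (lmul_log2 <= 0)
        return 1; /* fractional LMUL and m1: any register number is fine */
    return vreg % (1u << lmul_log2) == 0;
}
```

With VLEN = 128 and ELEN = 32, AVLs of 5, 9 and 17 bytes yield -1, 0 and 1
(mf2, m1 and m2), matching the three branches of bilin_h_load. In the m2 case
only even-numbered registers are usable, which a macro hiding the LMUL would
not make obvious at the call site.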
