On Saturday, 19 October 2024, 13:55:46 EEST, Rémi Denis-Courmont wrote:
> On Friday, 11 October 2024, 13:40:20 EEST, [email protected] wrote:
> > From: sunyuechi <[email protected]>
> >
> > +.macro put_uni_pixels w, vlen, id
> > +\id\w\vlen:
> > +.if \w == 128 && \vlen == 128
> > +        li              t0, \w
> > +        vsetvli         zero, t0, e8, m8, ta, ma
> > +.else
> > +        vsetvlstatic8   \w, \vlen
> > +.endif
> > +1:
> > +        vle8.v          v0, (a2)
> > +        addi            a4, a4, -1
> > +        vse8.v          v0, (a0)
> > +        add             a2, a2, a3
> > +        add             a0, a0, a1
> > +        bnez            a4, 1b
> > +        ret
>
> Up to 64-bit rows, you can use strided loads and stores here.

Or perhaps not, if the vectors are not aligned, but vectors should not be necessary here. This is especially true on the BPi, whose memory bus is rather slow, so even a scalar copy can saturate it.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
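
For illustration only, the kind of scalar copy referred to above could look roughly like this in C. This is a sketch, not code from the patch or from FFmpeg: the name `put_pixels_scalar` and its signature are hypothetical, and the argument order merely mirrors the registers used in the quoted macro (a0 = destination, a1 = destination stride, a2 = source, a3 = source stride, a4 = height).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): copy h rows of w bytes
 * each, advancing both pointers by their own stride after every row. */
static void put_pixels_scalar(uint8_t *dst, ptrdiff_t dst_stride,
                              const uint8_t *src, ptrdiff_t src_stride,
                              int h, int w)
{
    for (int y = 0; y < h; y++) {
        /* memcpy has no alignment requirement on either pointer. */
        memcpy(dst, src, (size_t)w);
        dst += dst_stride;
        src += src_stride;
    }
}
```

Whether something like this actually keeps up with the vector version on a given board would need to be benchmarked.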
