On Friday, 11 October 2024, 13:40:20 EEST, [email protected] wrote:
> From: sunyuechi <[email protected]>
> +.macro put_uni_pixels w, vlen, id
> +\id\w\vlen:
> +.if \w == 128 && \vlen == 128
> +        li          t0, \w
> +        vsetvli     zero, t0, e8, m8, ta, ma
> +.else
> +        vsetvlstatic8 \w, \vlen
> +.endif
> +1:
> +        vle8.v      v0, (a2)
> +        addi        a4, a4, -1
> +        vse8.v      v0, (a0)
> +        add         a2, a2, a3
> +        add         a0, a0, a1
> +        bnez        a4, 1b
> +        ret

For rows up to 64 bits wide, you can use strided loads and stores here.
Though for memory copying, unaligned scalar accesses might be just as fast.

-- 
レミ・デニ-クールモン
http://www.remlab.net/

_______________________________________________
ffmpeg-devel mailing list
[email protected]
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
[email protected] with subject "unsubscribe".
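
To illustrate the scalar alternative mentioned in the reply, here is a
minimal C sketch for the 8-byte-wide (64-bit row) case. It is not taken
from the patch: the function name put_pixels8_scalar and its parameter
names are hypothetical, and the code assumes that the compiler lowers each
fixed-size memcpy to a single unaligned 64-bit load or store on the target.
A strided vector version would instead treat each row as one 64-bit
element (vlse64.v / vsse64.v), which depends on the hardware tolerating
misaligned element accesses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch under review.
 * Copies `height` rows of exactly 8 bytes from src to dst.
 * Strides are in bytes; neither pointer needs to be aligned.
 */
static void put_pixels8_scalar(uint8_t *dst, ptrdiff_t dst_stride,
                               const uint8_t *src, ptrdiff_t src_stride,
                               int height)
{
    for (int y = 0; y < height; y++) {
        uint64_t row;

        /* Fixed-size memcpy is the portable way to express an
         * unaligned 64-bit access in C. */
        memcpy(&row, src, sizeof(row));
        memcpy(dst, &row, sizeof(row));

        src += src_stride;
        dst += dst_stride;
    }
}
```

Whether this beats the vector loop would have to be measured on real
hardware; the sketch only shows the shape of the scalar approach.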
