Hi,
On Thu, Jul 4, 2024, 13:54 Rémi Denis-Courmont <[email protected]> wrote:

> On Thursday, 4 July 2024, 19:26:19 EEST, Sean McGovern wrote:
> > Is that correlated with the comment above re: len? Or is it more general
> > that I should unroll until I've exhausted the available vector registers?
>
> You should unroll if it improves bandwidth.
>
> --
> レミ・デニ-クールモン
> http://www.remlab.net/
>
> _______________________________________________
> ffmpeg-devel mailing list
> [email protected]
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> [email protected] with subject "unsubscribe".

After adding a second set of load/left shift/store, further unrolling gave
diminishing or no returns. I'll send the updated version later.

Does wasted32 (and, I guess, wasted33 by proxy) not have to worry about loop
tails? I noticed the other vectorized versions don't do anything special in
that regard.

-- Sean McGovern
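To make the unrolling and loop-tail question concrete, here is a rough scalar C sketch. It is not the FFmpeg code or the patch under discussion: the name `shift_left_unrolled` is hypothetical, and treating the operation as an in-place left shift over `len` 32-bit samples is an assumption based only on the "load/left shift/store" description above.

```c
#include <stdint.h>

/* Hypothetical helper for illustration only (not FFmpeg's implementation):
 * shift every sample left by `shift` bits, two elements per iteration. */
static void shift_left_unrolled(int32_t *buf, int shift, int len)
{
    int i = 0;

    /* Main loop, unrolled by two: two independent load/shift/store sets.
     * Shifting as unsigned avoids undefined behaviour on negative values. */
    for (; i + 1 < len; i += 2) {
        uint32_t a = (uint32_t)buf[i];
        uint32_t b = (uint32_t)buf[i + 1];
        buf[i]     = (int32_t)(a << shift);
        buf[i + 1] = (int32_t)(b << shift);
    }

    /* Loop tail: at most one element remains when len is odd. */
    if (i < len)
        buf[i] = (int32_t)((uint32_t)buf[i] << shift);
}
```

The tail branch is what the question above is about: if callers always pass a `len` that is a multiple of the unroll factor, or the buffer is padded so that processing a few extra elements is harmless, it could be dropped. Whether either holds for wasted32 is not something this sketch establishes.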
