https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77610
--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Rich Felker from comment #5)
> Of course, fancy memcpy in general is only a win beyond a certain size. For
> DMA I did not mean I want to use DMA for any size beyond gcc's proposed
> function-call threshold. Rather, the vdso-provided function would choose
> what to do appropriately for the hardware. But on J2 (nommu, no special
> kernel mode) I suspect DMA could be a win at sizes as low as 256 bytes, with
> spin-to-completion and a lock shared between user (vdso) and kernel rather
> than using a syscall (not sure this is justified, though). Using a syscall
> with sleep-during-dma would have a significantly larger threshold before
> it's worthwhile.

I see. Anyway, I agree that something like attachment 39642 is useful.

> Regarding how I measured kernel performance increase, I was just looking at
> boot timing with printk timestamps enabled. The main time consumer is
> unpacking initramfs.

Ah right, that sounds like copying memory around. Do you happen to have any
other runtime measurements?
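
For illustration only, the size-based choice described in the quoted text
could be sketched roughly as below. This is not the code from the attachment
and not a real vDSO interface: the names choose_copy_path, sized_copy and
dma_copy_stub are hypothetical, the 256-byte cutoff is just the guess quoted
above for J2 rather than a measured value, and the DMA path is stubbed with
memcpy so the sketch runs anywhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the two copy strategies discussed above. */
enum copy_path { COPY_CPU, COPY_DMA_SPIN };

/* Assumption: 256 bytes is the J2 guess from the quoted comment,
   not a measured threshold. */
#define DMA_SPIN_MIN_BYTES 256

/* Choose a strategy purely from the copy size. */
static enum copy_path choose_copy_path(size_t n)
{
    return n >= DMA_SPIN_MIN_BYTES ? COPY_DMA_SPIN : COPY_CPU;
}

/* Stand-in for a DMA transfer. A real version would take the lock shared
   with the kernel, program the DMA controller and spin until completion.
   Here it is a plain memcpy so the sketch stays runnable. */
static void dma_copy_stub(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Copy n bytes and report which path was taken. */
static enum copy_path sized_copy(void *dst, const void *src, size_t n)
{
    enum copy_path path = choose_copy_path(n);

    if (path == COPY_DMA_SPIN)
        dma_copy_stub(dst, src, n);
    else
        memcpy(dst, src, n);

    return path;
}
```

A syscall-based path with sleep-during-DMA would add a third tier with a
larger cutoff. No value for that cutoff is given in this thread, so it is
left out of the sketch.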