On Tue, Dec 8, 2020 at 6:36 PM Florian Fainelli <f.faine...@gmail.com> wrote:
>
> dma_sync_single_for_{cpu,device} is what you would need in order to make
> a partial cache line invalidation. You would still need to unmap the
> same address+length pair that was used for the initial mapping otherwise
> the DMA-API debugging will rightfully complain.

I tried replacing

  dma_unmap_single(9K, DMA_FROM_DEVICE);

with

  dma_sync_single_for_cpu(received_size=1500 bytes, DMA_FROM_DEVICE);
  dma_unmap_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);

and that works! But the bandwidth is still pretty bad, because the CPU
now spends most of its time doing

  dma_map_single(9K, DMA_FROM_DEVICE);

which spends a lot of time in __dma_page_cpu_to_dev.

When I try to replace that with

  dma_map_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);

I get lots of dropped packets, which seems to indicate data corruption.

Interestingly, when I do

  dma_map_single_attrs(9K, DMA_FROM_DEVICE, DMA_ATTR_SKIP_CPU_SYNC);
  dma_sync_single_for_{cpu|device}(9K, DMA_FROM_DEVICE);

the dropped packets disappear, but things are still very slow.

What am I missing?
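
For a rough sense of scale, here is a small userspace sketch (not kernel code) that counts how many cache lines a buffer range covers. It assumes that range-based cache maintenance costs roughly in proportion to the number of lines touched, which is only an approximation. The helper name cache_lines_touched, the 64-byte line size, the buffer address, and reading "9K" as 9216 bytes are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of cache lines covered by the byte range
 * [addr, addr + len). line_size is an assumption supplied by the caller;
 * the real value depends on the CPU.
 */
static size_t cache_lines_touched(uintptr_t addr, size_t len, size_t line_size)
{
    uintptr_t first_line;
    uintptr_t last_line;

    assert(line_size != 0);

    if (len == 0)
        return 0;

    first_line = addr / line_size;             /* line holding the first byte */
    last_line = (addr + len - 1) / line_size;  /* line holding the last byte */

    return (size_t)(last_line - first_line + 1);
}
```

With a line-aligned buffer at the made-up address 0x10000 and 64-byte lines, cache_lines_touched(0x10000, 9216, 64) gives 144 lines for the full 9K buffer, while cache_lines_touched(0x10000, 1500, 64) gives 24 lines for a 1500-byte frame. Under these assumptions, syncing the whole 9K buffer walks about six times as many lines as syncing only the received bytes.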