> What are the addresses of the ring entries?
> I bet there is something wrong with the cache coherency and/or
> flushing.
>
> So the MAC hardware has done the write but (somewhere) it
> isn't visible to the cpu for ages.

CMA memory is disabled in our kernel config, so the descriptors
allocated with dma_alloc_coherent() won't be CMA memory.
Could this cause different caching/flushing behaviour?

> I've seen a 'fec' ethernet block in a freescale DSP.
> IIRC it is a fairly simple block - won't be doing out-of-order writes.
>
> The imx6q seems to be arm based.
> I'm guessing that means it doesn't do cache coherency for ethernet dma
> accesses.
> That (more or less) means the rings need to be mapped uncached.
> Any attempt to just flush/invalidate the cache lines is doomed.
>
> ...
> > > I could only think of skipping/dropping the descriptor when the
> > > current is still busy but the next one is ready.
> > > But it is not easily possible because the "stuck" descriptor gets
> > > ready after a huge delay.
>
> I bet the descriptor is at the end of a cache line which finally
> gets re-read.

I stumbled across the thread
"FEC ethernet issues [Was: PL310 errata workarounds]":
https://www.spinics.net/lists/arm-kernel/thrd312.html#315574

Changes were made to the PL310 cache driver (used in imx6q) that also
fix FEC issues. These PL310 cleanups/fixes are not contained in the
3.10.108 kernel, so maybe I have to look there as well.
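
To check the "descriptor at the end of a cache line" theory against the
actual ring addresses, something like the following helper could be used.
It is only a sketch: the 32-byte cache line size (Cortex-A9 L1 / PL310) and
the 8-byte legacy FEC buffer descriptor size are assumptions on my side, and
the function names are made up for illustration, they are not from the
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, not taken from this thread: 32-byte cache lines
 * (Cortex-A9 L1 / PL310) and an 8-byte legacy FEC buffer descriptor.
 * Adjust both if the enhanced (larger) descriptor format is in use. */
#define CACHE_LINE_SIZE 32u
#define BD_SIZE          8u

/* Bus address of descriptor idx in a ring that starts at ring_base. */
static inline uint32_t bd_addr(uint32_t ring_base, unsigned int idx,
                               size_t bd_size)
{
    return ring_base + (uint32_t)(idx * bd_size);
}

/* Number of the cache line that contains addr. */
static inline uint32_t cache_line_of(uint32_t addr, uint32_t line_size)
{
    /* line_size must be a non-zero power of two */
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    return addr / line_size;
}

/* Non-zero if descriptor idx is the last descriptor in its cache line,
 * i.e. the following descriptor starts in a different line. */
static inline int bd_is_last_in_line(uint32_t ring_base, unsigned int idx,
                                     size_t bd_size, uint32_t line_size)
{
    uint32_t addr = bd_addr(ring_base, idx, bd_size);
    uint32_t next = bd_addr(ring_base, idx + 1, bd_size);

    return cache_line_of(addr, line_size) != cache_line_of(next, line_size);
}
```

With these assumed sizes, four descriptors share one cache line, so
bd_is_last_in_line(ring_base, idx, BD_SIZE, CACHE_LINE_SIZE) is non-zero for
idx 3, 7, 11, ... when the ring base is line-aligned. If the "stuck"
descriptor index always matches that pattern, it would support the stale
cache line explanation.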