Prashant writes ("Re: tg3 NIC driver bug in 3.14.x under Xen [and 3 more
messages]"):
> Ian, using your config we are able to recreate the problem that you are
> seeing. The driver finds the RX data buffer to be all zero, with a
> analyzer trace we are seeing the chip is DMA'ing valid RX data buffer
> contents to the host but once the driver tries to read this DMA area, it
> is seeing all zero's which is the reason of the corruption. This is only
> for the RX data buffer, the RX descriptor and status block update DMA
> regions are having valid contents.
I am no expert in this area, but this suggests that the driver is
misusing the Linux DMA management API. This is what I think
Konrad suspected when he suggested the `iommu=soft swiotlb=force'
command line option.

Note in kernel-parameters.txt:

        swiotlb=        [ARM,IA-64,PPC,MIPS,X86]
                        Format: { <int> | force }
                        <int> -- Number of I/O TLB slabs
                        force -- force using of bounce buffers even if they
                                 wouldn't be automatically used by the kernel

So with `swiotlb=force' the DMA is _expected_ to go to a bounce buffer
managed by the kernel DMA API.
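
To illustrate the kind of misuse I have in mind, here is a toy userspace
model. It is not tg3 code and not the kernel's swiotlb implementation; every
name in it (toy_mapping, toy_map, toy_device_dma_write, toy_sync_for_cpu) is
made up for the sketch. It assumes the device writes only into the bounce
buffer, and that the data reaches the driver's own buffer only when the
driver does the equivalent of dma_sync_single_for_cpu() or
dma_unmap_single().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_LEN 16

/* Hypothetical model of one streaming RX mapping that uses a bounce buffer. */
struct toy_mapping {
    unsigned char *driver_buf;      /* buffer the driver will read from */
    unsigned char bounce[BUF_LEN];  /* buffer the device actually DMAs into */
    size_t len;
};

/* Stands in for dma_map_single(..., DMA_FROM_DEVICE) with bouncing forced. */
static void toy_map(struct toy_mapping *m, unsigned char *buf, size_t len)
{
    assert(len <= BUF_LEN);
    m->driver_buf = buf;
    m->len = len;
    memset(m->bounce, 0, sizeof(m->bounce));
}

/* Stands in for the NIC writing a frame: only the bounce buffer changes. */
static void toy_device_dma_write(struct toy_mapping *m,
                                 const unsigned char *data, size_t len)
{
    assert(len <= m->len);
    memcpy(m->bounce, data, len);
}

/*
 * Stands in for dma_sync_single_for_cpu() or dma_unmap_single(): this is
 * the step that copies the bounce buffer back into the driver's buffer.
 */
static void toy_sync_for_cpu(struct toy_mapping *m)
{
    memcpy(m->driver_buf, m->bounce, m->len);
}
```

In this model, a driver that reads its buffer after toy_device_dma_write()
but before toy_sync_for_cpu() sees all zeroes even though the "device" wrote
valid data, which matches the symptom described above. Whether tg3 actually
skips or misorders such a step is exactly what needs checking.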
> This is unlikely to be a chip or driver issue, as the chip is doing the
> correct DMA but the corruption occurs before driver reads it. Would
> request iommu experts to take a look and suggest what can be done next.
As I say above, I think this is probably a driver bug.
I have seen identical symptoms on a >5yo desktop box under my desk and
on two brand new rackmount servers; I therefore doubt that it's a
hardware problem.
Ian.