On Mon, Dec 12, 2016 at 10:12:43PM -0500, Theodore Ts'o wrote:
> On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
> >
> > That's unfortunate, of course. It could be a hypervisor or
> > a guest kernel bug. ideas:
> > - does host have mq capability? how many queues?
> > - how about # of msix vectors?
> > - after you send something on tx queues,
> >   are interrupts arriving on rx queues?
> > - is problem rx or tx?
> >   set ip and arp manually and send a packet to known MAC,
> >   does it get there?
>
> Sorry, I don't know how to debug virtio-net.  Given that it's in a
> cloud environment, I also can't set ip addresses manually, since ip
> addresses are set manually.

OK, but you can send raw ethernet frames presumably?

> If you can send me a patch, I'm happy to apply it and send you back
> results.

Let's start with collecting stats from sysfs for this device.
pls get features bitmap from there,
pls get /proc/interrupts mappings, and pls use lspci to dump pci config.

> I can say that I've had _zero_ problems using pretty much any kernel
> from 3.10 to 4.9 using Google Compute Engine.  The commit I referenced
> caused things to stop working.  So in terms of regression, this is
> definitely a regression, and it's definitely caused by commit
> 449000102901.  Even if it is a hypervisor "bug", I'm pretty sure I
> know what Linus will say if I ask him to revert it.  Linux kernels are
> expected to work around hardware bugs, and breaking users just because
> hardware is "broken" by some definition is generally not considered
> friendly, especially when has been working for years and years before
> some commit "fixed" things.

I'm open to limiting new features to virtio 1 mode just to
avoid the hassle of dealing with legacy hypervisors.
But let's not argue about it until we know the root cause.

>
> I would very much like to work with you to fix it, but I will need
> your help, since virtio-net doesn't seem to print any informational
> during the boot sequence, and I don't know how the best way to debug
> it.
>
> Cheers,
>
> 						- Ted

Let's start with debugging it like any PCI NIC.

-- 
MST
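
In case it helps with collecting the sysfs and /proc/interrupts data asked for above, here is a rough sketch of a script that could gather both in one go. It is only an illustration, not a tested tool: it assumes the guest exposes /sys/bus/virtio/devices/virtioN/features as a string of '0'/'1' characters with bit 0 first, and the names in FEATURE_NAMES for bits 0 to 23 only make sense for virtio-net devices. The helper names (decode_features, virtio_irq_lines) are made up for this example. The PCI config dump still has to come from lspci separately, e.g. something like `lspci -vvv -s <slot>`.

```python
#!/usr/bin/env python3
"""Dump virtio feature bits and virtio interrupt lines (illustrative sketch)."""
import glob
import os

# Feature bit numbers as defined in the virtio specification.
# Assumption: bits 0-23 are interpreted as virtio-net bits; for other
# device types those names would be wrong.
FEATURE_NAMES = {
    0: "CSUM", 1: "GUEST_CSUM", 2: "CTRL_GUEST_OFFLOADS", 3: "MTU",
    5: "MAC", 6: "GSO", 7: "GUEST_TSO4", 8: "GUEST_TSO6", 9: "GUEST_ECN",
    10: "GUEST_UFO", 11: "HOST_TSO4", 12: "HOST_TSO6", 13: "HOST_ECN",
    14: "HOST_UFO", 15: "MRG_RXBUF", 16: "STATUS", 17: "CTRL_VQ",
    18: "CTRL_RX", 19: "CTRL_VLAN", 21: "GUEST_ANNOUNCE", 22: "MQ",
    23: "CTRL_MAC_ADDR", 24: "NOTIFY_ON_EMPTY", 27: "ANY_LAYOUT",
    28: "RING_INDIRECT_DESC", 29: "RING_EVENT_IDX", 32: "VERSION_1",
}


def decode_features(bits):
    """Return the names of the set bits in a sysfs 'features' string.

    The string holds one '0'/'1' character per feature bit, bit 0 first.
    Bits without a known name are reported as 'bitN'.
    """
    return [FEATURE_NAMES.get(i, "bit%d" % i)
            for i, c in enumerate(bits.strip()) if c == "1"]


def virtio_irq_lines(text):
    """Keep the CPU header row plus every /proc/interrupts row naming virtio."""
    lines = text.splitlines()
    return lines[:1] + [line for line in lines[1:] if "virtio" in line]


def main():
    # Feature bitmap of every virtio device the guest kernel has probed.
    for dev in sorted(glob.glob("/sys/bus/virtio/devices/virtio*")):
        try:
            with open(os.path.join(dev, "features")) as f:
                bits = f.read().strip()
        except OSError as err:
            print("%s: cannot read features: %s" % (dev, err))
            continue
        print("%s: %s" % (os.path.basename(dev), bits))
        print("  set: " + ", ".join(decode_features(bits)))

    # Interrupt mappings: shows how many vectors were allocated and
    # whether the per-queue counters increase.
    try:
        with open("/proc/interrupts") as f:
            print("\n".join(virtio_irq_lines(f.read())))
    except OSError as err:
        print("cannot read /proc/interrupts: %s" % err)


if __name__ == "__main__":
    main()
```

Running it twice, before and after sending some traffic, would show whether the rx/tx interrupt counters move at all, and whether MQ shows up among the negotiated features.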