As of now, we've been running HWE 4.10 for a little more than 16 hours with no problems so far. Previously we'd hit the problem within the hour.
There is, however, one new log message that we haven't seen before, with either the 1.4.x or the 2.0.x driver. It might be unrelated, though: we can't see any particular performance issues in any of our monitoring/graphs. The message is:

  TCP: bond0.5: Driver has suspect GRO implementation, TCP performance may be compromised.

How do we proceed? :-)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1723127

Title:
  Intel i40e PF reset due to incorrect MDD detection (continues...)

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  This is a continuation from bug 1713553; a patch was added in that bug to attempt to fix this, and it may have helped reduce the issue, but based on more reports it appears not to have fixed it.

  The issue is that the i40e driver, when TSO is enabled, sometimes sees the NIC firmware issue an "MDD event", where MDD is "Malicious Driver Detection". MDD is only vaguely defined in the i40e spec, and there is no way to tell what the NIC actually saw that it didn't like. So the driver can do nothing but print an error message and reset the PF (or VF). Unfortunately, this resets the interface, which interrupts network traffic flow while the PF is resetting.

  See bug 1713553 for more details.