> How do we proceed? :-)

One bug at a time, please.  Since this NIC's "MDD" event doesn't
indicate what the NIC disliked, I can't tell whether that is related
to the MDD events, but I suspect not, especially if you have not
seen it happen on kernels where you did get MDD events.

Since the Ubuntu 4.4.0 kernel isn't an ancestor of the Ubuntu 4.10.0
kernel, to bisect we would need to start at the merge base anyway (the
mainline 4.4 kernel); and since there are no changes to the i40e driver
between mainline 4.10 and Ubuntu 4.10.0, a bisect will be a lot easier
if we shift over to the mainline kernel series.

Are you able to test various kernel versions during the bisect process?
It may take a while, and at each step it's important to determine for
certain whether the kernel is 'good' or 'bad'; an incorrect evaluation
at any step leads to an incorrect endpoint.
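
To illustrate why every verdict matters, here is a minimal sketch of the
bisection logic in Python.  It is not what git itself runs; the function
name, the integer "commits", and the position of the fix are all
hypothetical, and it assumes the history flips from 'bad' to 'good'
exactly once between the two endpoints.

```python
def bisect_first_good(commits, is_good):
    """Find the first 'good' commit by repeated halving.

    Assumes commits[0] is bad, commits[-1] is good, and the verdict
    flips from bad to good exactly once in between.
    Returns (first_good_commit, number_of_kernels_tested).
    """
    lo, hi = 0, len(commits) - 1  # lo is known bad, hi is known good
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_good(commits[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return commits[hi], steps


# Hypothetical history of 16 commits where commit 11 carries the fix.
commits = list(range(16))

# Every test step evaluated correctly.
correct = bisect_first_good(commits, lambda c: c >= 11)

# Same history, but commit 7 is wrongly reported as 'good'
# (for example, the test did not run long enough to trigger the event).
misjudged = bisect_first_good(commits, lambda c: c >= 11 or c == 7)

print(correct)    # (11, 4)
print(misjudged)  # (7, 4)
```

A single wrong verdict sends the search into the wrong half of the
history, and nothing later in the process can recover from it.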

If you are able to help with a kernel bisect by testing, can you test
each of these kernels:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.4-wily/

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10/

I expect v4.4 to be 'bad' (it encounters the MDD event) and v4.10 to be
'good' (no MDD event), based on your evaluation of the Ubuntu kernels
derived from those versions.  If they are bad/good as expected, we can
start the bisection between them.
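
For consistent good/bad calls, a small helper along these lines could
scan a saved kernel log for MDD messages.  This is only a sketch: the
function name and the sample log lines are hypothetical, and the match
pattern is an assumption, since the exact wording the driver prints may
differ between kernel versions.

```python
import re

# Assumed pattern: the spelled-out phrase or the "MDD" abbreviation.
# The driver's exact message text is not taken from the source.
MDD_PATTERN = re.compile(r"malicious driver detection|\bMDD\b", re.IGNORECASE)


def classify_kernel_log(log_text):
    """Return 'bad' if any i40e log line mentions an MDD event, else 'good'."""
    for line in log_text.splitlines():
        if "i40e" in line and MDD_PATTERN.search(line):
            return "bad"
    return "good"


# Hypothetical log excerpts, for illustration only.
log_with_event = (
    "[  100.1] i40e 0000:03:00.0: Malicious Driver Detection event on TX queue\n"
    "[  100.2] i40e 0000:03:00.0: PF reset issued\n"
)
log_without_event = "[   12.3] i40e 0000:03:00.0: NIC Link is Up\n"

print(classify_kernel_log(log_with_event))     # bad
print(classify_kernel_log(log_without_event))  # good
```

A 'good' result only means something if the kernel ran with TSO enabled
under the usual traffic for long enough that the event would normally
have shown up.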

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1723127

Title:
  Intel i40e PF reset due to incorrect MDD detection (continues...)

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress

Bug description:
  This is a continuation from bug 1713553; a patch was added in that bug
  to attempt to fix this, and it may have helped reduce the issue but
  appears not to have fixed it, based on more reports.

  The issue is that the i40e driver, when TSO is enabled, sometimes sees
  the NIC firmware issue an "MDD event", where MDD is "Malicious Driver
  Detection".  This is only vaguely defined in the i40e spec, and there
  is no way to tell what the NIC actually saw that it didn't like.  So
  the driver can do nothing but print an error message and reset the PF
  (or VF).  Unfortunately, this resets the interface, which interrupts
  network traffic flow while the PF is resetting.

  See bug 1713553 for more details.
