Sorry for the delay.

We have two options for how to continue debugging here:

1. We can try a traditional git bisect.  This would involve testing
various kernel builds to narrow the issue down to the specific commit
that fixes it.  It's a long-ish process, depending on how long testing
each build takes, and it's critical that the 'good' or 'bad' verdict at
each step is correct - otherwise the bisect ends at the wrong commit.
At each step I would build a new kernel, and you would test with it
until it fails or until you've tested long enough to be sure that build
is 'good'.  With hard-to-reproduce problems like this, bisecting can be
tough: a build that runs for a long time without failing isn't
necessarily 'good', it may just not have failed yet.  A wrong verdict
like that makes the bisect end at the wrong commit, which doesn't help
with figuring out how to fix anything.

2. Intel has provided me with some undocumented commands that control
which MDD events the NIC triggers on.  I can provide those
instructions, and you can test with each MDD event bit set individually
until the problem reproduces - then we know exactly which MDD source
triggered the event, which should help identify what the driver did to
cause it.  This approach has a much better chance of finding the
specific problem, but the downside is that you'll need to run
undocumented commands on your hardware.  I believe there should not be
any risk in doing that since the info came from Intel, but I can't
personally verify it, as I don't currently have access to this specific
NIC.
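
To illustrate why a single wrong verdict matters for option 1, here is a
rough sketch of the bisect logic in Python.  It is only a model, not the
real git tooling: the commit count (100), the position of the fixing
commit (70), and the one misjudged build (49) are all made-up numbers,
and bisect_for_fix/looks_good are hypothetical names.

```python
def bisect_for_fix(n_commits, looks_good):
    """Binary-search for the first commit judged 'good'.

    Assumes commit 0 is known bad and commit n_commits - 1 is known good.
    looks_good(commit) is the tester's verdict for that build.
    """
    lo, hi = 0, n_commits - 1  # lo: newest known-bad, hi: oldest known-good
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if looks_good(mid):
            hi = mid
        else:
            lo = mid
    return hi, steps


FIX = 70  # hypothetical index of the commit that really fixes the problem


def accurate(commit):
    # Every verdict is correct: builds at or after the fix are good.
    return commit >= FIX


def one_false_good(commit):
    # Build 49 is really bad, but it did not fail during the test window,
    # so it was reported as good.
    return commit == 49 or commit >= FIX


correct = bisect_for_fix(100, accurate)
wrong = bisect_for_fix(100, one_false_good)
print("all verdicts correct:", correct)
print("one false 'good':    ", wrong)
```

In this model both runs take the same number of steps, but the second
one lands on the misjudged build instead of the real fix, and nothing
in the result shows that it is wrong.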

If you're willing to try #2, I'll add the specific commands/instructions
and you can get started testing.  Otherwise, if you would prefer not to
run the undocumented commands, I can start a kernel bisect.
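
For option 2, "each MDD event bit set individually" means one test run
per bit, with only that bit enabled.  A small Python sketch of the test
plan follows; it does not touch the hardware and does not contain the
Intel commands.  The number of event bits (N_BITS = 8) is a placeholder
I picked for illustration, not the real register width.

```python
def single_bit_masks(n_bits):
    """Return one mask per bit position, each with exactly one bit set."""
    return [1 << bit for bit in range(n_bits)]


N_BITS = 8  # placeholder; the real number of MDD event bits is not given here

for bit, mask in enumerate(single_bit_masks(N_BITS)):
    # One test run per line: enable only this bit, then run the workload
    # and note whether the PF reset reproduces.
    print(f"run {bit}: enable only bit {bit} (mask 0x{mask:02x})")
```

The run in which the problem reproduces identifies the MDD source.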

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1723127

Title:
  Intel i40e PF reset due to incorrect MDD detection (continues...)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+subscriptions
