I am seeing the same problem on Debian that three other reporters have described.

As a former Ethernet driver developer, I noticed that the queue was empty when the interrupt fired, and that it appeared to hang in the Linux qdisc code in interrupt context, to the point that the watchdog timer expired.

My relevant details are:
    Dell OptiPlex 980
    3.16.0-4-amd64
    linux/3.16.7-ckt25-2 (2016-04-08) x86_64
    Intel Gigabit Ethernet 82578DM Gigabit Network Connection (rev 05)


From what I've gathered from the following potentially duplicate bug #798512 and Intel Community Forums:

1. It isn't CPU-related.
2. This error happened in the following Linux kernel versions:
    a. 3.16.0-4-amd64
    b. 3.19.5 (source: Intel communities)
    c. 4.3+70~bpo8+1
    d. 3.16.7-ckt11-1
3. This error does NOT happen in the following Linux kernel versions (take this with a grain of salt, for we don't yet have a reliable, repeatable way to induce the bug):
    a. 3.16.7-ckt20-1+deb8u4
4. The error still occurs when Intel's own driver is used:
    a. 3.3.3-NAPI
5. Intel hardware having this problem:
    a. Intel I217-V (rev 04) (onboard) (has lspci SERR-)
    b. Intel 82578DM (rev 05) (onboard) (has lspci SERR+)
    c. Intel Corporation 82579V Gigabit Network Connection (rev 05) (onboard)
6. Linux network interface state:
    a. eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    b. eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP mode DEFAULT group default qlen 1000
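
For anyone who wants to report the same interface detail as item 6, here is a rough sketch that pulls the interface name and qdisc out of "ip link show" style lines. The function name link_qdiscs is my own invention, not part of iproute2, and it assumes the flags field starts with "<" and that the qdisc name follows the word "qdisc", as in the lines above.

```shell
#!/bin/sh
# Hypothetical helper (not part of iproute2): read "ip link show" style
# lines on stdin and print "<interface> <qdisc>" for each line that
# carries a qdisc field.
link_qdiscs() {
    awk '{
        iface = ""; qdisc = ""
        for (i = 1; i <= NF; i++) {
            # The interface name is the field just before the <FLAGS> field.
            if (substr($i, 1, 1) == "<" && i > 1) iface = $(i - 1)
            # The qdisc name is the field just after the word "qdisc".
            if ($i == "qdisc" && i < NF) qdisc = $(i + 1)
        }
        if (qdisc != "") {
            sub(/:$/, "", iface)   # drop the trailing colon from "eth0:"
            print iface, qdisc
        }
    }'
}
```

Typical use would be something like: ip link show | link_qdiscs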


So far, the common threads among these similar reports are the following (more reports will likely eliminate a few):
1. e1000e driver
2. ip link shows 'qdisc' with the 'pfifo_fast' option
3. onboard Ethernet (PCI-related?)
4. starting at Linux 3.16.0
5. the count of dropped outgoing IP packets was non-zero (mostly 32 packets)
6. a similar call stack backtrace:

Bug #777683 call stack backtrace

[ 295.041406] <IRQ> [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[ 295.041417] [<ffffffff81067797>] ? warn_slowpath_common+0x77/0x90
[ 295.041420] [<ffffffff810677fc>] ? warn_slowpath_fmt+0x4c/0x50
[ 295.041425] [<ffffffff81074777>] ? mod_timer+0x127/0x1e0
[ 295.041430] [<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240
[ 295.041433] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041436] [<ffffffff81072ae1>] ? call_timer_fn+0x31/0x100
[ 295.041439] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041442] [<ffffffff81074119>] ? run_timer_softirq+0x209/0x2f0
[ 295.041445] [<ffffffff8106c641>] ? __do_softirq+0xf1/0x290
[ 295.041448] [<ffffffff8106ca15>] ? irq_exit+0x95/0xa0
[ 295.041451] [<ffffffff81514455>] ? smp_apic_timer_interrupt+0x45/0x60
[ 295.041455] [<ffffffff8151253d>] ? apic_timer_interrupt+0x6d/0x80
[ 295.041456] <EOI> [<ffffffff81074a26>] ? get_next_timer_interrupt+0x1d6/0x250
[ 295.041465] [<ffffffff813ddf9f>] ? cpuidle_enter_state+0x4f/0xc0
[ 295.041468] [<ffffffff813ddf98>] ? cpuidle_enter_state+0x48/0xc0
[ 295.041472] [<ffffffff810a7fa8>] ? cpu_startup_entry+0x2f8/0x400
[ 295.041475] [<ffffffff81903071>] ? start_kernel+0x492/0x49d
[ 295.041478] [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[ 295.041480] [<ffffffff81902120>] ? early_idt_handlers+0x120/0x120
[ 295.041483] [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[ 295.041485] ---[ end trace aaf46f7eeccba58f ]---
[ 295.041502] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly

Intel Community Forums (Intel 3.3.3-NAPI driver):
(source: https://communities.intel.com/message/305442#305442)
<IRQ>
[<ffffffff812e1ac9>] ? dump_stack+0x40/0x57
[<ffffffff81074451>] ? warn_slowpath_common+0x81/0xb0
[<ffffffff810744dc>] ? warn_slowpath_fmt+0x5c/0x80
[<ffffffff814b89e9>] ? dev_watchdog+0x229/0x240
[<ffffffff814b87c0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[<ffffffff810d6e90>] ? call_timer_fn+0x30/0xf0
[<ffffffff814b87c0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[<ffffffff810d861d>] ? run_timer_softirq+0x17d/0x2b0
[<ffffffff81078ca7>] ? __do_softirq+0x107/0x270
[<ffffffff81078f46>] ? irq_exit+0x86/0x90
[<ffffffff8158d90e>] ? smp_apic_timer_interrupt+0x3e/0x50
[<ffffffff8158b7a2>] ? apic_timer_interrupt+0x82/0x90
<EOI>
[<ffffffff8145ce08>] ? cpuidle_enter_state+0xe8/0x220
[<ffffffff8145cde3>] ? cpuidle_enter_state+0xc3/0x220
[<ffffffff810b3894>] ? cpu_startup_entry+0x294/0x350
[<ffffffff8104b600>] ? start_secondary+0x150/0x190

Debian Bug #798512

[<ffffffff81067797>] ? warn_slowpath_common+0x77/0x90
[<ffffffff810677fc>] ? warn_slowpath_fmt+0x4c/0x50
[<ffffffff81074777>] ? mod_timer+0x127/0x1e0
[<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240
[<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[<ffffffff81072ae1>] ? call_timer_fn+0x31/0x100
[<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[<ffffffff81074119>] ? run_timer_softirq+0x209/0x2f0
[<ffffffff8106c641>] ? __do_softirq+0xf1/0x290
[<ffffffff8106ca15>] ? irq_exit+0x95/0xa0

My /var/log/messages (3.6.14):
dmesg: e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
dmesg: e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
dmesg: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
May 24 18:44:55 sandbay kernel: [ 840.766377] <IRQ> [<ffffffff8150e835>] ? dump_stack+0x5d/0x78
May 24 18:44:55 sandbay kernel: [ 840.766391] [<ffffffff810677f7>] ? warn_slowpath_common+0x77/0x90
May 24 18:44:55 sandbay kernel: [ 840.766396] [<ffffffff8106785c>] ? warn_slowpath_fmt+0x4c/0x50
May 24 18:44:55 sandbay kernel: [ 840.766410] [<ffffffff81440f86>] ? dev_watchdog+0x236/0x240
May 24 18:44:55 sandbay kernel: [ 840.766418] [<ffffffff81440d50>] ? dev_graft_qdisc+0x70/0x70
May 24 18:44:55 sandbay kernel: [ 840.766424] [<ffffffff81072ba1>] ? call_timer_fn+0x31/0x100
May 24 18:44:55 sandbay kernel: [ 840.766435] [<ffffffff81440d50>] ? dev_graft_qdisc+0x70/0x70
May 24 18:44:55 sandbay kernel: [ 840.766439] [<ffffffff810741d9>] ? run_timer_softirq+0x209/0x2f0
May 24 18:44:55 sandbay kernel: [ 840.766444] [<ffffffff8106c6a1>] ? __do_softirq+0xf1/0x290
May 24 18:44:55 sandbay kernel: [ 840.766452] [<ffffffff8106ca75>] ? irq_exit+0x95/0xa0
May 24 18:44:55 sandbay kernel: [ 840.766457] [<ffffffff81517822>] ? do_IRQ+0x52/0xe0
May 24 18:44:55 sandbay kernel: [ 840.766465] [<ffffffff8151566d>] ? common_interrupt+0x6d/0x6d
May 24 18:44:55 sandbay kernel: [ 840.766467] <EOI> [<ffffffff813e011f>] ? cpuidle_enter_state+0x4f/0xc0
May 24 18:44:55 sandbay kernel: [ 840.766475] [<ffffffff813e0118>] ? cpuidle_enter_state+0x48/0xc0
May 24 18:44:55 sandbay kernel: [ 840.766483] [<ffffffff810a8398>] ? cpu_startup_entry+0x2f8/0x400
May 24 18:44:55 sandbay kernel: [ 840.766488] [<ffffffff81042cbf>] ? start_secondary+0x20f/0x2d0
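
To make the backtraces above easier to compare, here is a rough sketch that reduces a trace to just its function names. The name backtrace_functions is my own, and it assumes each frame looks like "? name+0xOFF/0xLEN" as in the traces quoted here; other kernel versions may format frames differently.

```shell
#!/bin/sh
# Hypothetical helper: read a kernel backtrace on stdin and print one
# function name per frame, in the order the frames appear.
# Assumes frames of the form "? func+0x41/0x51" (hex offset/length).
backtrace_functions() {
    grep -oE '\? [A-Za-z_][A-Za-z0-9_.]*\+0x[0-9a-f]+/0x[0-9a-f]+' \
        | sed -E 's/^\? //; s/\+0x.*$//'
}

# Succeeds if the trace on stdin passes through dev_watchdog, the
# frame that all of the traces quoted above have in common.
has_dev_watchdog() {
    backtrace_functions | grep -qx 'dev_watchdog'
}
```

For example, saving a trace to a file and running "backtrace_functions < trace.txt" on two reports gives two short lists that can be compared with diff.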

If you have this same problem, it would help to provide the output of the following shell commands:
- uname -a
- lspci -vv
- dmesg | grep e1000 # not 'grep e1000e'; we want to know whether conflicts between Intel Ethernet drivers exist
- ip -s link show # we want to know whether there are one or more Ethernet netdevices
- call stack backtrace (from dmesg or /var/log/messages)
- firmware version
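
To save some typing, the commands listed above could be gathered into one report file with something like the sketch below. The function name, the default file name, and the default interface eth0 are my own choices. I am also assuming ethtool is installed, since "ethtool -i" is one way to see the firmware version. Missing commands are noted in the report rather than treated as fatal.

```shell
#!/bin/sh
# Hypothetical helper: collect the requested diagnostics into one file.
# Usage: collect_e1000e_report [output-file] [interface]
collect_e1000e_report() {
    out="${1:-e1000e-report.txt}"
    iface="${2:-eth0}"
    : > "$out"    # start with an empty report

    # Run one command under a header; record failures instead of aborting.
    run_section() {
        title="$1"; shift
        printf '===== %s =====\n' "$title" >> "$out"
        if command -v "$1" >/dev/null 2>&1; then
            "$@" >> "$out" 2>&1 || printf '(command failed: %s)\n' "$*" >> "$out"
        else
            printf '(command not found: %s)\n' "$1" >> "$out"
        fi
    }

    run_section "uname -a" uname -a
    run_section "lspci -vv" lspci -vv
    run_section "ip -s link show" ip -s link show
    # Assumption: ethtool is available; -i prints driver and firmware-version.
    run_section "ethtool -i $iface" ethtool -i "$iface"

    # grep e1000 (not e1000e) so that lines from other Intel drivers show up too.
    printf '===== dmesg | grep e1000 =====\n' >> "$out"
    dmesg 2>/dev/null | grep e1000 >> "$out" \
        || printf '(no e1000 lines, or dmesg not readable)\n' >> "$out"

    printf 'wrote %s\n' "$out"
}
```

Running "collect_e1000e_report report.txt eth0" as root gives the most complete lspci and dmesg output. The call stack backtrace still has to be copied from dmesg or the system log by hand.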

