I have the same problem on Debian as three other reporters.
As a former Ethernet driver developer, I noticed that the queue was empty
when the interrupt fired, and that it appeared to hang in the Linux qdisc
layer in interrupt context, to the point that the watchdog timer expired.
My relevant details are:
Dell OptiPlex 980
3.16.0-4-amd64
linux/3.16.7-ckt25-2 (2016-04-08) x86_64
Intel Gigabit Ethernet 82578DM Gigabit Network Connection (rev 05)
From what I've gathered from the potentially duplicate bug #798512 and the
Intel Community Forums:
1. It isn't CPU-related
2. This error happened in the following Linux kernel versions:
a. 3.16.0-4-amd64
b. 3.19.5 (source: Intel communities)
c. 4.3+70~bpo8+1
d. 3.16.7-ckt11-1
3. This error does NOT happen in the following Linux kernel versions
(take this with a grain of salt, since we do not yet have a reliable way
to reproduce the bug):
a. 3.16.7-ckt20-1+deb8u4
4. Intel's own driver was used but the error still occurred:
a. 3.3.3-NAPI
5. Intel hardware showing this problem:
a. Intel I217-V (rev 04) (onboard) (has lspci SERR-)
b. Intel 82578DM (rev 05) (onboard) (has lspci SERR+)
c. Intel Corporation 82579V Gigabit Network Connection (rev 05) (onboard)
6. Linux network
a. eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
pfifo_fast state UP group default qlen 1000
b. eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
master br0 state UP mode DEFAULT group default qlen 1000
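To compare the `ip link` lines from different reports (qdisc, bridge membership, queue length), a small parser can help. This is only a rough sketch: the function name `parse_link_line` is made up for illustration, and it assumes everything after the flags is a flat run of "key value" pairs, which holds for the two lines quoted above but not for every line `ip link` can print.

```python
import re

# Matches the summary line printed by `ip link show`, with or without the
# leading interface index, e.g.
#   "eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast ..."
LINK_RE = re.compile(
    r"^(?:\d+:\s+)?(?P<name>[^:\s]+):\s+<(?P<flags>[^>]*)>\s+(?P<rest>.*)$"
)


def parse_link_line(line):
    """Return the name, flags and key/value attributes of one `ip link` line.

    Hypothetical helper: assumes the text after the flags is a sequence of
    "key value" pairs (mtu 1500, qdisc pfifo_fast, ...).
    """
    # Collapse whitespace so a line wrapped by a mail client still parses.
    m = LINK_RE.match(" ".join(line.split()))
    if m is None:
        raise ValueError("not an `ip link` summary line: %r" % line)
    tokens = m.group("rest").split()
    attrs = dict(zip(tokens[0::2], tokens[1::2]))
    return {
        "name": m.group("name"),
        "flags": m.group("flags").split(","),
        "attrs": attrs,
    }
```

With the two lines from item 6, `attrs["qdisc"]` would be `pfifo_fast` in both cases, and only the second line has a `master` entry (`br0`).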
So far, the common threads among the similar reports are the following (more
reports will eliminate a few):
1. e1000e driver
2. ip link shows 'qdisc pfifo_fast'
3. onboard Ethernet (PCI-related?)
4. Starting at Linux 3.16.0
5. Count of dropped outgoing IP packets was non-zero (mostly 32 packets)
6. Similar call stack backtraces:
Bug #777683 call stack backtrace
[ 295.041406] <IRQ> [<ffffffff8150b405>] ? dump_stack+0x41/0x51
[ 295.041417] [<ffffffff81067797>] ? warn_slowpath_common+0x77/0x90
[ 295.041420] [<ffffffff810677fc>] ? warn_slowpath_fmt+0x4c/0x50
[ 295.041425] [<ffffffff81074777>] ? mod_timer+0x127/0x1e0
[ 295.041430] [<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240
[ 295.041433] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041436] [<ffffffff81072ae1>] ? call_timer_fn+0x31/0x100
[ 295.041439] [<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[ 295.041442] [<ffffffff81074119>] ? run_timer_softirq+0x209/0x2f0
[ 295.041445] [<ffffffff8106c641>] ? __do_softirq+0xf1/0x290
[ 295.041448] [<ffffffff8106ca15>] ? irq_exit+0x95/0xa0
[ 295.041451] [<ffffffff81514455>] ? smp_apic_timer_interrupt+0x45/0x60
[ 295.041455] [<ffffffff8151253d>] ? apic_timer_interrupt+0x6d/0x80
[ 295.041456] <EOI> [<ffffffff81074a26>] ? get_next_timer_interrupt+0x1d6/0x250
[ 295.041465] [<ffffffff813ddf9f>] ? cpuidle_enter_state+0x4f/0xc0
[ 295.041468] [<ffffffff813ddf98>] ? cpuidle_enter_state+0x48/0xc0
[ 295.041472] [<ffffffff810a7fa8>] ? cpu_startup_entry+0x2f8/0x400
[ 295.041475] [<ffffffff81903071>] ? start_kernel+0x492/0x49d
[ 295.041478] [<ffffffff81902a04>] ? set_init_arg+0x4e/0x4e
[ 295.041480] [<ffffffff81902120>] ? early_idt_handlers+0x120/0x120
[ 295.041483] [<ffffffff8190271f>] ? x86_64_start_kernel+0x14d/0x15c
[ 295.041485] ---[ end trace aaf46f7eeccba58f ]---
[ 295.041502] e1000e 0000:00:19.0 eth-office: Reset adapter unexpectedly
Intel Community Forums (Intel 3.3.3-NAPI driver):
(source: https://communities.intel.com/message/305442#305442)
<IRQ>
[<ffffffff812e1ac9>] ? dump_stack+0x40/0x57
[<ffffffff81074451>] ? warn_slowpath_common+0x81/0xb0
[<ffffffff810744dc>] ? warn_slowpath_fmt+0x5c/0x80
[<ffffffff814b89e9>] ? dev_watchdog+0x229/0x240
[<ffffffff814b87c0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[<ffffffff810d6e90>] ? call_timer_fn+0x30/0xf0
[<ffffffff814b87c0>] ? dev_deactivate_queue.constprop.34+0x60/0x60
[<ffffffff810d861d>] ? run_timer_softirq+0x17d/0x2b0
[<ffffffff81078ca7>] ? __do_softirq+0x107/0x270
[<ffffffff81078f46>] ? irq_exit+0x86/0x90
[<ffffffff8158d90e>] ? smp_apic_timer_interrupt+0x3e/0x50
[<ffffffff8158b7a2>] ? apic_timer_interrupt+0x82/0x90
<EOI>
[<ffffffff8145ce08>] ? cpuidle_enter_state+0xe8/0x220
[<ffffffff8145cde3>] ? cpuidle_enter_state+0xc3/0x220
[<ffffffff810b3894>] ? cpu_startup_entry+0x294/0x350
[<ffffffff8104b600>] ? start_secondary+0x150/0x190
Debian Bug #798512
[<ffffffff81067797>] ? warn_slowpath_common+0x77/0x90
[<ffffffff810677fc>] ? warn_slowpath_fmt+0x4c/0x50
[<ffffffff81074777>] ? mod_timer+0x127/0x1e0
[<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240
[<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[<ffffffff81072ae1>] ? call_timer_fn+0x31/0x100
[<ffffffff8143e960>] ? dev_graft_qdisc+0x70/0x70
[<ffffffff81074119>] ? run_timer_softirq+0x209/0x2f0
[<ffffffff8106c641>] ? __do_softirq+0xf1/0x290
[<ffffffff8106ca15>] ? irq_exit+0x95/0xa0
My /var/log/messages (3.6.14):
dmesg: e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
dmesg: e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
dmesg: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
May 24 18:44:55 sandbay kernel: [ 840.766377] <IRQ> [<ffffffff8150e835>] ? dump_stack+0x5d/0x78
May 24 18:44:55 sandbay kernel: [ 840.766391] [<ffffffff810677f7>] ? warn_slowpath_common+0x77/0x90
May 24 18:44:55 sandbay kernel: [ 840.766396] [<ffffffff8106785c>] ? warn_slowpath_fmt+0x4c/0x50
May 24 18:44:55 sandbay kernel: [ 840.766410] [<ffffffff81440f86>] ? dev_watchdog+0x236/0x240
May 24 18:44:55 sandbay kernel: [ 840.766418] [<ffffffff81440d50>] ? dev_graft_qdisc+0x70/0x70
May 24 18:44:55 sandbay kernel: [ 840.766424] [<ffffffff81072ba1>] ? call_timer_fn+0x31/0x100
May 24 18:44:55 sandbay kernel: [ 840.766435] [<ffffffff81440d50>] ? dev_graft_qdisc+0x70/0x70
May 24 18:44:55 sandbay kernel: [ 840.766439] [<ffffffff810741d9>] ? run_timer_softirq+0x209/0x2f0
May 24 18:44:55 sandbay kernel: [ 840.766444] [<ffffffff8106c6a1>] ? __do_softirq+0xf1/0x290
May 24 18:44:55 sandbay kernel: [ 840.766452] [<ffffffff8106ca75>] ? irq_exit+0x95/0xa0
May 24 18:44:55 sandbay kernel: [ 840.766457] [<ffffffff81517822>] ? do_IRQ+0x52/0xe0
May 24 18:44:55 sandbay kernel: [ 840.766465] [<ffffffff8151566d>] ? common_interrupt+0x6d/0x6d
May 24 18:44:55 sandbay kernel: [ 840.766467] <EOI> [<ffffffff813e011f>] ? cpuidle_enter_state+0x4f/0xc0
May 24 18:44:55 sandbay kernel: [ 840.766475] [<ffffffff813e0118>] ? cpuidle_enter_state+0x48/0xc0
May 24 18:44:55 sandbay kernel: [ 840.766483] [<ffffffff810a8398>] ? cpu_startup_entry+0x2f8/0x400
May 24 18:44:55 sandbay kernel: [ 840.766488] [<ffffffff81042cbf>] ? start_secondary+0x20f/0x2d0
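To check the "similar call stack" claim mechanically rather than by eye, the function names can be pulled out of each backtrace and intersected. The following is a minimal sketch, not a tested tool: `frame_symbols` and `common_symbols` are names invented here, and the regular expression assumes the `[<address>] ? symbol+0xoff/0xlen` frame format seen in the traces above (it also tolerates a missing leading `[` and frames wrapped across lines).

```python
import re

# One frame looks like "[<ffffffff8143eb96>] ? dev_watchdog+0x236/0x240".
# The pattern starts at "<" so a lost leading "[" does not matter.
FRAME_RE = re.compile(
    r"<[0-9a-f]{16}>\]\s+\?\s+([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frame_symbols(trace_text):
    """Return the function names of a backtrace, in order of appearance."""
    # Collapse whitespace first so frames wrapped across lines still match.
    return FRAME_RE.findall(" ".join(trace_text.split()))


def common_symbols(*traces):
    """Return symbols present in every trace, ordered as in the first one."""
    lists = [frame_symbols(t) for t in traces]
    if not lists:
        return []
    shared = set(lists[0]).intersection(*lists[1:])
    seen = set()
    result = []
    for symbol in lists[0]:
        if symbol in shared and symbol not in seen:
            seen.add(symbol)
            result.append(symbol)
    return result
```

Fed the four traces above, the shared frames should come out as the timer softirq path: warn_slowpath_common, warn_slowpath_fmt, dev_watchdog, call_timer_fn, run_timer_softirq, __do_softirq and irq_exit. Frames such as dev_graft_qdisc or mod_timer do not appear in every report.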
Those who have this same problem can help by providing the output of the
following shell commands:
- uname -a
- lspci -vv
- dmesg | grep e1000 # not 'grep e1000e'; we want to know whether
conflicts between Intel Ethernet drivers exist
- ip -s link show # we want to know whether there are one or more
Ethernet netdevices
- call stack backtrace (from dmesg or /var/log/messages)
- firmware version
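For convenience, the commands above could be gathered into one report with something like the sketch below. Several things here are assumptions rather than anything established in this bug: the helper names `run` and `build_report`, the use of `ethtool -i` to obtain the firmware version (it prints a firmware-version field, but the ethtool package must be installed), and the interface name `eth0`. The call stack backtrace still has to be copied by hand from dmesg or the system log.

```python
import subprocess

# Commands from the list above. The last entry is an assumption: it relies
# on ethtool being installed and on the interface being called eth0.
COMMANDS = [
    ["uname", "-a"],
    ["lspci", "-vv"],
    ["dmesg"],
    ["ip", "-s", "link", "show"],
    ["ethtool", "-i", "eth0"],
]


def run(argv):
    """Run one command and return its output, or a note if it cannot run."""
    try:
        done = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        return done.stdout
    except OSError as exc:  # command missing or not executable
        return "(could not run: %s)\n" % exc


def build_report(commands=COMMANDS, runner=run):
    """Return one text report with the output of every command."""
    parts = []
    for argv in commands:
        out = runner(argv)
        if argv == ["dmesg"]:
            # Keep every line mentioning e1000, not only e1000e, so that a
            # second Intel Ethernet driver would show up as well.
            out = "".join(
                line for line in out.splitlines(True) if "e1000" in line
            )
        parts.append("$ %s\n%s" % (" ".join(argv), out))
    return "\n".join(parts)
```

Calling `print(build_report())` as root (lspci -vv and dmesg may need it) produces text that can be pasted into a bug report. The `runner` parameter exists only so the function can be exercised without touching the real system.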