On 5/7/07, Florian Kulzer <[EMAIL PROTECTED]> wrote:
On Sat, May 05, 2007 at 14:01:45 -0700, jason.public wrote:
> Every once in a while, my eth1 interface dies with the following messages.
>
> # dmesg
> [...]
> APIC error on CPU0: 02(02)
> APIC error on CPU0: 02(02)
> APIC error on CPU0: 02(02)
> APIC error on CPU0: 02(02)
> APIC error on CPU0: 02(02)
> APIC error on CPU0: 02(02)
You might want to try booting with the option "noapic". Maybe the
network interface hang-ups are caused by a problem with the interrupts.
> NETDEV WATCHDOG: eth1_rename: transmit timed out
> eth1_rename: transmit timed out, tx_status 00 status 6000.
> diagnostics: net 0cc0 media 8802 dma 005000a1 fifo 0000
> Flags; bus-master 1, dirty 392071(7) current 392087(7)
> Transmit list 1f030660 vs. df030660.
> 0: @df030200 length 80000042 status 00000042
> 1: @df0302a0 length 80000042 status 00000042
> 2: @df030340 length 8000002a status 0000002a
> 3: @df0303e0 length 8000002a status 0000002a
> 4: @df030480 length 8000004d status 0000004d
> 5: @df030520 length 8000002a status 8000002a
> 6: @df0305c0 length 8000002a status 8000002a
> 7: @df030660 length 800000ea status 000000ea
> 8: @df030700 length 800000ea status 000000ea
> 9: @df0307a0 length 8000004a status 0000004a
> 10: @df030840 length 8000004c status 0000004c
> 11: @df0308e0 length 8000004c status 0000004c
> 12: @df030980 length 8000004d status 0000004d
> 13: @df030a20 length 8000004d status 0000004d
> 14: @df030ac0 length 8000004a status 0000004a
> 15: @df030b60 length 8000004a status 0000004a
> eth1_rename: Resetting the Tx ring pointer.
>
> Either I have to restart it with "/etc/init.d/networking restart", or
> it restarts by itself after a minute or so. Any idea what the problem
> is?
Which ethernet card are we talking about? Please post the relevant part
of the output of "lspci", as well as the output of "/sbin/ifconfig".
I am also not too happy about seeing "eth1_rename" up there. This may
not be related to your main problem, but we might as well fix it while
we are wrangling with the networking anyway. Therefore I would like to
see the content of your /etc/udev/rules.d/z25_persistent-net.rules file.
I'll try booting with the noapic option, but it might take a while to
see results, as this only happens a couple of times a week on average.
There are two ethernet cards, but it's the 3Com that has been causing
problems. When eth1_rename dies, it usually causes eth0 (the Realtek)
to stop working as well.
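Since the hang only shows up a couple of times a week, it may help to count the relevant kernel messages before and after switching to "noapic". Below is a minimal sketch, assuming Python is available and the kernel log has been saved as text (for example from dmesg). The function name and the counter keys are made up for illustration, and the two patterns are taken only from the messages quoted above.

```python
import re
from collections import Counter

# Patterns copied from the kernel messages quoted in this thread.
APIC_RE = re.compile(r"APIC error on CPU(\d+)")
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def summarize_kernel_log(text):
    """Count APIC errors per CPU and watchdog Tx timeouts per interface."""
    counts = Counter()
    for line in text.splitlines():
        match = APIC_RE.search(line)
        if match:
            counts[f"apic_error_cpu{match.group(1)}"] += 1
            continue
        match = WATCHDOG_RE.search(line)
        if match:
            counts[f"tx_timeout_{match.group(1)}"] += 1
    return dict(counts)
```

Running it on a saved log from a week with the default boot options and on one from a week with "noapic" gives two sets of counts to compare. Fewer APIC errors together with fewer timeouts would support the interrupt theory, but would not prove it.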
# lspci|grep Ethernet
00:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
00:0a.0 Ethernet controller: 3Com Corporation 3c905 100BaseTX [Boomerang]
Here's the output of ifconfig:
# ifconfig
eth0      Link encap:Ethernet HWaddr 00:08:54:B3:70:2E
          inet addr:192.168.1.5 Bcast:192.168.1.255 Mask:255.255.255.0
          inet6 addr: fe80::208:54ff:feb3:702e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:5213103 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20599066 errors:0 dropped:46 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:429758527 (409.8 MiB) TX bytes:363937475 (347.0 MiB)
          Interrupt:209 Base address:0x4000

eth1_rena Link encap:Ethernet HWaddr 00:60:97:7D:43:37
          inet addr:192.168.0.3 Bcast:192.168.0.255 Mask:255.255.255.0
          inet6 addr: fe80::260:97ff:fe7d:4337/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:1645586 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1372484 errors:2 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1575637794 (1.4 GiB) TX bytes:165806431 (158.1 MiB)
          Interrupt:201 Base address:0xa000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:287 errors:0 dropped:0 overruns:0 frame:0
          TX packets:287 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:536268 (523.6 KiB) TX bytes:536268 (523.6 KiB)
And the contents of /etc/udev/rules.d/z25_persistent-net.rules:
# This file was automatically generated by the /lib/udev/write_net_rules
# program, probably run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single line.
# PCI device 10b7:9050 (3c59x)
SUBSYSTEM=="net", DRIVERS=="?*", SYSFS{address}=="00:60:97:7d:43:37", NAME="eth0"
# PCI device 0x10ec:0x8169 (r8169)
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:08:54:b3:70:2e", NAME="eth0"
The "_rename" suffix used to bother me (it doesn't make sense for a
developer to assume that a user will want to figure out how to rename
a device), but I've gotten used to it.
I've searched Google for this error, and a lot of discussions about
bugs in a recent kernel come up. Maybe the solution will be to
upgrade my kernel once the bug is fixed. Any other ideas?
Thanks,
Jason