Stephen Hemminger <[EMAIL PROTECTED]> writes:

> 4) What is the IRQ routing?
> There are two issues here, first the driver will never work with edge
> trigger IRQ's, some motherboards also have busted BIOS and chipsets
> that don't do MSI properly. A couple of module parameters are available
> to help:
>     disable_msi=1 avoids using MSI
>     idle_timeout=10 polls for lost IRQ's every N ms (10)
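One way to check whether disable_msi=1 took effect is to look at the interrupt controller name that /proc/interrupts reports for the device. The sketch below is only an illustration: `interrupt_chip` is a made-up helper name, the CPU0 header line in the sample is an assumption about the usual file layout, and the parsing assumes the simple "IRQ: counts chip devices" format of kernels from this era. An MSI interrupt would normally show a chip name starting with PCI-MSI, while a routed legacy interrupt shows something like IO-APIC-fasteoi.

```python
from typing import Optional


def interrupt_chip(proc_interrupts: str, device: str) -> Optional[str]:
    """Return the interrupt controller name listed for `device`, or None.

    Assumes the classic layout: "IRQ: count [count ...] chip dev1, dev2".
    """
    for line in proc_interrupts.splitlines():
        _irq, sep, rest = line.partition(":")
        if not sep:
            continue  # header line with CPU names has no colon
        fields = rest.split()
        # Drop the per-CPU interrupt counters at the start of the line.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if len(fields) < 2:
            continue  # summary lines such as "ERR: 0" carry no device list
        chip = fields[0]
        devices = [d.strip() for d in " ".join(fields[1:]).split(",")]
        if device in devices:
            return chip
    return None


# Sample input; the header line is an assumed example of the usual layout.
sample = """\
           CPU0
 17:        203   IO-APIC-fasteoi   eth0, CMI8738
"""

chip = interrupt_chip(sample, "eth0")
print(chip, "(MSI)" if chip and chip.startswith("PCI-MSI") else "(not MSI)")
```

In practice the text would come from reading /proc/interrupts on the affected machine rather than from a hard-coded string.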
it didn't take long to lock up the machine again. i've rebooted back
into stock 2.6.20-rc1 and added the two module parameters above.
cat /proc/interrupts now gives me:

 17: 203 IO-APIC-fasteoi eth0, CMI8738

so i guess the MSI interrupts are disabled. we'll see how this works.

> 5) What are the messages in the console log when problem happens?

kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: sky2 eth0: tx timeout
kernel: sky2 eth0: transmit ring 402 .. 361 report=406 done=406
kernel: sky2 status report lost?
kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: sky2 eth0: tx timeout
kernel: sky2 eth0: transmit ring 406 .. 361 report=406 done=406
kernel: sky2 hardware hung? flushing
kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: sky2 eth0: tx timeout
kernel: sky2 eth0: transmit ring 361 .. 321 report=406 done=406
kernel: sky2 status report lost?
kernel: NETDEV WATCHDOG: eth0: transmit timed out
kernel: sky2 eth0: tx timeout
kernel: sky2 eth0: transmit ring 406 .. 366 report=406 done=406
kernel: sky2 hardware hung? flushing

> 7) Please get a current version of ethtool from:
>    git://git.kernel.org/pub/scm/network/ethtool/ethtool.git
> and run ethtool register dump after a problem occurs:
>    ethtool -d eth0

this is the output after it stopped working:

PCI config
----------
00: ab 11 62 43 07 04 18 00 15 00 00 02 08 00 00 00
10: 04 c0 df fd 00 00 00 00 01 ce 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 62 14 8c 05
30: 00 00 00 00 48 00 00 00 00 00 00 00 03 01 00 00
40: 00 00 f0 01 00 80 a0 01 01 50 02 fe 00 20 00 14
50: 03 5c 00 80 00 00 00 01 00 00 00 01 05 e0 83 00
60: 0c 10 e0 fe 00 00 00 00 61 41 00 00 00 00 00 00
70: 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Control Registers
-----------------
Register Access Port 0x00
LED Control/Status 0xA603164A
Interrupt Source 0x40000000
Interrupt Mask 0xC000001D
Interrupt Hardware Error Source 0x00000000
Interrupt Hardware Error Mask 0x2E003F3F

Bus Management Unit
-------------------
CSR Receive Queue 1 0x00010000
CSR Sync Queue 1 0xFFFFFFFF
CSR Async Queue 1 0x00000000

MAC Addresses
---------------
Addr 1 00 11 09 DA 39 A3
Addr 2 00 11 09 DA 39 A3
Addr 3 00 00 00 00 00 00

Connector type 0x4A (J)
PMD type 0x54 (T)
PHY type 0x80
Chip Id 0xB6 Yukon-2 EC (rev 0)
Ram Buffer 0x0C

Status BMU:
-----------
Control 0x0002220A
Last Index 0x07FF
Put Index 0x0601
List Address 0x000000007FBF8000
Transmit 1 done index 0x0196
Transmit index threshold 0x000A

Status FIFO
  Write Pointer 0x16
  Read Pointer 0x16
  Level 0x00
  Watermark 0x10
  ISR Watermark 0x10
Status level
  Init 0x000030D4 Value 0x00000D00
  Test 0x04 Control 0x02
TX status
  Init 0x0001E848 Value 0x0001E848
  Test 0x04 Control 0x02
ISR
  Init 0x000009C4 Value 0x000009C4
  Test 0x04 Control 0x02

GMAC control 0x005A
GPHY control 0x2002
LINK control 0x02

GMAC 1
  Status 0xD000
  Control 0x1800
  Transmit 0x1000
  Receive 0xE000
  Transmit flow control 0xFFFF
  Transmit parameter 0xD7C4
  Serial mode 0x221E
  Source address: 00 11 09 DA 39 A3
  Physical address: 00 11 09 DA 39 A3

Rx GMAC 1
  End Address 0x0000007F
  Almost Full Thresh 0x00000070
  Control/Test 0x0900228A
  FIFO Flush Mask 0x000018FB
  FIFO Flush Threshold 0x0000000B
  Truncation Threshold 0x0000017C
  Upper Pause Threshold 0x00000000
  Lower Pause Threshold 0x00000081
  VLAN Tag 0x00000074
  FIFO Write Pointer 0x00000000
  FIFO Write Level 0x0000007B
  FIFO Read Pointer 0x00000000
  FIFO Read Level 0x00000079

Tx GMAC 1
  End Address 0x0000007F
  Almost Full Thresh 0x00000010
  Control/Test 0x0102220A
  FIFO Flush Mask 0x00000000
  FIFO Flush Threshold 0x00000000
  Truncation Threshold 0x00000000
  Upper Pause Threshold 0x00000000
  Lower Pause Threshold 0x00000081
  VLAN Tag 0x0000002A
  FIFO Write Pointer 0x0000002A
  FIFO Write Level 0x00000000
  FIFO Read Pointer 0x00000000
  FIFO Read Level 0x0000002A

Receive Queue 1
---------------
Buffer control 0x05F8
Byte Counter 49408
Descriptor Address 0x0000000076F4F810
Status 0x05EA0100
Timestamp 0x00000000
BMU Control/Status 0x000061AA
Done 0x0000
Request 0x0000000076F4F810
Csum1 Offset 52057 Piston 14
Csum2 Offset 52057 Positing 14

Sync Transmit Queue 1
---------------
Descriptor Address 0x0000000000000000
Address Counter 0x0000000000000000
Current Byte Counter 0
BMU Control/Status 0x00000000
Flag & FIFO Address 0x00000000
Control 0x00000000
Next 0x00000000
Data 0x0000000000000000
Status 0x00000000
Timestamp 0x00000000
Csum Start 0x0000 Pos 0 Write 0

Async Transmit Queue 1
---------------
Buffer control 0x053D
Byte Counter 49950
Descriptor Address 0x0000000047237000
Status 0x000005EA
Timestamp 0x00010000
BMU Control/Status 0x800011AA
Done 0x0000
Request 0x000000004723753D
Csum Start 0x0032 Pos 0 Write 0

Receive RAMbuffer 1
---------------
Start Address 0x00000000
End Address 0x00000E7F
Write Pointer 0x00000079
Read Pointer 0x0000007E
Upper Threshold/Pause Packets 0x00000D80
Lower Threshold/Pause Packets 0x000003A0
Upper Threshold/High Priority 0x00000AE0
Lower Threshold/High Priority 0x00000740
Packet Counter 0x00000029
Level 0x00000E7B
Test 0x0002221A

Sync Transmit RAMbuffer 1
---------------
Start Address 0x00000000
End Address 0x00000000
Write Pointer 0x00000000
Read Pointer 0x00000000
Packet Counter 0x00000000
Level 0x00000000
Test 0x00000000

Async Transmit RAMbuffer 1
---------------
Start Address 0x00000E80
End Address 0x000017FF
Write Pointer 0x0000132A
Read Pointer 0x0000132A
Packet Counter 0x00000000
Level 0x00000000
Test 0x0002222A

i don't know if it helps but i am also including the output of ethtool
while the card was still working:

PCI config
----------
00: ab 11 62 43 07 04 10 00 15 00 00 02 08 00 00 00
10: 04 c0 df fd 00 00 00 00 01 ce 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 62 14 8c 05
30: 00 00 00 00 48 00 00 00 00 00 00 00 03 01 00 00
40: 00 00 f0 01 00 80 a0 01 01 50 02 fe 00 20 00 14
50: 03 5c 00 80 00 00 00 01 00 00 00 01 05 e0 83 00
60: 0c 10 e0 fe 00 00 00 00 61 41 00 00 00 00 00 00
70: 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Control Registers
-----------------
Register Access Port 0x00
LED Control/Status 0xA603164A
Interrupt Source 0x00000000
Interrupt Mask 0xC000001D
Interrupt Hardware Error Source 0x00000000
Interrupt Hardware Error Mask 0x2E003F3F

Bus Management Unit
-------------------
CSR Receive Queue 1 0x00010000
CSR Sync Queue 1 0xFFFFFFFF
CSR Async Queue 1 0x00000000

MAC Addresses
---------------
Addr 1 00 11 09 DA 39 A3
Addr 2 00 11 09 DA 39 A3
Addr 3 00 00 00 00 00 00

Connector type 0x4A (J)
PMD type 0x54 (T)
PHY type 0x80
Chip Id 0xB6 Yukon-2 EC (rev 0)
Ram Buffer 0x0C

Status BMU:
-----------
Control 0x0002220A
Last Index 0x07FF
Put Index 0x00B8
List Address 0x000000007FBF8000
Transmit 1 done index 0x0057
Transmit index threshold 0x000A

Status FIFO
  Write Pointer 0x08
  Read Pointer 0x08
  Level 0x00
  Watermark 0x10
  ISR Watermark 0x10
Status level
  Init 0x000030D4 Value 0x000030D4
  Test 0x04 Control 0x02
TX status
  Init 0x0001E848 Value 0x0001E848
  Test 0x04 Control 0x02
ISR
  Init 0x000009C4 Value 0x000009C4
  Test 0x04 Control 0x02

GMAC control 0x005A
GPHY control 0x2002
LINK control 0x02

GMAC 1
  Status 0xD000
  Control 0x1800
  Transmit 0x1000
  Receive 0xE000
  Transmit flow control 0xFFFF
  Transmit parameter 0xD7C4
  Serial mode 0x221E
  Source address: 00 11 09 DA 39 A3
  Physical address: 00 11 09 DA 39 A3

Rx GMAC 1
  End Address 0x0000007F
  Almost Full Thresh 0x00000070
  Control/Test 0x0900228A
  FIFO Flush Mask 0x000018FB
  FIFO Flush Threshold 0x0000000B
  Truncation Threshold 0x0000017C
  Upper Pause Threshold 0x00000000
  Lower Pause Threshold 0x00000081
  VLAN Tag 0x00000027
  FIFO Write Pointer 0x00000000
  FIFO Write Level 0x00000000
  FIFO Read Pointer 0x00000000
  FIFO Read Level 0x00000027

Tx GMAC 1
  End Address 0x0000007F
  Almost Full Thresh 0x00000010
  Control/Test 0x0102220A
  FIFO Flush Mask 0x00000000
  FIFO Flush Threshold 0x00000000
  Truncation Threshold 0x00000000
  Upper Pause Threshold 0x00000000
  Lower Pause Threshold 0x00000081
  VLAN Tag 0x00000032
  FIFO Write Pointer 0x00000032
  FIFO Write Level 0x00000000
  FIFO Read Pointer 0x00000000
  FIFO Read Level 0x00000032

Receive Queue 1
---------------
Buffer control 0x05F8
Byte Counter 49408
Descriptor Address 0x000000001727E010
Status 0x003C0100
Timestamp 0x00000000
BMU Control/Status 0x000061AA
Done 0x0000
Request 0x000000001727E010
Csum1 Offset 12632 Piston 14
Csum2 Offset 12632 Positing 14

Sync Transmit Queue 1
---------------
Descriptor Address 0x0000000000000000
Address Counter 0x0000000000000000
Current Byte Counter 0
BMU Control/Status 0x00000000
Flag & FIFO Address 0x00000000
Control 0x00000000
Next 0x00000000
Data 0x0000000000000000
Status 0x00000000
Timestamp 0x00000000
Csum Start 0x0000 Pos 0 Write 0

Async Transmit Queue 1
---------------
Buffer control 0x06CC
Byte Counter 49950
Descriptor Address 0x0000000046AD23C6
Status 0x000005EA
Timestamp 0x00010000
BMU Control/Status 0x800011AA
Done 0x0000
Request 0x0000000046AD2A92
Csum Start 0x0032 Pos 0 Write 0

Receive RAMbuffer 1
---------------
Start Address 0x00000000
End Address 0x00000E7F
Write Pointer 0x00000427
Read Pointer 0x00000427
Upper Threshold/Pause Packets 0x00000D80
Lower Threshold/Pause Packets 0x000003A0
Upper Threshold/High Priority 0x00000AE0
Lower Threshold/High Priority 0x00000740
Packet Counter 0x00000000
Level 0x00000000
Test 0x0002221A

Sync Transmit RAMbuffer 1
---------------
Start Address 0x00000000
End Address 0x00000000
Write Pointer 0x00000000
Read Pointer 0x00000000
Packet Counter 0x00000000
Level 0x00000000
Test 0x00000000

Async Transmit RAMbuffer 1
---------------
Start Address 0x00000E80
End Address 0x000017FF
Write Pointer 0x000017B2
Read Pointer 0x000017B2
Packet Counter 0x00000000
Level 0x00000000
Test 0x0002222A

i'll try to lock up the networking again and if it still happens i'll
switch to the vendor driver and see what that has to say.

--alex--

--
| I believe the moment is at hand when, by a paranoiac and active |
| advance of the mind, it will be possible (simultaneously with   |
| automatism and other passive states) to systematize confusion   |
| and thus to help to discredit completely the world of reality.  |
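Since the two register dumps above share the same layout, the registers that changed between the hung and the working state can be picked out by comparing them position by position. The following is a minimal Python sketch, assuming each dump has been saved as text with one register per line; `changed_registers` is a made-up name for illustration, and the short sample strings only reuse a few values from the dumps above.

```python
def changed_registers(hung: str, working: str):
    """Pair up non-empty lines by position and return the pairs that differ.

    This only makes sense when both dumps have the same layout, which is
    an assumption here, not something the function checks.
    """
    hung_lines = [l.strip() for l in hung.splitlines() if l.strip()]
    work_lines = [l.strip() for l in working.splitlines() if l.strip()]
    return [(h, w) for h, w in zip(hung_lines, work_lines) if h != w]


# A few lines taken from the two dumps, one register per line.
hung_sample = """\
Interrupt Source 0x40000000
Interrupt Mask 0xC000001D
Put Index 0x0601
Transmit 1 done index 0x0196
"""

working_sample = """\
Interrupt Source 0x00000000
Interrupt Mask 0xC000001D
Put Index 0x00B8
Transmit 1 done index 0x0057
"""

for hung_line, working_line in changed_registers(hung_sample, working_sample):
    print(f"hung:    {hung_line}")
    print(f"working: {working_line}")
```

Counters and ring pointers will differ between any two snapshots, so the output is only a starting point for spotting the registers that matter, such as the pending Interrupt Source bit in the hung dump.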