Summary: Broadcom 5762 NIC locks up under heavy load.
Description:

The tg3 Broadcom network driver, when bound to the BCM5762 chipset, locks up under heavy network load. When this happens, a reboot is usually necessary to recover networking. Sometimes taking the interface down and bringing it back up (via ifconfig) is enough. I have also tested with the latest tg3 driver, 3.137h (December 2014 version), and networking is still problematic. Disabling TSO, GSO, etc. with ethtool does not help either; the bug still surfaces.

This bug may be related to the integrated firmware, because at the time of the crash the memory dump of the BCM5762 chip is completely cleared out to 0xFF.

The issue is hard to reproduce under moderate network load, so here is the procedure I use to trigger it:

1. Boot a machine with a Broadcom 5762 NIC (e.g. HP EliteDesk 705) from an Ubuntu/Kubuntu Live CD, 14.04 to 15.04, or use a build with the latest mainline kernel.

2. From another machine, start 5 sessions and, in each session, repeatedly copy a 70 MB file back and forth to the tg3 machine (scp with public key authentication). I am not sure this step is necessary.

3. Create a 1 GB file on the tg3 machine, with something like:

       dd if=/dev/urandom of=/my_test_file bs=1024 count=$((1024*1000))

4. From another machine, repeatedly secure-copy that 1 GB file from the tg3 machine. This can be done with something like:

       while true; do
           scp -i /my/scp/private.key u...@ip.of.tg3:/my_test_file /tmp
       done

Networking locks up after about 10 to 30 minutes; in some rare cases it takes up to 4 hours of run time. Running multiple instances of the 1 GB file transfer significantly shortens the time to failure.

Keywords: networking, tg3

Kernel version: Linux version 4.0.0-gbf70def. I have also tested with the following kernel versions: 3.17, 3.16, 2.6.39.
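An endless loop like the one in step 4 does not show when the lockup actually hit. As a rough sketch only, the copy can be wrapped in a small helper that counts successful transfers and stops at the first failure. The helper name copy_until_failure is made up for illustration and is not part of the original test setup.

```shell
#!/bin/sh
# copy_until_failure MAX CMD [ARGS...]
# Hypothetical helper (not from the original report): run CMD up to MAX
# times, stop at the first non-zero exit status, and print how many runs
# succeeded before that.
copy_until_failure() {
    max=$1
    shift
    n=0
    while [ "$n" -lt "$max" ]; do
        # Stop as soon as one copy fails (e.g. the link went down).
        "$@" || break
        n=$((n + 1))
    done
    echo "$n"
}
```

It would be called with the scp command from step 4 as its arguments, for example "copy_until_failure 1000 scp -o ServerAliveInterval=5 -o ServerAliveCountMax=3 -i /my/scp/private.key ...". The two -o settings are standard OpenSSH client options, added here on the assumption that a stalled transfer should make scp exit instead of hanging forever once the NIC stops transmitting; the limit of 1000 iterations is arbitrary.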
Kernel log message (Oops):
(see full ref: https://launchpadlibrarian.net/204185480/dmesg)

WARNING: CPU: 0 PID: 1830 at net/sched/sch_generic.c:303 dev_watchdog+0xfc/0x185()
NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out
Modules linked in:
CPU: 0 PID: 1830 Comm: cat Not tainted 4.0.0-gbf70def #4
Hardware name: Hewlett-Packard HP EliteDesk 705 G1 MT/2215, BIOS L06 v02.15 10/22/2014
 00000000 00000000 f581df18 c06e5045 c0a7ec29 f581df30 c01319e9 c0668e77
 f4c30000 00000000 0005da10 f581df48 c0131a73 00000009 f581df40 c0a7ec29
 f581df5c f581df78 c0668e77 c0a7ec62 0000012f c0a7ec29 f4c30000 c0a60eba
Call Trace:
 [<c06e5045>] dump_stack+0x41/0x52
 [<c01319e9>] warn_slowpath_common+0x83/0x9a
 [<c0668e77>] ? dev_watchdog+0xfc/0x185
 [<c0131a73>] warn_slowpath_fmt+0x2b/0x2f
 [<c0668e77>] dev_watchdog+0xfc/0x185
 [<c0668d7b>] ? pfifo_fast_dequeue+0xaf/0xaf
 [<c0165221>] call_timer_fn+0x47/0xcd
 [<c01655d9>] run_timer_softirq+0x165/0x1c4
 [<c0668d7b>] ? pfifo_fast_dequeue+0xaf/0xaf
 [<c0133d84>] __do_softirq+0xbe/0x1ef
 [<c0133cc6>] ? _local_bh_enable+0x40/0x40
 [<c0103551>] do_softirq_own_stack+0x22/0x28
 <IRQ> [<c0134003>] irq_exit+0x39/0x47
 [<c0121b41>] smp_apic_timer_interrupt+0x38/0x42
 [<c06f1959>] apic_timer_interrupt+0x2d/0x34
 [<c06f0c20>] ? _raw_spin_unlock_irqrestore+0xd/0xf
 [<c0389fb5>] extract_buf+0x83/0xc7
 [<c038b68e>] extract_entropy_user+0xc2/0x11a
 [<c038b74e>] urandom_read+0x68/0xbf
 [<c038b6e6>] ? extract_entropy_user+0x11a/0x11a
 [<c01d4594>] __vfs_read+0x1b/0x47
 [<c01d462b>] vfs_read+0x6b/0xd3
 [<c01d46d7>] SyS_read+0x44/0x84
 [<c06f11c2>] syscall_call+0x7/0x7

System info and detailed description:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447664

I can help test proposed patches fairly quickly. So please let me know if you need anything.

Thank you.