Ok, I'll check it out. Thank you very much! By the way, we downloaded and tested one of the Deb packages you created, and it worked quite well. I will check exactly which one it was before reporting (almost sure it was the one for xenial).

We managed to reproduce the issue easily by booting into PXE and, after the NIC was started (trying to get an IP), resetting the machine and booting into Ubuntu. There is a huge difference between doing this and doing a cold boot directly into Ubuntu.

My hypothesis is that PXE sets up the NIC in a non-default way, by changing one or more config bits in some register. The unpatched tg3 driver does not touch these same bits. That would explain why a boot only works sometimes: the tg3 kernel module does not set these values back to their defaults, and they are set by the PXE code whenever it runs before Linux is loaded.

Anyway, after running PXE the unpatched kernel breaks very quickly, while the patched kernel you provided worked very well.

I will check your links soon and return with our results in the next few days, hopefully this weekend or next week.

Thank you,
Paulo

On Mar 20, 2018 14:16, "Kai-Heng Feng" <kai.heng.f...@canonical.com> wrote:

Guy,

Broadcom has a new patch [1] that needs testing. Here's the kernel [2] to try.

[1] https://lkml.org/lkml/2018/3/20/35
[2] https://people.canonical.com/~khfeng/lp1447664-20180320/

--
You received this bug notification because you are subscribed to the bug report.
https://bugs.launchpad.net/bugs/1447664

Title:
  14e4:1687 broadcom tg3 network driver disconnects under high load

Status in linux package in Ubuntu:
  Triaged
Status in linux package in Debian:
  New

Bug description:
  The tg3 broadcom network driver that binds with chipset 5762 goes offline and is unable to recover (even with the tg3 watchdog timeout) when network transmit is under high load.

  Call trace: https://launchpadlibrarian.net/204185480/dmesg

  When this happens, only a reboot fixes it. Sometimes, however, bringing the interface offline and online (via ifconfig) recovers networking.

  I've also tested with the latest tg3 driver (Dec 2014 version) and networking is still problematic. I have also disabled TSO, GSO etc...
  with ethtool and the bug still surfaces. This bug may be related to the integrated firmware.

  Here is the procedure to replicate the issue, because it is hard to replicate under moderate network load.

  1. Boot a machine with a Broadcom 5762 NIC (e.g. HP EliteDesk 705) using an Ubuntu/Kubuntu Live CD 14.04-15.04.
  2. From another machine: start 5 sessions and, in each session, repeatedly copy (scp with public key authentication) a 70 MB file back and forth to the tg3 machine. (Not sure if this is necessary.)
  3. Create a 1GB file on the tg3 machine, with something like
     dd if=/dev/urandom of=/my/test/file bs=1024 count=$((1024*1000))
  4. From another machine: repeatedly scp that 1GB file from the tg3 machine. This can be done with something like:
     while [ 0 ]; do
       scp -i /my/scp/private.key u...@ip.of.tg3:/my/test/file /tmp
     done;

  Networking will most likely go offline in about 10-30 minutes.

  WORKAROUND: Add a udev rule to make the changes permanent in /etc/udev/rules.d/80-tg3-fix.rules :
  ACTION=="add", SUBSYSTEM=="net", ATTRS{vendor}=="0x14e4", ATTRS{device}=="0x1687", RUN+="/sbin/ethtool -K %k highdma off"

  ProblemType: Bug
  DistroRelease: Ubuntu 15.04
  Package: linux-image-3.19.0-15-generic 3.19.0-15.15
  ProcVersionSignature: Ubuntu 3.19.0-15.15-generic 3.19.3
  Uname: Linux 3.19.0-15-generic x86_64
  ApportVersion: 2.17.2-0ubuntu1
  Architecture: amd64
  AudioDevicesInUse:
   USER PID ACCESS COMMAND
   /dev/snd/controlC1: kubuntu 3748 F.... pulseaudio
   /dev/snd/controlC0: kubuntu 3748 F.... pulseaudio
  CasperVersion: 1.360
  Date: Thu Apr 23 11:16:24 2015
  IwConfig:
   eth0 no wireless extensions.
   lo no wireless extensions.
  LiveMediaBuild: Kubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422)
  MachineType: Hewlett-Packard HP EliteDesk 705 G1 MT
  ProcEnviron:
   LANGUAGE=
   TERM=xterm
   PATH=(custom, no user)
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/casper/vmlinuz.efi file=/cdrom/preseed/hostname.seed boot=casper maybe-ubiquity quiet splash ---
  PulseList:
   Error: command ['pacmd', 'list'] failed with exit code 1: Home directory not accessible: Permission denied
   No PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-3.19.0-15-generic N/A
   linux-backports-modules-3.19.0-15-generic N/A
   linux-firmware 1.143
  RfKill:
  SourcePackage: linux
  UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 10/22/2014
  dmi.bios.vendor: Hewlett-Packard
  dmi.bios.version: L06 v02.15
  dmi.board.asset.tag: 2UA5041TG4
  dmi.board.name: 2215
  dmi.board.vendor: Hewlett-Packard
  dmi.chassis.asset.tag: 2UA5041TG4
  dmi.chassis.type: 6
  dmi.chassis.vendor: Hewlett-Packard
  dmi.modalias: dmi:bvnHewlett-Packard:bvrL06v02.15:bd10/22/2014:svnHewlett-Packard:pnHPEliteDesk705G1MT:pvr:rvnHewlett-Packard:rn2215:rvr:cvnHewlett-Packard:ct6:cvr:
  dmi.product.name: HP EliteDesk 705 G1 MT
  dmi.sys.vendor: Hewlett-Packard

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447664/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp