Public bug reported: The following 25Gb Broadcom NIC error was seen on Xenial running the 4.4.0-141-generic kernel on an amd64 host seeing moderate-heavy network traffic (just once):
* The bnxt_en_po driver froze on a "TX timed out" error and triggered the Netdev Watchdog timer under load. * From kernel log: "NETDEV WATCHDOG: eno2d1 (bnxt_en_bpo): transmit queue 0 timed out" See attached kern.log excerpt file for full excerpt of error log. * Release = Xenial Kernel = 4.4.0-141-generic #167 eno2d1 = Product Name: Broadcom Adv. Dual 25Gb Ethernet * This caused the driver to reset in order to recover: "bnxt_en_bpo 0000:19:00.1 eno2d1: TX timeout detected, starting reset task!" driver: bnxt_en_bpo version: 1.8.1 source: ubuntu/bnxt/bnxt.c: bnxt_tx_timeout() * The loss of connectivity and softirq stall caused other failures on the system. * The bnxt_en_po driver is the imported Broadcom driver pulled in to support newer Broadcom HW (specific boards) while the bnx_en module continues to support the older HW. The current Linux upstream driver does not compile easily with the 4.4 kernel (too many changes). * This upstream and bnxt_en driver fix is a likely solution: "bnxt_en: Fix TX timeout during netpoll" commit: 73f21c653f930f438d53eed29b5e4c65c8a0f906 This fix has not been applied to the bnxt_en_po driver version, but review of the code indicates that it is susceptible to the bug, and the fix would be reasonable. * No easy way to reproduce this ** Affects: linux (Ubuntu) Importance: High Status: New ** Tags: xenial -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1814095 Title: bnxt_en_po: TX timed out triggering Netdev Watchdog Timer To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1814095/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs