Thanks Przemyslaw, good explanation in the bug description! I'm working
on this one and will update the status here when there is news.
Cheers,

Guilherme

** Also affects: linux (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
       Status: Incomplete

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Ff-series)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Eoan)
       Status: Incomplete => Confirmed

** Changed in: linux (Ubuntu Ff-series)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Disco)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Cosmic)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Bionic)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
       Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Cosmic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Disco)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Eoan)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Ff-series)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Xenial)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Bionic)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Cosmic)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Ff-series)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Eoan)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Disco)
     Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Tags removed: bionic
** Tags added: bnx2x sts

** Description changed:

  For the customer OpenStack deployment we deploy infra nodes on Dell R630
  servers. The servers have onboard Broadcom's NetXtreme II BCM57800 NIC
  (quad port: 2x1G ports, 2x10G ports). For each port in UP state, we
  observe 100% CPU load. So in total, we observe 4 CPUs with 100% load.
  
  perf report shows the function bnx2x_ptp_task taking up much of the CPU
  time: https://pastebin.canonical.com/p/kfrpd6Pwh5/
  
  Also, /var/log/syslog contains the following outputs every few seconds:
  
- [1738143.581721] bnx2x: [bnx2x_start_xmit:3855(eno4)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped 
- [1738176.727642] bnx2x: [bnx2x_start_xmit:3855(eno1)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped 
- [1738207.988310] bnx2x: [bnx2x_start_xmit:3855(eno3)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped 
- [1738240.227333] bnx2x: [bnx2x_start_xmit:3855(eno2)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped 
+ [1738143.581721] bnx2x: [bnx2x_start_xmit:3855(eno4)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738176.727642] bnx2x: [bnx2x_start_xmit:3855(eno1)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738207.988310] bnx2x: [bnx2x_start_xmit:3855(eno3)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738240.227333] bnx2x: [bnx2x_start_xmit:3855(eno2)]The device supports only 
a single outstanding packet to timestamp, this packet will not be timestamped
  
  So, the problem seems to be in a timestamped TX packet; the driver,
  for some reason (yet to be understood), gets an unexpected value from a
  register and then, in that same function, reschedules itself to retry
  the register read; the read returns a bad value again, and so on
  indefinitely.
  
  This shows up in the system as kthreads using 100% CPU; the
  message "The device supports only a single outstanding packet to
  timestamp, this packet will not be timestamped" appears because the
  driver can only timestamp a single TX packet at a time, and since it is
  stuck retrying, it cannot accept another packet in this "queue".
  
  The infinite loop appears to be:
  
- static void bnx2x_ptp_task(struct work_struct *work) 
- { 
- struct bnx2x *bp = container_of(work, struct bnx2x, ptp_task); 
- int port = BP_PORT(bp); 
- u32 val_seq; 
- u64 timestamp, ns; 
- struct skb_shared_hwtstamps shhwtstamps; 
+ static void bnx2x_ptp_task(struct work_struct *work)
+ {
+ struct bnx2x *bp = container_of(work, struct bnx2x, ptp_task);
+ int port = BP_PORT(bp);
+ u32 val_seq;
+ u64 timestamp, ns;
+ struct skb_shared_hwtstamps shhwtstamps;
  
- /* Read Tx timestamp registers */ 
- val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID : 
- NIG_REG_P0_TLLH_PTP_BUF_SEQID); 
- if (val_seq & 0x10000) { 
- [...] 
- } else { 
- DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n"); 
- /* Reschedule to keep checking for a valid timestamp value */ 
- schedule_work(&bp->ptp_task); 
- } 
+ /* Read Tx timestamp registers */
+ val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID :
+ NIG_REG_P0_TLLH_PTP_BUF_SEQID);
+ if (val_seq & 0x10000) {
+ [...]
+ } else {
+ DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n");
+ /* Reschedule to keep checking for a valid timestamp value */
+ schedule_work(&bp->ptp_task);
+ }
  
  It appears that val_seq & 0x10000 is never true, so the task constantly
  reschedules itself immediately. Instrumenting the function shows that it
  is being called in excess of 100,000 times per second. The REG_RD call
  does appear to be expensive (as it's a register read from the device)
  and shows high in the perf report, but that by itself doesn't appear to
  be the root cause (i.e., it's not hanging forever in the REG_RD).
  
  The cause appears to be that the driver is not prepared to deal with the
  PTP request never being completed by the hardware. It's unclear why it
  isn't completing, but regardless, the driver should not loop forever
  here.
- 
- 
- Additional info: 
- 
- 
- ubuntu@infra-1:~$ uname -a 
- Linux infra-1 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019 
x86_64 x86_64 x86_64 GNU/Lin 
- 
- 
- ubuntu@infra-1:~$ lspci | grep Broadcom 
- 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II 
BCM57800 1/10 Gigabit Ethernet (rev 10) 
- 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II 
BCM57800 1/10 Gigabit Ethernet (rev 10) 
- 01:00.2 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II 
BCM57800 1/10 Gigabit Ethernet (rev 10) 
- 01:00.3 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II 
BCM57800 1/10 Gigabit Ethernet (rev 10) 
- 
- 
- ubuntu@infra-1:~$ lspci -n | grep 01:00 
- 01:00.0 0200: 14e4:168a (rev 10) 
- 01:00.1 0200: 14e4:168a (rev 10) 
- 01:00.2 0200: 14e4:168a (rev 10) 
- 01:00.3 0200: 14e4:168a (rev 10) 
- 
- 
- ubuntu@infra-1:~/deploy$ sudo lshw -c network 
- *-network:0 
- description: Ethernet interface 
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet 
- vendor: Broadcom Inc. and subsidiaries 
- physical id: 0 
- bus info: pci@0000:01:00.0 
- logical name: eno1 
- version: 10 
- serial: 42:39:92:e0:66:b6 
- size: 10Gbit/s 
- capacity: 10Gbit/s 
- width: 64 bits 
- clock: 33MHz 
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet 
physical tp 100bt 100bt-fd 1000bt-fd 10000bt-fd autonegotiation 
- configuration: autonegotiation=on broadcast=yes driver=bnx2x 
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 phy 1.45 
latency=0 link=yes multicast=yes port=twisted pair slave=yes speed=10Gbit/s 
- resources: irq:79 memory:95000000-957fffff memory:95800000-95ffffff 
memory:96030000-9603ffff memory:91a00000-91a7ffff 
- *-network:1 
- description: Ethernet interface 
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet 
- vendor: Broadcom Inc. and subsidiaries 
- physical id: 0.1 
- bus info: pci@0000:01:00.1 
- logical name: eno2 
- version: 10 
- serial: 42:39:92:e0:66:b6 
- size: 10Gbit/s 
- capacity: 10Gbit/s 
- width: 64 bits 
- clock: 33MHz 
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet 
physical tp 100bt 100bt-fd 1000bt-fd 10000bt-fd autonegotiation 
- configuration: autonegotiation=on broadcast=yes driver=bnx2x 
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 phy 1.45 
latency=0 link=yes multicast=yes port=twisted pair slave=yes speed=10Gbit/s 
- resources: irq:90 memory:94000000-947fffff memory:94800000-94ffffff 
memory:96020000-9602ffff memory:91a80000-91afffff 
- *-network:2 
- description: Ethernet interface 
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet 
- vendor: Broadcom Inc. and subsidiaries 
- physical id: 0.2 
- bus info: pci@0000:01:00.2 
- logical name: eno3 
- version: 10 
- serial: 52:f2:aa:63:a5:3c 
- size: 1Gbit/s 
- capacity: 1Gbit/s 
- width: 64 bits 
- clock: 33MHz 
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet 
physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation 
- configuration: autonegotiation=on broadcast=yes driver=bnx2x 
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 latency=0 
link=yes multicast=yes port=twisted pair slave=yes speed=1Gbit/s 
- resources: irq:90 memory:93000000-937fffff memory:93800000-93ffffff 
memory:96010000-9601ffff memory:91b00000-91b7ffff 
- *-network:3 
- description: Ethernet interface 
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet 
- vendor: Broadcom Inc. and subsidiaries 
- physical id: 0.3 
- bus info: pci@0000:01:00.3 
- logical name: eno4 
- version: 10 
- serial: 52:f2:aa:63:a5:3c 
- size: 1Gbit/s 
- capacity: 1Gbit/s 
- width: 64 bits 
- clock: 33MHz 
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet 
physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation 
- configuration: autonegotiation=on broadcast=yes driver=bnx2x 
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 latency=0 
link=yes multicast=yes port=twisted pair slave=yes speed=1Gbit/s 
- resources: irq:111 memory:92000000-927fffff memory:92800000-92ffffff 
memory:96000000-9600ffff memory:91b80000-91bfffff 
- *-network:0 
- description: Ethernet interface 
- physical id: 3 
- logical name: bond1.1166 
- serial: 42:39:92:e0:66:b6 
- capabilities: ethernet physical 
- configuration: autonegotiation=off broadcast=yes driver=802.1Q VLAN Support 
driverversion=1.8 duplex=full firmware=N/A link=yes multicast=yes 
- *-network:1 
- description: Ethernet interface 
- physical id: 4 
- logical name: bond1 
- serial: 42:39:92:e0:66:b6 
- capabilities: ethernet physical 
- configuration: autonegotiation=off broadcast=yes driver=bonding 
driverversion=3.7.1 duplex=full firmware=2 link=yes master=yes multicast=yes 
- *-network:2 
- description: Ethernet interface 
- physical id: 5 
- logical name: broam 
- serial: 36:76:ae:d3:1d:3b 
- capabilities: ethernet physical 
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A 
ip=10.246.65.10 link=yes multicast=yes 
- *-network:3 
- description: Ethernet interface 
- physical id: 6 
- logical name: brinternal 
- serial: ce:27:22:0d:8b:d1 
- capabilities: ethernet physical 
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A 
ip=10.246.66.10 link=yes multicast=yes 
- *-network:4 
- description: Ethernet interface 
- physical id: 7 
- logical name: bond1.1171 
- serial: 42:39:92:e0:66:b6 
- capabilities: ethernet physical 
- configuration: autonegotiation=off broadcast=yes driver=802.1Q VLAN Support 
driverversion=1.8 duplex=full firmware=N/A link=yes multicast=yes 
- *-network:5 
- description: Ethernet interface 
- physical id: 8 
- logical name: bond0 
- serial: 52:f2:aa:63:a5:3c 
- capabilities: ethernet physical 
- configuration: autonegotiation=off broadcast=yes driver=bonding 
driverversion=3.7.1 duplex=full firmware=2 link=yes master=yes multicast=yes 
- *-network:6 
- description: Ethernet interface 
- physical id: 9 
- logical name: brexternal 
- serial: 5e:e0:5c:1f:da:01 
- capabilities: ethernet physical 
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A 
ip=10.246.71.10 link=yes multicast=yes 
- 
- 
- ubuntu@infra-1:~$ modinfo bnx2x 
- filename: 
/lib/modules/4.15.0-50-generic/kernel/drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko
 
- firmware: bnx2x/bnx2x-e2-7.13.1.0.fw 
- firmware: bnx2x/bnx2x-e1h-7.13.1.0.fw 
- firmware: bnx2x/bnx2x-e1-7.13.1.0.fw 
- version: 1.712.30-0 
- license: GPL 
- description: QLogic 
BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF
 Driver 
- author: Eliezer Tamir 
- srcversion: 5338D57FE057310DCD66774 
- alias: pci:v000014E4d0000163Fsv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000163Esv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000163Dsv*sd*bc*sc*i* 
- alias: pci:v00001077d000016ADsv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016ADsv*sd*bc*sc*i* 
- alias: pci:v00001077d000016A4sv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016A4sv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016ABsv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016AFsv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016A2sv*sd*bc*sc*i* 
- alias: pci:v00001077d000016A1sv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016A1sv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000168Dsv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016AEsv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000168Esv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016A9sv*sd*bc*sc*i* 
- alias: pci:v000014E4d000016A5sv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000168Asv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000166Fsv*sd*bc*sc*i* 
- alias: pci:v000014E4d00001663sv*sd*bc*sc*i* 
- alias: pci:v000014E4d00001662sv*sd*bc*sc*i* 
- alias: pci:v000014E4d00001650sv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000164Fsv*sd*bc*sc*i* 
- alias: pci:v000014E4d0000164Esv*sd*bc*sc*i* 
- depends: mdio,libcrc32c,ptp 
- retpoline: Y 
- intree: Y 
- name: bnx2x 
- vermagic: 4.15.0-50-generic SMP mod_unload 
- signat: PKCS#7 
- signer: 
- sig_key: 
- sig_hashalgo: md4 
- parm: num_queues: Set number of queues (default is as a number of CPUs) (int) 
- parm: disable_tpa: Disable the TPA (LRO) feature (int) 
- parm: int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (int) 
- parm: dropless_fc: Pause on exhausted host ring (int) 
- parm: mrrs: Force Max Read Req Size (0..3) (for debug) (int) 
- parm: debug: Default debug msglevel (int)
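
The description notes that bnx2x_ptp_task reschedules itself for as long as
val_seq & 0x10000 is false, and that the driver should not loop forever. As a
rough userspace illustration of that idea (not the actual driver fix), the
sketch below bounds the number of register reads. Everything except the
0x10000 mask is an assumption made for the example: poll_tx_timestamp,
reg_read_fn, the retry budget, and the two fake register readers are
hypothetical names that do not exist in the bnx2x driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same "timestamp valid" mask that the quoted driver code tests. */
#define SEQID_VALID_MASK 0x10000u

/* Hypothetical stand-in for REG_RD(): the caller supplies the reader. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll the sequence-ID register at most max_polls times.
 * Returns true as soon as the valid bit is seen, false once the budget
 * is exhausted. *polls_used reports how many reads were issued.
 */
static bool poll_tx_timestamp(reg_read_fn read_reg, void *ctx,
                              int max_polls, int *polls_used)
{
    int i;

    for (i = 1; i <= max_polls; i++) {
        if (read_reg(ctx) & SEQID_VALID_MASK) {
            *polls_used = i;
            return true;
        }
        /* A real driver would sleep or back off here rather than spin. */
    }
    *polls_used = max_polls > 0 ? max_polls : 0;
    return false;
}

/* Fake register that never reports a valid timestamp (the reported case). */
static uint32_t stuck_register(void *ctx)
{
    (void)ctx;
    return 0;
}

/* Fake register that becomes valid after *ctx unsuccessful reads. */
static uint32_t ready_after_n(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return SEQID_VALID_MASK | 0x1u;
}

/* Exercises both cases: a stuck register and one that completes late. */
void run_examples(void)
{
    int polls = 0;
    int remaining = 3;

    /* Stuck hardware: the loop gives up after the budget of 10 reads. */
    assert(!poll_tx_timestamp(stuck_register, NULL, 10, &polls));
    assert(polls == 10);

    /* Three empty reads, then a valid one: success on the fourth read. */
    assert(poll_tx_timestamp(ready_after_n, &remaining, 10, &polls));
    assert(polls == 4);
}
```

On the give-up path a driver would presumably also have to release the
pending packet so that the single timestamp slot becomes free again; that
part is not modelled here.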

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1832082

Title:
  bnx2x driver causes 100% CPU load

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832082/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
