This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 1723482

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

** Tags added: xenial

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1723482

Title:
  qlcnic firmware hang detected kvm ganeti

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  1) Ubuntu release:

  Description:  Ubuntu 16.04.3 LTS
  Release:      16.04

  2) Package version:

  * linux-image-extra-4.4.0-96-generic (4.4.0-96.119)
  * Also with HWE kernel (4.10.x)

  3) What I expect:

  I have a 10G interface (HP NC523SFP 10Gb 2-port) in an HP ProLiant
  DL380p Gen8, BIOS P70 07/01/2015. The interface uses the qlcnic
  module and its two ports appear as ens2f0 and ens2f1. Both ports
  also have VLANs configured.

  I have installed Ganeti and configured bridges over those interfaces:
  br-dmz over ens2f0 and br-str over ens2f1.

  Everything should work without connectivity loss.

  4) What happened instead:

  The interface loses connectivity from time to time, although it
  recovers by itself, logging the following errors:

  Oct 12 18:23:14 mazinger kernel: [107906.678468] qlcnic 0000:07:00.1: Pause control frames disabled on all ports
  Oct 12 18:23:14 mazinger kernel: [107906.678470] qlcnic 0000:07:00.0: Pause control frames disabled on all ports
  Oct 12 18:23:14 mazinger kernel: [107906.678475] qlcnic 0000:07:00.0: firmware hang detected
  Oct 12 18:23:14 mazinger kernel: [107906.678482] qlcnic 0000:07:00.0: Dumping hw/fw registers
  Oct 12 18:23:14 mazinger kernel: [107906.678482] PEG_HALT_STATUS1: 0x40001502, PEG_HALT_STATUS2: 0x3e1f80,
  Oct 12 18:23:14 mazinger kernel: [107906.678482] PEG_NET_0_PC: 0x6d920, PEG_NET_1_PC: 0x6d976,
  Oct 12 18:23:14 mazinger kernel: [107906.678482] PEG_NET_2_PC: 0x149, PEG_NET_3_PC: 0x6edbe,
  Oct 12 18:23:14 mazinger kernel: [107906.678482] PEG_NET_4_PC: 0x1e2f3
  Oct 12 18:23:14 mazinger kernel: [107906.680107] qlcnic 0000:07:00.1: firmware hang detected
  Oct 12 18:23:14 mazinger kernel: [107906.680385] qlcnic 0000:07:00.1: Dumping hw/fw registers
  Oct 12 18:23:14 mazinger kernel: [107906.680385] PEG_HALT_STATUS1: 0x40001502, PEG_HALT_STATUS2: 0x3e1f80,
  Oct 12 18:23:14 mazinger kernel: [107906.680385] PEG_NET_0_PC: 0x6d920, PEG_NET_1_PC: 0x6d976,
  Oct 12 18:23:14 mazinger kernel: [107906.680385] PEG_NET_2_PC: 0x149, PEG_NET_3_PC: 0x6edbe,
  Oct 12 18:23:14 mazinger kernel: [107906.680385] PEG_NET_4_PC: 0x1e2f3
  Oct 12 18:23:14 mazinger kernel: [107906.695571] br-dmz: port 1(ens2f0.2) entered disabled state
  Oct 12 18:23:15 mazinger kernel: [107907.690629] br-str: port 1(ens2f1.10) entered disabled state
  Oct 12 18:23:16 mazinger kernel: [107908.706988] qlcnic 0000:07:00.1: Detected state change from DEV_NEED_RESET, skipping ack check
  Oct 12 18:23:17 mazinger kernel: [107909.423713] qlcnic 0000:07:00.0 ens2f0: Dump data 15044136 bytes captured, dump data address = ffffc900334c3000, template header size 36864 bytes, template address = ffffc900193da000
  Oct 12 18:23:21 mazinger kernel: [107912.800338] qlcnic 0000:07:00.0: loading firmware from flash
  Oct 12 18:23:27 mazinger kernel: [107919.137580] qlcnic 0000:07:00.0: Driver v5.3.63, firmware v4.20.1
  Oct 12 18:23:27 mazinger kernel: [107919.501555] qlcnic 0000:07:00.1: Driver v5.3.63, firmware v4.20.1
  Oct 12 18:23:28 mazinger kernel: [107920.425737] qlcnic 0000:07:00.0 ens2f0: Rx Context[0] Created, state 0x2
  Oct 12 18:23:28 mazinger kernel: [107920.435780] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x8000] Created, state 0x2
  Oct 12 18:23:28 mazinger kernel: [107920.453103] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x8008] Created, state 0x2
  Oct 12 18:23:29 mazinger kernel: [107921.598651] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x800a] Created, state 0x2
  Oct 12 18:23:29 mazinger kernel: [107921.615752] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x800c] Created, state 0x2
  Oct 12 18:23:30 mazinger kernel: [107922.196706] qlcnic 0000:07:00.1 ens2f1: Rx Context[1] Created, state 0x2
  Oct 12 18:23:30 mazinger kernel: [107922.406680] qlcnic 0000:07:00.1 ens2f1: Tx Context[0x8001] Created, state 0x2
  Oct 12 18:23:30 mazinger kernel: [107922.422646] qlcnic 0000:07:00.1 ens2f1: Tx Context[0x8009] Created, state 0x2
  Oct 12 18:23:30 mazinger kernel: [107922.439890] qlcnic 0000:07:00.1 ens2f1: Tx Context[0x800b] Created, state 0x2
  Oct 12 18:23:30 mazinger kernel: [107922.456417] qlcnic 0000:07:00.1 ens2f1: Tx Context[0x800d] Created, state 0x2
  Oct 12 18:23:31 mazinger kernel: [107923.500128] qlcnic 0000:07:00.0 ens2f0: NIC Link is up
  Oct 12 18:23:31 mazinger kernel: [107923.500360] br-dmz: port 1(ens2f0.2) entered forwarding state
  Oct 12 18:23:31 mazinger kernel: [107923.500375] br-dmz: port 1(ens2f0.2) entered forwarding state
  Oct 12 18:23:31 mazinger kernel: [107923.500680] qlcnic 0000:07:00.1 ens2f1: NIC Link is up
  Oct 12 18:23:31 mazinger kernel: [107923.500971] br-str: port 1(ens2f1.10) entered forwarding state
  Oct 12 18:23:31 mazinger kernel: [107923.500985] br-str: port 1(ens2f1.10) entered forwarding state
  ---------------
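
  To quantify "from time to time", the hang events can be counted per
  PCI function from the kernel log. The following is only a minimal
  sketch: it assumes each log entry sits on a single line (as in the
  log file itself, not as wrapped by a mail client), and the function
  name find_firmware_hangs is made up for illustration.

```python
import re
from collections import defaultdict

# Matches syslog-style kernel lines such as:
#   Oct 12 18:23:14 mazinger kernel: [107906.678475] qlcnic 0000:07:00.0: firmware hang detected
HANG_RE = re.compile(
    r"^\s*(?P<stamp>\w{3}\s+\d+\s+[\d:]+)\s+\S+\s+kernel:\s+\[\s*[\d.]+\]\s+"
    r"qlcnic\s+(?P<pci>[0-9a-f:.]+):\s+firmware hang detected"
)


def find_firmware_hangs(lines):
    """Return {pci_address: [timestamp, ...]} for every firmware hang line."""
    hangs = defaultdict(list)
    for line in lines:
        match = HANG_RE.match(line)
        if match:
            hangs[match.group("pci")].append(match.group("stamp"))
    return dict(hangs)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python3 hangs.py < kern.log
    for pci, stamps in sorted(find_firmware_hangs(sys.stdin).items()):
        print(pci, len(stamps), "hang(s), first at", stamps[0])
```

  Feeding it the complete log for a few days would show how often each
  port hangs and whether both ports always hang together, as they do in
  the excerpt above.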

  Sometimes the kernel also logs warnings and the server needs to be
  rebooted to recover connectivity:

  Oct  9 14:36:41 mazinger kernel: [262273.497512] ------------[ cut here ]------------
  Oct  9 14:36:41 mazinger kernel: [262273.497821] WARNING: CPU: 6 PID: 0 at /build/linux-z2ccW0/linux-4.4.0/net/sched/sch_generic.c:306 dev_watchdog+0x237/0x240()
  Oct  9 14:36:41 mazinger kernel: [262273.498083] NETDEV WATCHDOG: ens2f0 (qlcnic): transmit queue 0 timed out
  Oct  9 14:36:41 mazinger kernel: [262273.498579] Modules linked in: joydev binfmt_misc hpwdt ipmi_ssif bridge intel_rapl x86_pkg_temp_thermal input_leds intel_powerclamp serio_raw sb_edac edac_core lpc_ich 8250_fintek hpilo ioatdma shpchp ipmi_si ipmi_msghandler mac_hid kvm_intel kvm irqbypass ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp mrp stp llc coretemp drbd lru_cache autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper qlcnic hid_generic tg3 igb dca hpsa vxlan cryptd usbhid ptp psmouse ip6_udp_tunnel pata_acpi hid i2c_algo_bit scsi_transport_sas pps_core udp_tunnel wmi fjes
  Oct  9 14:36:41 mazinger kernel: [262273.498651] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.4.0-96-generic #119-Ubuntu
  Oct  9 14:36:41 mazinger kernel: [262273.498652] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
  Oct  9 14:36:41 mazinger kernel: [262273.498654]  0000000000000286 fc090740aa4761f7 ffff881fbf783d98 ffffffff813fabd3
  Oct  9 14:36:41 mazinger kernel: [262273.498666]  ffff881fbf783de0 ffffffff81d715f8 ffff881fbf783dd0 ffffffff810812e2
  Oct  9 14:36:41 mazinger kernel: [262273.498668]  0000000000000000 ffff881fade31b00 0000000000000006 ffff881fade30000
  Oct  9 14:36:41 mazinger kernel: [262273.498681] Call Trace:
  Oct  9 14:36:41 mazinger kernel: [262273.498683]  <IRQ>  [<ffffffff813fabd3>] dump_stack+0x63/0x90
  Oct  9 14:36:41 mazinger kernel: [262273.498691]  [<ffffffff810812e2>] warn_slowpath_common+0x82/0xc0
  Oct  9 14:36:41 mazinger kernel: [262273.498693]  [<ffffffff8108137c>] warn_slowpath_fmt+0x5c/0x80
  Oct  9 14:36:41 mazinger kernel: [262273.498697]  [<ffffffff8175eca7>] dev_watchdog+0x237/0x240
  Oct  9 14:36:41 mazinger kernel: [262273.498700]  [<ffffffff8175ea70>] ? qdisc_rcu_free+0x40/0x40
  Oct  9 14:36:41 mazinger kernel: [262273.498705]  [<ffffffff810ed035>] call_timer_fn+0x35/0x120
  Oct  9 14:36:41 mazinger kernel: [262273.498708]  [<ffffffff8175ea70>] ? qdisc_rcu_free+0x40/0x40
  Oct  9 14:36:41 mazinger kernel: [262273.498711]  [<ffffffff810ed9ea>] run_timer_softirq+0x23a/0x2f0
  Oct  9 14:36:41 mazinger kernel: [262273.498714]  [<ffffffff81085dc1>] __do_softirq+0x101/0x290
  Oct  9 14:36:41 mazinger kernel: [262273.498717]  [<ffffffff810860c3>] irq_exit+0xa3/0xb0
  Oct  9 14:36:41 mazinger kernel: [262273.498721]  [<ffffffff81845d22>] smp_apic_timer_interrupt+0x42/0x50
  Oct  9 14:36:41 mazinger kernel: [262273.498724]  [<ffffffff81843fe2>] apic_timer_interrupt+0x82/0x90
  Oct  9 14:36:41 mazinger kernel: [262273.498726]  <EOI>  [<ffffffff816d680e>] ? cpuidle_enter_state+0x10e/0x2b0
  Oct  9 14:36:41 mazinger kernel: [262273.498731]  [<ffffffff816d69e7>] cpuidle_enter+0x17/0x20
  Oct  9 14:36:41 mazinger kernel: [262273.498735]  [<ffffffff810c47c2>] call_cpuidle+0x32/0x60
  Oct  9 14:36:41 mazinger kernel: [262273.498737]  [<ffffffff816d69c3>] ? cpuidle_select+0x13/0x20
  Oct  9 14:36:41 mazinger kernel: [262273.498739]  [<ffffffff810c4a80>] cpu_startup_entry+0x290/0x350
  Oct  9 14:36:41 mazinger kernel: [262273.498743]  [<ffffffff810517b4>] start_secondary+0x154/0x190
  Oct  9 14:36:41 mazinger kernel: [262273.498749] ---[ end trace 6388d35f388918bc ]---
  Oct  9 14:36:41 mazinger kernel: [262273.498765] qlcnic 0000:07:00.0 ens2f0: rds_ring=0 crb_rcv_producer=3113 producer=3114 num_desc=4096
  Oct  9 14:36:41 mazinger kernel: [262273.498773] qlcnic 0000:07:00.0 ens2f0: rds_ring=1 crb_rcv_producer=1023 producer=0 num_desc=1024
  Oct  9 14:36:41 mazinger kernel: [262273.498781] qlcnic 0000:07:00.0 ens2f0: sds_ring=0 crb_sts_consumer=659 consumer=659 crb_intr_mask=0 num_desc=4096
  Oct  9 14:36:41 mazinger kernel: [262273.498788] qlcnic 0000:07:00.0 ens2f0: sds_ring=1 crb_sts_consumer=2894 consumer=2894 crb_intr_mask=0 num_desc=4096
  Oct  9 14:36:41 mazinger kernel: [262273.498792] qlcnic 0000:07:00.0 ens2f0: sds_ring=2 crb_sts_consumer=3092 consumer=3092 crb_intr_mask=0 num_desc=4096
  Oct  9 14:36:41 mazinger kernel: [262273.498796] qlcnic 0000:07:00.0 ens2f0: sds_ring=3 crb_sts_consumer=570 consumer=570 crb_intr_mask=0 num_desc=4096
  Oct  9 14:36:41 mazinger kernel: [262273.498798] qlcnic 0000:07:00.0 ens2f0: Tx ring=0 Context Id=0x8000
  Oct  9 14:36:41 mazinger kernel: [262273.498800] qlcnic 0000:07:00.0 ens2f0: xmit_finished=161917485, xmit_called=161920455, xmit_on=0, xmit_off=2
  Oct  9 14:36:41 mazinger kernel: [262273.498802] qlcnic 0000:07:00.0 ens2f0: crb_intr_mask=0
  Oct  9 14:36:41 mazinger kernel: [262273.498805] qlcnic 0000:07:00.0 ens2f0: hw_producer=481, sw_producer=481 sw_consumer=491, hw_consumer=491
  Oct  9 14:36:41 mazinger kernel: [262273.498807] qlcnic 0000:07:00.0 ens2f0: Total desc=1024, Available desc=10
  Oct  9 14:36:41 mazinger kernel: [262273.498809] qlcnic 0000:07:00.0 ens2f0: Tx ring=1 Context Id=0x8008
  Oct  9 14:36:41 mazinger kernel: [262273.498811] qlcnic 0000:07:00.0 ens2f0: xmit_finished=152057037, xmit_called=152059997, xmit_on=0, xmit_off=2
  Oct  9 14:36:41 mazinger kernel: [262273.498813] qlcnic 0000:07:00.0 ens2f0: crb_intr_mask=0
  Oct  9 14:36:41 mazinger kernel: [262273.498816] qlcnic 0000:07:00.0 ens2f0: hw_producer=81, sw_producer=81 sw_consumer=91, hw_consumer=91
  Oct  9 14:36:41 mazinger kernel: [262273.498818] qlcnic 0000:07:00.0 ens2f0: Total desc=1024, Available desc=10
  Oct  9 14:36:41 mazinger kernel: [262273.498819] qlcnic 0000:07:00.0 ens2f0: Tx ring=2 Context Id=0x800a
  Oct  9 14:36:41 mazinger kernel: [262273.498821] qlcnic 0000:07:00.0 ens2f0: xmit_finished=133645903, xmit_called=133648936, xmit_on=0, xmit_off=2
  Oct  9 14:36:41 mazinger kernel: [262273.498824] qlcnic 0000:07:00.0 ens2f0: crb_intr_mask=0
  Oct  9 14:36:41 mazinger kernel: [262273.498827] qlcnic 0000:07:00.0 ens2f0: hw_producer=572, sw_producer=572 sw_consumer=582, hw_consumer=582
  Oct  9 14:36:41 mazinger kernel: [262273.498828] qlcnic 0000:07:00.0 ens2f0: Total desc=1024, Available desc=10
  Oct  9 14:36:41 mazinger kernel: [262273.498830] qlcnic 0000:07:00.0 ens2f0: Tx ring=3 Context Id=0x800c
  Oct  9 14:36:41 mazinger kernel: [262273.498836] qlcnic 0000:07:00.0 ens2f0: xmit_finished=162932700, xmit_called=162935603, xmit_on=0, xmit_off=2
  Oct  9 14:36:41 mazinger kernel: [262273.498843] qlcnic 0000:07:00.0 ens2f0: crb_intr_mask=0
  Oct  9 14:36:41 mazinger kernel: [262273.498850] qlcnic 0000:07:00.0 ens2f0: hw_producer=568, sw_producer=568 sw_consumer=578, hw_consumer=578
  Oct  9 14:36:41 mazinger kernel: [262273.498857] qlcnic 0000:07:00.0 ens2f0: Total desc=1024, Available desc=10
  Oct  9 14:36:41 mazinger kernel: [262273.498863] qlcnic 0000:07:00.0 ens2f0: Tx timeout, reset adapter context.
  Oct  9 14:36:43 mazinger kernel: [262275.251864] qlcnic 0000:07:00.0: CDRP command failed: [7]
  Oct  9 14:36:43 mazinger kernel: [262275.252143] qlcnic 0000:07:00.0: Host MBX regs(2)
  Oct  9 14:36:43 mazinger kernel: [262275.252146] 00000039
  Oct  9 14:36:43 mazinger kernel: [262275.252148] 00050032 <6>[262275.252150]
  Oct  9 14:36:43 mazinger kernel: [262275.252153] qlcnic 0000:07:00.0: FW MBX regs(3)
  Oct  9 14:36:43 mazinger kernel: [262275.252155] 00000007
  Oct  9 14:36:43 mazinger kernel: [262275.252156] 00000000 00000000
  Oct  9 14:36:43 mazinger kernel: [262275.252158]
  Oct  9 14:36:43 mazinger kernel: [262275.252166] qlcnic 0000:07:00.0 ens2f0: Failed to Delete interrupts 7
  Oct  9 14:36:43 mazinger kernel: [262275.279376] br-dmz: port 1(ens2f0.2) entered disabled state
  Oct  9 14:36:43 mazinger kernel: [262275.447095] qlcnic 0000:07:00.0 ens2f0: Rx Context[0] Created, state 0x2
  Oct  9 14:36:43 mazinger kernel: [262275.493365] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x8000] Created, state 0x2
  Oct  9 14:36:43 mazinger kernel: [262275.509816] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x800e] Created, state 0x2
  Oct  9 14:36:43 mazinger kernel: [262275.527651] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x8010] Created, state 0x2
  Oct  9 14:36:43 mazinger kernel: [262275.543852] qlcnic 0000:07:00.0 ens2f0: Tx Context[0x8012] Created, state 0x2
  Oct  9 14:36:43 mazinger kernel: [262275.545966] qlcnic 0000:07:00.0 ens2f0: qlcnic_reset_hw_context: soft reset complete
  -----------
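
  The per-ring counters in the dump above can be compared
  mechanically. The sketch below pulls the key=value pairs out of one
  unwrapped log line and subtracts xmit_finished from xmit_called. It
  assumes the counters mean what their names suggest (transmit calls
  versus completed transmits); that reading is a guess, not something
  confirmed from the driver source, and the helper names are
  hypothetical.

```python
import re

# key=value pairs with decimal values, e.g. "xmit_called=161920455"
PAIR_RE = re.compile(r"(\w+)=(\d+)")


def parse_counters(line):
    """Return the decimal key=value pairs found in one log line as a dict."""
    return {key: int(value) for key, value in PAIR_RE.findall(line)}


def xmit_backlog(line):
    """xmit_called minus xmit_finished for one 'xmit_...' dump line."""
    counters = parse_counters(line)
    return counters["xmit_called"] - counters["xmit_finished"]


# Tx ring 0 line from the dump above, joined onto a single line.
ring0 = ("qlcnic 0000:07:00.0 ens2f0: xmit_finished=161917485, "
         "xmit_called=161920455, xmit_on=0, xmit_off=2")
print(xmit_backlog(ring0))  # prints 2970
```

  Under that assumption every Tx ring in the dump shows a difference of
  roughly 2900 to 3000, and each ring reports only 10 of 1024
  descriptors available at the time of the timeout.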

  What I have tried to fix it:

  - I have upgraded the interface firmware to the latest version
  provided by HP:

  # ethtool -i ens2f0
  driver: qlcnic
  version: 5.3.63
  firmware-version: 4.20.1
  expansion-rom-version: 
  bus-info: 0000:07:00.0
  supports-statistics: yes
  supports-test: yes
  supports-eeprom-access: yes
  supports-register-dump: yes
  supports-priv-flags: no

  - I have opened a case with HP. Following their recommendations I have
  upgraded the firmware of the server to the latest version. After
  capturing an AHS (Active Health System) log they have told me there
  isn't a hardware problem and it should be a software issue.

  - I have tried the HWE kernel (version 4.10.x), which comes with a newer
  version of the qlcnic module (5.3.65), but it didn't solve the problem.

  - After reading about some problems with TSO and virtual environments,
  I have disabled TSO/GSO and other offload settings on the interfaces:

  auto <iface>
  iface <iface> inet manual
      pre-up /sbin/ethtool --offload <iface> gso off tso off sg off gro off
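
  Whether the pre-up line actually took effect can be checked against
  the output of "ethtool -k <iface>". The sketch below is only an
  illustration: the function names are made up, and it assumes the
  long feature names that ethtool prints for the sg, tso, gso and gro
  keywords.

```python
import subprocess

# Long names printed by "ethtool -k" for: sg, tso, gso, gro
EXPECTED_OFF = (
    "scatter-gather",
    "tcp-segmentation-offload",
    "generic-segmentation-offload",
    "generic-receive-offload",
)


def offloads_still_on(ethtool_output):
    """Return the expected-off features that are not reported as 'off'.

    A feature missing from the output is also returned, since it could
    not be confirmed as disabled.
    """
    state = {}
    for line in ethtool_output.splitlines():
        name, sep, value = line.strip().partition(":")
        words = value.split()
        if sep and words:
            # "off [fixed]" and similar: only the first word matters
            state[name] = words[0]
    return [feature for feature in EXPECTED_OFF if state.get(feature) != "off"]


def check_interface(iface):
    """Run 'ethtool -k <iface>' and report offloads that are still enabled."""
    result = subprocess.run(
        ["ethtool", "-k", iface], capture_output=True, text=True, check=True
    )
    return offloads_still_on(result.stdout)
```

  For example, check_interface("ens2f0") returning an empty list would
  mean all four offloads are reported as off on that port.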

  
  I have found similar problems by searching online, but all of them
  were solved by applying one or more of these measures. The issue seems
  to be related to this kind of interface when it is used in virtual
  environments.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723482/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
