------- Comment From mauri...@br.ibm.com 2017-12-11 11:00 EDT -------
Closing/rejecting this bug. No testing available for now.
Please continue to bug Breno in case of problems :-)

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1710690

Title:
Ubuntu16.04.3: System running network stress crashes with Alignment exception

Status in The Ubuntu-power-systems project: Incomplete
Status in linux package in Ubuntu: Incomplete

Bug description:
==== State: Open by: nguyenp on 11 August 2017 11:03:32 ====

Contact:
=======
Paul Nguyen nguy...@us.ibm.com

BMC:
====
bos1u1
Firmware Revision : 00.25
Firmware Build Time : 20170807
BMC MAC address : 0c:c4:7a:f4:4d:60
PNOR Build Time : 20170729
CPLD Version : B2.91.00

Ubuntu 16.04.3:
===========
bos1u1p1
ver 1.5.4.5 - OS, HTX, Firmware and Machine details
OS: GNU/Linux
OS Version: Ubuntu 16.04.3 LTS \n \l
Kernel Version: 4.11.0-12-generic
HTX Version: htxubuntu-448
Host Name: bos1u1p1
Machine Serial No: C819UAF32B00002
Machine Type/Model: 9006-12C

root@bos1u1p1:~# dpkg -l |grep mlx
ii libmlx4-1 41mlnx1-OFED.4.1.0.1.0.41014 ppc64el Userspace driver for Mellanox ConnectX InfiniBand HCAs
ii libmlx4-1-dbg 41mlnx1-OFED.4.1.0.1.0.41014 ppc64el Debugging symbols for the libmlx4 driver
ii libmlx4-dev 41mlnx1-OFED.4.1.0.1.0.41014 ppc64el Development files for the libmlx4 driver
ii libmlx5-1 41mlnx1-OFED.4.1.0.1.3.0.1.41014 ppc64el Userspace driver for Mellanox ConnectX InfiniBand HCAs
ii libmlx5-1-dbg 41mlnx1-OFED.4.1.0.1.3.0.1.41014 ppc64el Debugging symbols for the libmlx5 driver
ii libmlx5-dev 41mlnx1-OFED.4.1.0.1.3.0.1.41014 ppc64el Development files for the libmlx5 driver

root@bos1u1p1:~# lsscsi
[0:2:0:0] disk SEAGATE ST4000NM0034 E005 /dev/sda
[0:3:123:0] enclosu ADAPTEC Smart Adapter 2.99 -

root@bos1u1p1:~# lspci
0000:00:00.0 PCI bridge: IBM Device 04c1
0001:00:00.0 PCI bridge: IBM Device 04c1
0002:00:00.0 PCI bridge: IBM Device 04c1
0002:01:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.2 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0002:01:00.3 Ethernet controller: Intel Corporation Ethernet Controller X710/X557-AT 10GBASE-T (rev 02)
0003:00:00.0 PCI bridge: IBM Device 04c1
0003:01:00.0 Serial Attached SCSI controller: Adaptec Series 8 12G SAS/PCIe 3 (rev 01)
0004:00:00.0 PCI bridge: IBM Device 04c1
0004:01:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 04)
0004:02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 41)
0005:00:00.0 PCI bridge: IBM Device 04c1
0005:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02)
0030:00:00.0 PCI bridge: IBM Device 04c1
0030:01:00.0 Infiniband controller: Mellanox Technologies Device 1019
0030:01:00.1 Infiniband controller: Mellanox Technologies Device 1019
0031:00:00.0 PCI bridge: IBM Device 04c1
0031:01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
0031:01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
0032:00:00.0 PCI bridge: IBM Device 04c1
0033:00:00.0 PCI bridge: IBM Device 04c1
0033:01:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
0033:01:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)

root@bos1u1p1:~# ifconfig -a
enP2p1s0f0 Link encap:Ethernet HWaddr ac:1f:6b:09:c0:9e
    inet addr:9.3.20.217 Bcast:9.3.21.255 Mask:255.255.254.0
    inet6 addr: fe80::ae1f:6bff:fe09:c09e/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:62603 errors:0 dropped:0 overruns:0 frame:0
    TX packets:105 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:4784741 (4.7 MB) TX bytes:14043 (14.0 KB)

enP2p1s0f1 Link encap:Ethernet HWaddr ac:1f:6b:09:c0:9f
    BROADCAST MULTICAST MTU:1500 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

enP2p1s0f2 Link encap:Ethernet HWaddr ac:1f:6b:09:c0:a0
    inet addr:108.1.1.217 Bcast:108.1.1.255 Mask:255.255.255.0
    inet6 addr: fe80::ae1f:6bff:fe09:c0a0/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:127350969 errors:0 dropped:65 overruns:0 frame:0
    TX packets:124182822 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:186712761859 (186.7 GB) TX bytes:181731375235 (181.7 GB)

enP2p1s0f3 Link encap:Ethernet HWaddr ac:1f:6b:09:c0:a1
    inet addr:108.1.2.217 Bcast:108.1.2.255 Mask:255.255.255.0
    inet6 addr: fe80::ae1f:6bff:fe09:c0a1/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:124182726 errors:0 dropped:0 overruns:0 frame:0
    TX packets:127351289 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:181731250053 (181.7 GB) TX bytes:186713217880 (186.7 GB)

enP49p1s0f0 Link encap:Ethernet HWaddr 0c:c4:7a:eb:17:ea
    inet addr:104.1.1.217 Bcast:104.1.1.255 Mask:255.255.255.0
    inet6 addr: fe80::ec4:7aff:feeb:17ea/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:126415178 errors:0 dropped:0 overruns:0 frame:0
    TX packets:124809946 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:185222061670 (185.2 GB) TX bytes:183034803740 (183.0 GB)

enP49p1s0f1 Link encap:Ethernet HWaddr 0c:c4:7a:eb:17:eb
    inet addr:104.1.2.217 Bcast:104.1.2.255 Mask:255.255.255.0
    inet6 addr: fe80::ec4:7aff:feeb:17eb/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:124809938 errors:0 dropped:0 overruns:0 frame:0
    TX packets:126415188 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:183034803062 (183.0 GB) TX bytes:185222062528 (185.2 GB)

enP51p1s0f0 Link encap:Ethernet HWaddr 0c:c4:7a:b4:28:6c
    inet addr:105.1.1.217 Bcast:105.1.1.255 Mask:255.255.255.0
    inet6 addr: fe80::ec4:7aff:feb4:286c/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:8067390 errors:0 dropped:0 overruns:0 frame:0
    TX packets:10234208 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:8355184401 (8.3 GB) TX bytes:13586437864 (13.5 GB)
    Memory:620c180800000-620c18081ffff

enP51p1s0f1 Link encap:Ethernet HWaddr 0c:c4:7a:b4:28:6d
    inet addr:102.1.2.217 Bcast:102.1.2.255 Mask:255.255.255.0
    inet6 addr: fe80::ec4:7aff:feb4:286d/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:10234110 errors:0 dropped:0 overruns:0 frame:0
    TX packets:8067404 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:13586422574 (13.5 GB) TX bytes:8355185459 (8.3 GB)
    Memory:620c180820000-620c18083ffff

ib0 Link encap:UNSPEC HWaddr 00-00-00-86-FE-80-00-00-00-00-00-00-00-00-00-00
    inet addr:103.1.1.217 Bcast:103.1.1.255 Mask:255.255.255.0
    inet6 addr: fe80::268a:703:a3:2a38/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
    RX packets:24 errors:0 dropped:0 overruns:0 frame:0
    TX packets:14 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:256
    RX bytes:2400 (2.4 KB) TX bytes:1160 (1.1 KB)

ib1 Link encap:UNSPEC HWaddr 00-00-08-46-FE-80-00-00-00-00-00-00-00-00-00-00
    inet addr:103.1.2.217 Bcast:103.1.2.255 Mask:255.255.255.0
    inet6 addr: fe80::268a:703:a3:2a39/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
    RX packets:24 errors:0 dropped:0 overruns:0 frame:0
    TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:256
    RX bytes:2400 (2.4 KB) TX bytes:1220 (1.2 KB)

lo Link encap:Local Loopback
    inet addr:127.0.0.1 Mask:255.0.0.0
    inet6 addr: ::1/128 Scope:Host
    UP LOOPBACK RUNNING MTU:65536 Metric:1
    RX packets:2173 errors:0 dropped:0 overruns:0 frame:0
    TX packets:2173 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:349296 (349.2 KB) TX bytes:349296 (349.2 KB)

root@bos1u1p1:~#

HTX Device Status Summary
Current Time: 222 08/10/17 15:35:12 Cycle Count(Min/Max)=0/183
System Start Time: 222 08/10/17 15:11:22 Page Number(Cur/Max)=1/2
--------------------------------------------------------------------------------
Last Update Cycle Curr. | Last Update Cycle Curr.
ST Device Day Time Count Stanza | ST Device Day Time Count Stanza
RN cache0 222 15:34:50 15 6 | RN cpu15 222 15:35:03 5 4
RN cache1 222 15:35:03 51 5 | RN enP2p1s0f222 15:35:11 183 1
RN cpu0 222 15:34:47 2 4 | RN enP2p1s0f222 15:35:11 173 1
RN cpu1 222 15:33:59 2 13 | RN enP49p1s0222 15:35:11 181 1
RN cpu2 222 15:34:05 2 8 | RN enP49p1s0222 15:35:10 177 1
RN cpu3 222 15:32:59 2 13 | RN enP51p1s0222 15:34:45 4 1
RN cpu4 222 15:34:44 2 10 | RN enP51p1s0222 15:34:46 10 1
RN cpu5 222 15:34:33 2 13 | RN fpu0 222 15:35:06 3 25
RN cpu6 222 15:35:01 2 6 | RN fpu1 222 15:34:58 4 5
RN cpu7 222 15:32:42 2 13 | RN fpu2 222 15:35:10 3 30
RN cpu8 222 15:35:08 3 10 | RN fpu3 222 15:34:59 4 19
RN cpu9 222 15:35:02 3 13 | RN fpu4 222 15:34:57 3 35
RN cpu10 222 15:35:03 5 2 | RN fpu5 222 15:35:07 4 11
RN cpu11 222 15:34:47 5 1 | RN fpu6 222 15:35:05 3 28
RN cpu12 222 15:34:53 5 2 | RN fpu7 222 15:35:09 4 19
RN cpu13 222 15:35:09 5 2 | RN fpu8 222 15:35:04 6 5
RN cpu14 222 15:35:10 5 3 | RN fpu9 222 15:35:09 6 17
RN fpu10 222 15:35:28 9 2 | RN sctu10 222 15:35:24 55 1
RN fpu11 222 15:35:29 8 41 | RN sctu11 222 15:35:28 62 4
RN fpu12 222 15:35:25 8 43 | RN sctu13 222 15:35:28 66 3
RN fpu13 222 15:35:26 8 43 | RN sctu14 222 15:35:31 59 2
RN fpu14 222 15:35:29 9 7 | RN sctu15 222 15:35:28 71 2
RN fpu15 222 15:35:28 9 10 | RN tlbie 222 15:34:48 26 2
RN mem 222 15:35:10 1 4 |
RN mlx5_0 222 15:35:22 6 1 |
RN mlx5_1 222 15:35:25 6 1 |
RN rng 222 15:35:21 196 1 |
RN sctu1 222 15:35:20 64 2 |
RN sctu2 222 15:35:25 50 4 |
RN sctu3 222 15:35:27 67 4 |
RN sctu5 222 15:35:29 70 3 |
RN sctu6 222 15:35:22 57 4 |
RN sctu7 222 15:35:23 68 4 |
RN sctu9 222 15:35:20 62 1 |

Problem Description:
====================
My Boston LC system is running firmware BMC 0.25 and PNOR 0807, with Ubuntu 16.04.3 installed. The system was running network stress when it crashed with an alignment exception. It is currently in the xmon debugger and is available for a developer to examine.

[ 1105.304668] Unable to handle kernel paging request for unaligned access at address 0xc00a000000000122
[ 1105.304850] Faulting instruction address: 0xc000000000a1e6b4
1e:mon>
1e:mon> e
cpu 0x1e: Vector: 600 (Alignment) at [c000000007f0b1c0]
pc: c000000000a1e6b4: skb_release_data+0xd4/0x1b0
lr: c000000000a1e82c: __kfree_skb+0x2c/0x50
sp: c000000007f0b440
msr: 9000000000009033
dar: c00a000000000122
dsisr: 8000000
current = 0xc000001a93574400
paca = 0xc000000007b90e00 softe: 0 irq_happened: 0x01on 4.11.0-12-generic (buildd@bos01-ppc64el-026) (gcc version 5.4.0 20160609 (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.4) ) #17~16.04.1-Ubuntu SMP Fri Jul 28 13:52:51 UTC 2017 (Ubuntu 4.11.0-12.17~16.04.1-generic 4.11.12)

1e:mon> t
[c000000007f0b480] c000000000a1e82c __kfree_skb+0x2c/0x50
[c000000007f0b4b0] c000000000abd77c tcp_clean_rtx_queue+0x2cc/0xd30
[c000000007f0b5d0] c000000000ac0f68 tcp_ack+0x5a8/0xa30
[c000000007f0b710] c000000000ac3004 tcp_rcv_established+0x1b4/0x830
[c000000007f0b790] c000000000ad05d4 tcp_v4_do_rcv+0x1b4/0x2f0
[c000000007f0b7d0] c000000000ad4168 tcp_v4_rcv+0xe18/0xe50
[c000000007f0b8c0] c000000000a9e7f0 ip_local_deliver_finish+0x170/0x350
[c000000007f0b910] c000000000a9f120 ip_local_deliver+0x60/0x130
[c000000007f0b970] c000000000a9ec28 ip_rcv_finish+0x258/0x510
[c000000007f0ba00] c000000000a9f550 ip_rcv+0x360/0x470
[c000000007f0ba70] c000000000a38e5c __netif_receive_skb_core+0x97c/0xdf0
[c000000007f0bb30] c000000000a3cbc8 netif_receive_skb_internal+0x38/0xd0
[c000000007f0bb70] c000000000a3db6c napi_gro_receive+0x11c/0x1d0
[c000000007f0bbb0] c008000010430c1c i40e_clean_rx_irq+0x74c/0xb00 [i40e]
[c000000007f0bca0] c00800001043134c i40e_napi_poll+0x37c/0x8f0 [i40e]
[c000000007f0bd50] c000000000a3d47c net_rx_action+0x39c/0x4a0
[c000000007f0be50] c000000000bcdc9c __do_softirq+0x19c/0x3fc
[c000000007f0bf40] c0000000000f3748 irq_exit+0xe8/0x120
[c000000007f0bf60] c000000000016b00 __do_irq+0x90/0x1d0
[c000000007f0bf90] c00000000002a5d0 call_do_irq+0x14/0x24
[c000001a935dfde0] c000000000016ce0 do_IRQ+0xa0/0x150
[c000001a935dfe30] c000000000009b94 h_virt_irq_common+0x114/0x120
--- Exception: ea1 at 00003fff920d8ce4
SP (3fff9184e5c0) is in userspace

1e:mon> r
R00 = c000000000a1e82c R16 = c000000007f0b678
R01 = c000000007f0b440 R17 = c000000007f0b660
R02 = c0000000014fdf00 R18 = 00000000000000694
R04 = c000001c86fad600 R20 = 0000000000000000
R05 = 00000000003417f0 R21 = 00000000001fffff
R06 = c000001d9701b2a0 R22 = c000001c7aae3d58
R07 = c000001d9701b000 R23 = ffffffffffffff92
R08 = 0000000000000280 R24 = 000000000001fe28
R09 = 000000000120c00a R25 = 00000000f8eb2d17
R10 = c00a000000000122 R26 = 0003126941e19a70
R11 = 0003126941e19a26 R27 = 0000000000000000
R12 = 0000000039059303 R28 = 0003126941e199ed
R13 = c000000007b90e00 R29 = c000001c86fad600
R14 = c000001c7aae3c00 R30 = c000001d9701b280
R15 = c000001c86fad600 R31 = 0000000000000000
pc = c000000000a1e6b4 skb_release_data+0xd4/0x1b0
cfar= c000000000a1e670 skb_release_data+0x90/0x1b0
lr = c000000000a1e82c __kfree_skb+0x2c/0x50
msr = 9000000000009033 cr = 39059305
ctr = c000000000a97200 xer = 00000000a0000000 trap = 600
dar = c00a000000000122 dsisr = 08000000

1e:mon> S
msr = 9000000000001033 sprg0 = 000000000000091c
pvr = 00000000004e0100 sprg1 = 0000000000000000
dec = ffff6e56be3f98df sprg2 = 0000000000000000
sp = c000000007f0ac00 sprg3 = 000000000000001e
toc = c0000000014fdf00 dar = c00a000000000122
srr0 = 0000000000091484 srr1 = 0000000000001033 dsisr = 08000000
dscr = 0000000000000000 ppr = 0000000000000000 pir = 0000005e
sdr1 = 0000000000000000 hdar = 0000000000000000 hdsisr = 00000000
hsrr0 = 00000000300050b0 hsrr1 = 0000000000001002 hdec = 7600c75f
lpcr = 0000000001d2f012 pcr = 0000000000000000 lpidr = 00000000
hsprg0 = 0000000007b90e00 hsprg1 = 0000000007b90e00
dabr = 0000000000000000 dabrx = 0000000000091484
dpdes = 0000000000000000 tir = 0000000000000002 cir = 00000000
fscr = 0000000000000180 tar = 0000000000000000 pspb = 00000000
mmcr0 = 0000000000000000 mmcr1 = 0000000000000000 mmcr2 = 0000000000000000
pmc1 = 00000000 pmc2 = 00000000 pmc3 = 00000000 pmc4 = 00000000
mmcra = 0000000000000000 siar = 0000000000000000 pmc5 = 80000001
sdar = 0000000000000000 sier = 0000000000000000 pmc6 = 8000000b
ebbhr = 0000000000000000 ebbrr = 0000000000000000 bescr = 0000000000000000
hfscr = 000000000000059f dhdes = 0000000000091484 rpr = 0000000000000000
dawr = 0000000000000000 dawrx = 000000000000fc00 ciabr = 0000000000000000
1e:mon>

== Comment: #7 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-08-14 12:50:36 ==
1e:mon> di %pc
c000000000a1e6b4 7d205028 lwarx r9,0,r10
c000000000a1e6b8 3129ffff addic r9,r9,-1
c000000000a1e6bc 7d20512d stwcx. r9,0,r10
c000000000a1e6c0 40c2fff4 bne- c000000000a1e6b4 # skb_release_data+0xd4/0x1b0
c000000000a1e6c4 7d2907b4 extsw r9,r9
c000000000a1e6c8 7c0004ac hwsync
c000000000a1e6cc 2fa90000 cmpdi cr7,r9,0
c000000000a1e6d0 409effb0 bne cr7,c000000000a1e680 # skb_release_data+0xa0/0x1b0
c000000000a1e6d4 4b88b495 bl c0000000002a9b68 # __put_page+0x8/0x80
c000000000a1e6d8 60000000 nop
c000000000a1e6dc 893e0000 lbz r9,0(r30)
c000000000a1e6e0 395f0001 addi r10,r31,1
c000000000a1e6e4 7d5f07b4 extsw r31,r10
c000000000a1e6e8 7f895000 cmpw cr7,r9,r10
c000000000a1e6ec 419dffa8 bgt cr7,c000000000a1e694 # skb_release_data+0xb4/0x1b0
c000000000a1e6f0 893e0001 lbz r9,1(r30)
1e:mon>

>> c000000000a1e6b4 7d205028 lwarx r9,0,r10 <---

R10 = c00a000000000122

R10 supplies the address for the lwarx load, but it does not contain a word-aligned address, so an Alignment interrupt is triggered.

== Comment: #11 - VIPIN K. PARASHAR <vipar...@in.ibm.com> - 2017-08-14 13:12:54 ==
(In reply to comment #5)
> Before I did apt-get update and upgrade on this system, I was running with
> this level:
>
> 4.10.0-22-generic #24~16.04.1-Ubuntu SMP Mon May 22 22:11:01 UTC 2017
> ppc64le ppc64le ppc64le GNU/Linux
>
> With the previous level above, I did not see the problem.

The crash is seen with kernel 4.11.0-12-generic, in the TCP/IP stack code.
It is likely due to code changes introduced after 4.10.0-22-generic.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1710690/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp
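
For anyone wanting to double-check the alignment analysis in comment #7, the arithmetic can be reproduced with a few lines of Python. This is only a sketch: the address is the DAR/R10 value from the xmon dump, and the helper name `is_aligned` is made up for illustration (it is not part of xmon or any kernel tooling).

```python
# Fault address reported by xmon in DAR, which is also the value held in R10
# when the lwarx at skb_release_data+0xd4 executed.
DAR = 0xC00A000000000122

# lwarx is a word (4-byte) load-and-reserve, so its address must be a
# multiple of 4.
WORD = 4


def is_aligned(addr: int, size: int) -> bool:
    """Return True if addr is a multiple of size (size must be a power of two)."""
    return addr & (size - 1) == 0


print(hex(DAR), "word-aligned:", is_aligned(DAR, WORD))
print("offset within word:", DAR % WORD)
```

The low two bits of the address are 0b10, so it is halfword-aligned but sits 2 bytes past a word boundary, which matches the Alignment (0x600) vector shown by xmon.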