------- Comment From ukri...@us.ibm.com 2017-07-27 18:33 EDT-------
I just opened another BZ, 157097, for the same issue. I was referred to this bug, and I see that it addresses the same issue I was debugging. However, we also need the upstream commit be5c5e843c4afa1c8397cb740b6032bd4142f32d pulled into the Xenial 16.04.3 HWE v4.10 kernel.
Bad commit 2337d207288f163e10bd8d4d7eeb0c1c75046a0c is included in the 16.04.3 HWE v4.10 kernel, so we need the fixing upstream commit in Xenial (16.04.3) as well, if possible. I know we are cutting it close to the 16.04.3 release date, but this is a regression, so it would be good to have the fixing commit.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1684054

Title: [LTCTest][Opal][FW860.20] HMI recoverable errors failed to recover and system goes to dump state.

Status in The Ubuntu-power-systems project: In Progress
Status in linux package in Ubuntu: Fix Released
Status in linux source package in Zesty: New

Bug description:

== Comment: #0 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-17 06:08:41 ==
---Problem Description---
HMI recoverable error injection tests lead to a system checkstop followed by a system dump with the Ubuntu 17.04 OS and kernel 4.10.0-19-generic ppc64le.

Contact Information = ppaid...@in.ibm.com

---uname output---
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = PowerNV 8284-22A

---System Hang---
The system is in the dumping state. After the dump finishes, the system will IPL to the OS again.

---Debugger---
A debugger is not configured

== Comment: #3 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-17 06:12:51 ==
# uname -a
#21-Ubuntu SMP Thu Apr 6 17:03:05 UTC 2017 ppc64le ppc64le ppc64le GNU/Linux
# cat /etc/os-release
NAME="Ubuntu"
VERSION="17.04 (Zesty Zapus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 17.04"
VERSION_ID="17.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=zesty
UBUNTU_CODENAME=zesty
root@p8wookie:~#

== Comment: #4 - Kevin W. Rudd <ru...@us.ibm.com> - 2017-04-17 11:10:22 ==

== Comment: #5 - MAHESH J.
SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-17 13:34:03 ==
It looks like the commit below is the culprit:

=======================================
commit 2337d207288f163e10bd8d4d7eeb0c1c75046a0c
Author: Nicholas Piggin <npig...@gmail.com>
Date:   Fri Jan 27 14:24:33 2017 +1000

    powerpc/64: CONFIG_RELOCATABLE support for hmi interrupts

    The branch from hmi_exception_early to hmi_exception_realmode must use
    a "relocatable-style" branch, because it is branching from unrelocated
    exception code to beyond __end_interrupts.

    Signed-off-by: Nicholas Piggin <npig...@gmail.com>
    Signed-off-by: Michael Ellerman <m...@ellerman.id.au>
=======================================

With the above commit, hmi_exception_realmode() is now called using bctrl, which ends up corrupting the TOC (r2) value, and any further access through the new r2 results in unpredictable behaviour.

----------------------------------------
c000000000025f50 <hmi_exception_realmode>:
c000000000025f50:       3a 01 4c 3c     addis   r2,r12,314
c000000000025f54:       b0 01 42 38     addi    r2,r2,432
c000000000025f58:       a6 02 08 7c     mflr    r0
-----------------------------------------

With the above commit, the hmi_exception_early() code jumps to c000000000025f50 (hmi_exception_realmode+0x0), which then sets up a new value for r2. If we revert the above commit, the code jumps to c000000000025f58 (hmi_exception_realmode+0x8) and the HMI handler works fine.

After reverting the above patch I don't see this issue anymore. I have rebuilt the Ubuntu kernel after reverting the above patch, and you can find the kernel rpm at:

Can you please retry your tests with the above kernel and see if the issue still persists?

== Comment: #6 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-17 23:02:31 ==
Spoke to Michael Ellerman this morning.
He helped me identify the root cause, and the fix patch is below:

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 857bf7c5b946..7cfeb8768587 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -982,7 +982,7 @@ TRAMP_REAL_BEGIN(hmi_exception_early)
 	EXCEPTION_PROLOG_COMMON_2(PACA_EXGEN)
 	EXCEPTION_PROLOG_COMMON_3(0xe60)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
-	BRANCH_LINK_TO_FAR(r4, hmi_exception_realmode)
+	BRANCH_LINK_TO_FAR(r12, hmi_exception_realmode)
 	/* Windup the stack. */
 	/* Move original HSRR0 and HSRR1 into the respective regs */
 	ld	r9,_MSR(r1)

== Comment: #7 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 01:52:03 ==

== Comment: #8 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 01:53:57 ==
Hi Mahesh

Tested all the HMI recoverable errors on the patched kernel below and attached the corresponding execution logs. All tests are working fine.

#21 SMP Mon Apr 17 12:58:30 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux

Thanks

== Comment: #9 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-18 06:07:56 ==
(In reply to comment #8)
> Hi Mahesh
> Tested all the HMI recoverable errors on the patched kernel below and attached
> the corresponding execution logs. All tests are working fine.
>
> Linux p8wookie 4.10.0-19.bz153487-generic #21 SMP Mon Apr 17 12:58:30 EDT
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
>
> Thanks

Thanks. Michael has posted a fix for this upstream.
http://patchwork.ozlabs.org/patch/751647/

I will rebuild the new Ubuntu kernel with the above patch.
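
For readers following the r4 to r12 change: under the ELFv2 ABI a function entered at its global entry point derives its TOC pointer (r2) from r12, which is expected to hold the function's own entry address. The small Python sketch below is only an illustration of that arithmetic, not kernel code. It decodes the two instruction words from the disassembly in comment #5 and shows what r2 becomes for a given r12. The helper names and the stale r12 value of 0 are hypothetical, chosen purely for the example.

```python
# Illustrative sketch only: decode the two TOC-setup instructions shown in
# comment #5 and compute the r2 they produce for a given r12.

ADDI, ADDIS = 14, 15          # Power ISA primary opcodes for addi / addis
MASK64 = (1 << 64) - 1


def decode_d_form(le_bytes):
    """Decode a little-endian D-form instruction into (opcode, rt, ra, si)."""
    word = int.from_bytes(le_bytes, "little")
    opcode = word >> 26
    rt = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    si = word & 0xFFFF
    if si & 0x8000:           # sign-extend the 16-bit immediate
        si -= 0x10000
    return opcode, rt, ra, si


def toc_after_prologue(r12):
    """Run 'addis r2,r12,314; addi r2,r2,432' and return the resulting r2."""
    regs = {12: r12}
    for raw in (bytes.fromhex("3a014c3c"), bytes.fromhex("b0014238")):
        opcode, rt, ra, si = decode_d_form(raw)
        if opcode not in (ADDI, ADDIS):
            raise ValueError("unexpected opcode %d" % opcode)
        if opcode == ADDIS:
            si <<= 16         # addis adds the immediate shifted left by 16
        # RA is never 0 in these two words, so it always names a register.
        regs[rt] = (regs[ra] + si) & MASK64
    return regs[2]


ENTRY = 0xC000000000025F50    # hmi_exception_realmode+0x0 from the disassembly
STALE_R12 = 0                 # hypothetical leftover value, for illustration

print(hex(toc_after_prologue(ENTRY)))      # r12 = entry address
print(hex(toc_after_prologue(STALE_R12)))  # r12 = something else
```

With r12 equal to the entry address the two instructions yield a sensible TOC base; with any other r12 the result is simply that value plus a constant offset, which matches the "messed up r2" seen in comment #5. Passing r12 as the scratch register to BRANCH_LINK_TO_FAR means the target address sits in r12 when bctrl is executed.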
== Comment: #12 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 09:27:59 ==
(In reply to comment #11)
> >
> > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
>
> I have built a new kernel with the above patch and you can find it at the path below:
>
>:/home2/mahesh/u2/bz153487v2/linux-image-4.10.0-19.bz153487v2-generic_4.10.0-19.bz153487v2.21_ppc64el.deb

Tested with this new patched kernel; all tests are working fine.

Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18 07:43:13 EDT 2017 ppc64le ppc64le ppc64le GNU/Linux

I will attach the full execution logs here.

== Comment: #13 - Pridhiviraj Paidipeddi <ppaid...@in.ibm.com> - 2017-04-18 09:29:43 ==

== Comment: #14 - MAHESH J. SALGAONKAR <mahesh.salgaon...@in.ibm.com> - 2017-04-19 03:52:18 ==
(In reply to comment #12)
> (In reply to comment #11)
> > >
> > > https://git.kernel.org/powerpc/c/be5c5e843c4afa1c8397cb740b6032
> >

Thanks for testing. We need to mirror this to Ubuntu so that the fix patch can be included.

>
> Linux p8wookie 4.10.0-19.bz153487v2-generic #21 SMP Tue Apr 18 07:43:13 EDT
> 2017 ppc64le ppc64le ppc64le GNU/Linux
>
> I will attach the full execution logs here.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1684054/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp