** Tags removed: targetmilestone-inin---
** Tags added: targetmilestone-inin14044
-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1537666

Title:
  ISST-LTE: Ubuntu 14.04.4 LPAR interrupts at check_and_cede_processor

Status in linux package in Ubuntu:
  Triaged

Bug description:

== Comment: #0 - YUECHANG E. MEI <ye...@us.ibm.com> - 2015-12-11 17:19:07 ==

---Problem Description---
We have an Ubuntu 14.04.4 LPAR, conelp2. It is running the stress tests base, io, and tcp. When checking "dmesg", we see this interruption:

[Fri Dec 11 13:58:50 2015] --- interrupt: 501 at plpar_hcall_norets+0x1c/0x28
[Fri Dec 11 13:58:50 2015] LR = check_and_cede_processor+0x34/0x50

In the previous test, conelp2 stopped all the stress tests by itself because it ran out of memory. Is the out-of-memory issue related to the interruption?

Contact Information = Yuechang (Erin) Mei /ye...@us.ibm.com, Raja Sunkari /rajas...@in.ibm.com

---uname output---
Linux conelp2 4.2.0-21-generic #25~14.04.1-Ubuntu SMP Thu Dec 3 13:55:42 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = EUH Alpine 8408-E8E

---Debugger---
A debugger is not configured

---Steps to Reproduce---
1. Install Ubuntu 14.04.4 in an LPAR, then update to the latest 14.04.4 kernel using this workaround:
   echo "deb http://software.linux.ibm.com/pub/ubuntu-ppc64el-repository/ trusty-proposed main restricted universe multiverse" >> /etc/apt/sources.list
   apt-get update
   apt-get install linux-image-generic-lts-wily
2. Set up the stress test, and start base, io, and tcp.
3. After an hour, check dmesg; you will see the message about the interruption.

Stack trace output: no
Oops output: no
System Dump Info: The system is not configured to capture a system dump.

*Additional Instructions for Yuechang (Erin) Mei /ye...@us.ibm.com, Raja Sunkari /rajas...@in.ibm.com:
-Post a private note with access information to the machine that the bug is occurring on.
-Attach sysctl -a output to the bug.

== Comment: #1 - YUECHANG E. MEI <ye...@us.ibm.com> - 2015-12-11 17:23:00 ==

== Comment: #3 - YUECHANG E. MEI <ye...@us.ibm.com> - 2015-12-14 15:23:33 ==

== Comment: #4 - MAMATHA INAMDAR <mainam...@in.ibm.com> - 2015-12-15 03:56:14 ==

dmesg shows a page allocation failure:

[Fri Dec 11 13:45:38 2015] swapper/127: page allocation failure: order:0, mode:0x120
[Fri Dec 11 13:45:38 2015] CPU: 127 PID: 0 Comm: swapper/127 Not tainted 4.2.0-21-generic #25~14.04.1-Ubuntu
[Fri Dec 11 13:45:38 2015] Call Trace:
[Fri Dec 11 13:45:38 2015] [c00000027fbc3890] [c000000000a805ec] dump_stack+0x90/0xbc (unreliable)
[Fri Dec 11 13:45:38 2015] [c00000027fbc38c0] [c00000000021c118] warn_alloc_failed+0x118/0x160
[Fri Dec 11 13:45:38 2015] [c00000027fbc3960] [c000000000221114] __alloc_pages_nodemask+0x834/0xa60
[Fri Dec 11 13:45:38 2015] [c00000027fbc3b10] [c000000000221404] __alloc_page_frag+0xc4/0x190
[Fri Dec 11 13:45:38 2015] [c00000027fbc3b50] [c0000000008f6d20] netdev_alloc_frag+0x50/0x80
[Fri Dec 11 13:45:38 2015] [c00000027fbc3b80] [c000000000764e80] tg3_alloc_rx_data+0xa0/0x2c0
[Fri Dec 11 13:45:38 2015] [c00000027fbc3be0] [c000000000767344] tg3_poll_work+0x484/0x1070
[Fri Dec 11 13:45:38 2015] [c00000027fbc3ce0] [c000000000767f8c] tg3_poll_msix+0x5c/0x210
[Fri Dec 11 13:45:38 2015] [c00000027fbc3d30] [c00000000090ebb8] net_rx_action+0x2d8/0x430
[Fri Dec 11 13:45:38 2015] [c00000027fbc3e40] [c0000000000ba124] __do_softirq+0x174/0x390
[Fri Dec 11 13:45:38 2015] [c00000027fbc3f40] [c0000000000ba6c8] irq_exit+0xc8/0x100
[Fri Dec 11 13:45:38 2015] [c00000027fbc3f60] [c0000000000111ec] __do_irq+0x8c/0x190
[Fri Dec 11 13:45:38 2015] [c00000027fbc3f90] [c000000000024278] call_do_irq+0x14/0x24
[Fri Dec 11 13:45:38 2015] [c0000002763a39b0] [c000000000011390] do_IRQ+0xa0/0x120
[Fri Dec 11 13:45:38 2015] [c0000002763a3a10] [c0000000000099b0] restore_check_irq_replay+0x2c/0x70
[Fri Dec 11 13:45:38 2015] --- interrupt: 501 at plpar_hcall_norets+0x1c/0x28
[Fri Dec 11 13:45:38 2015] LR = check_and_cede_processor+0x34/0x50
[Fri Dec 11 13:45:38 2015] [c0000002763a3d00] [c0000000008a8d90] check_and_cede_processor+0x20/0x50 (unreliable)
[Fri Dec 11 13:45:38 2015] [c0000002763a3d60] [c0000000008a8fb8] shared_cede_loop+0x68/0x170
[Fri Dec 11 13:45:38 2015] [c0000002763a3da0] [c0000000008a615c] cpuidle_enter_state+0xbc/0x350
[Fri Dec 11 13:45:38 2015] [c0000002763a3e00] [c000000000110f3c] call_cpuidle+0x7c/0xd0
[Fri Dec 11 13:45:38 2015] [c0000002763a3e40] [c0000000001112d0] cpu_startup_entry+0x340/0x450
[Fri Dec 11 13:45:38 2015] [c0000002763a3f10] [c000000000044ab4] start_secondary+0x364/0x3a0
[Fri Dec 11 13:45:38 2015] [c0000002763a3f90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
[Fri Dec 11 13:45:38 2015] Mem-Info:
[Fri Dec 11 13:45:38 2015] active_anon:714 inactive_anon:2255 isolated_anon:0

== Comment: #5 - Luciano Chavez <cha...@us.ibm.com> - 2016-01-04 14:28:59 ==

Hi Yuechang,

Atomic page allocation failure warnings originating from network stack allocation requests are common under stress conditions. The order 0x0 page allocation failures are probably the easiest to tune for, assuming there isn't a leak. I suggest you start with a minimum free pool reservation of at least 64MB and see if that helps eliminate that particular warning.

First check that the current value is lower than that:

  cat /proc/sys/vm/min_free_kbytes

and then set it with:

  echo 65536 > /proc/sys/vm/min_free_kbytes

If the existing value is already higher than 64MB, then pick a larger value.

If this helps, update the /etc/sysctl.conf file to make the setting persistent between boots, with an entry of

  vm.min_free_kbytes = 65536

or whichever value helped most.
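
The check-then-raise advice in comment #5 could be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name next_min_free_kbytes is hypothetical, and doubling the value when it is already at or above 64MB is an arbitrary choice made here, since the comment only says to pick a larger value. The script reads /proc/sys/vm/min_free_kbytes and prints a candidate value; it changes nothing on the system.

```shell
#!/bin/sh
# Hypothetical helper, not part of the bug report: suggest the next
# vm.min_free_kbytes value to try, following the advice in comment #5.
# $1 = current value in kB, $2 = suggested floor in kB (default 65536 = 64MB).
next_min_free_kbytes() {
    current=$1
    floor=${2:-65536}
    if [ "$current" -lt "$floor" ]; then
        # Below the suggested floor: try the floor itself.
        echo "$floor"
    else
        # Already at or above the floor: "pick a larger value".
        # Doubling is an assumption made for this sketch only.
        echo $((current * 2))
    fi
}

# Read-only usage: show the current value and the candidate to try.
if [ -r /proc/sys/vm/min_free_kbytes ]; then
    current=$(cat /proc/sys/vm/min_free_kbytes)
    echo "current=$current candidate=$(next_min_free_kbytes "$current")"
fi
```

For an existing value of 180224 (the value later reported on conelp2), this sketch would print a candidate of 360448. Applying the candidate still requires root and the echo and /etc/sysctl.conf steps described in comment #5.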

== Comment: #6 - Jonathan Dalton <jodal...@us.ibm.com> - 2016-01-06 15:34:18 ==

root@conelp2:~#
root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
180224
root@conelp2:~# echo 365536 > /proc/sys/vm/min_free_kbytes
root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
365536
root@conelp2:~#

== Comment: #7 - Jonathan Dalton <jodal...@us.ibm.com> - 2016-01-07 11:41:51 ==

root@conelp2:~#
root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
180224
root@conelp2:~# echo 365536 > /proc/sys/vm/min_free_kbytes
root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
365536
root@conelp2:~#

== Comment: #8 - Raja Shekhar Reddy Sunkari <rajas...@in.ibm.com> - 2016-01-11 02:30:19 ==

Hi Luciano,

I have run the stress test on conelp2 after updating the value to:

root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
365536

Tests ran successfully for 72hrs without any interruption. However, the dmesg output still shows page allocation failure messages, although they appear less frequently than in the last run.

== Comment: #9 - Jonathan Dalton <jodal...@us.ibm.com> - 2016-01-13 13:02:16 ==

I restarted the stress tests on Monday and verified today (Wednesday) that the value was increased:

root@conelp2:~# cat /proc/sys/vm/min_free_kbytes
365536

With the increased "min_free_kbytes" there is nothing in the current dmesg that says:

interrupt 501
page allocation fault

So increasing "min_free_kbytes" during stress eliminated the fault. However, is this still a bug? Should "min_free_kbytes" have to be increased?

Attached is the dmesg associated with this comment.

== Comment: #12 - Luciano Chavez <cha...@us.ibm.com> - 2016-01-22 20:22:08 ==

(In reply to comment #11)
> Hi Luciano,
>
> I see some info for --set-recommended-min_free_kbytes documented in the
> following link
>
> http://manpages.ubuntu.com/manpages/trusty/man8/hugeadm.8.html
>
> Can you please check and let me know.

Hi Mamatha,

Thanks. That documentation is specific to a utility for huge pages, though, so we may have to mirror it and see if the Canonical folks can point to Ubuntu documentation they have on when to change min_free_kbytes.

Hi Canonical,

Please point to Ubuntu documentation that explains when to change min_free_kbytes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1537666/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp