------- Comment From kalsh...@in.ibm.com 2014-12-03 12:09 EDT-------
root@u1410:~# ./madv_poison -C -i 1
vm.memory_failure_early_kill = 0
[pid 1043] start page-poisoning test
[pid 1043] there are 1 shm_child
[pid 1043] have spawned 1 processes
[pid 1043] wait for Pid 1046
[pid 1046] shm dirty poisoning page 0x3fff8d1c0000
[ 348.014055] MCE 0x6e4: dirty LRU page recovery: Recovered
[pid 1046] writing 2
[ 348.014210] MCE: Killing madv_poison:1046 due to hardware memory corruption fault at 3fff8d1c0000
[pid 1046] signal 7 code 4 addr 0x3fff8d1c0000
[pid 1046] pass: recovered
[pid 1043] Ins 0: Pid 1046: pass - shared memory test
[pid 1043] !!! Page Poisoning Test got PASS. !!!
[pid 1043] page-poisoning test done!
root@u1410:~#

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1370425

Title: kernel bug seen while try to use madvise system call with MADV_HWPOISON mode

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Utopic: Fix Committed
Status in linux source package in Vivid: Fix Released

Bug description:

Problem Description
====================
A kernel BUG is hit when the madvise system call is used with MADV_HWPOISON.

---uname output---
Linux u10thp 3.16.0-9-generic #14-Ubuntu SMP Fri Aug 15 15:03:36 UTC 2014 ppc64le ppc64le ppc64le GNU/Linux

Machine Type = Power 8

Steps to Reproduce
====================
1. Install an Ubuntu 14.10 guest on PowerKVM.
2. Set up hugepage backing for the guest VM.
3. Build and run the attached madv_poison.c test code, which exercises the madvise system call with MADV_HWPOISON:

gcc -o madv_poison madv_poison.c
./madv_poison -C -i 1 (1 - shm_test)

Ubuntu 14.10 LE throws a kernel BUG:

root@u10thp:~# ./madv_poison -C -i 1
vm.memory_failure_early_kill = 0
[pid 2301] start page-poisoning test
[pid 2301] there are 1 shm_child
[pid 2301] have spawned 1 processes
[pid 2301] wait for Pid 2304
[pid 2304] shm dirty poisoning page 0x3fffa7ce0000
[ 7905.009001] Injecting memory failure for page 0xe6a7 at 0x3fffa7ce0000
[ 7905.009359] MCE 0xe6a7: dirty LRU page recovery: Recovered
[pid 2304] writing 2
[ 7905.009901] ------------[ cut here ]------------
[ 7905.010164] kernel BUG at /build/buildd/linux-3.16.0/arch/powerpc/mm/fault.c:180!
[ 7905.010396] Oops: Exception in kernel mode, sig: 5 [#234]
[ 7905.010438] SMP NR_CPUS=2048 NUMA pSeries
[ 7905.010480] Modules linked in: pseries_rng rtc_generic ohci_pci
[ 7905.010614] CPU: 0 PID: 2304 Comm: madv_poison Tainted: G D 3.16.0-9-generic #14-Ubuntu
[ 7905.010686] task: c0000000e0a92a60 ti: c0000000e09e8000 task.ti: c0000000e09e8000
[ 7905.010746] NIP: c0000000009e3314 LR: c0000000009e2e54 CTR: 0000000000000000
[ 7905.010864] REGS: c0000000e09eb990 TRAP: 0700 Tainted: G D (3.16.0-9-generic)
[ 7905.010924] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28002882 XER: 00000000
[ 7905.011125] CFAR: c0000000009e3170 SOFTE: 1
GPR00: c0000000009e2e54 c0000000e09ebc10 c0000000013742e0 0000000000000010
GPR04: c0000000e0b37ff8 00003fffa7ce0000 00000000000000a9 0000000000000000
GPR08: 0000000000000000 0000000000000010 c0000000e0a92a60 0000000000000020
GPR12: 0000000048002884 c00000000fe40000 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 00000000000000a9 0000000000000000 c0000000e0597a40 c0000000e022b060
GPR24: 0000000000000010 c0000000e022b000 c000000000009568 00003fffa7ce0000
GPR28: 0000000000000000 0000000000000000 0000000002000000 c0000000e09ebea0
[ 7905.012189] NIP [c0000000009e3314] do_page_fault+0x984/0x990
[ 7905.012241] LR [c0000000009e2e54] do_page_fault+0x4c4/0x990
[ 7905.012281] Call Trace:
[ 7905.012361] [c0000000e09ebc10] [c0000000009e2e54] do_page_fault+0x4c4/0x990 (unreliable)
[ 7905.012434] [c0000000e09ebe30] [c000000000009568] handle_page_fault+0x10/0x30
[ 7905.012494] Instruction dump:
[ 7905.012580] e92d0290 e8690460 38630060 4b7274d9 60000000 e93f0108 3bc00000 792a97e3
[ 7905.012683] 4082f77c 3bc00009 60000000 4bfff774 <0fe00000> 00000000 00000000 3c4c0099
[ 7905.012845] ---[ end trace a48a199a061eed79 ]---
[ 7905.019084]
[pid 2301] Ins 0: Pid 2304: failed - shared memory test
[pid 2301] !!! Page Poisoning Test is FAILED (1 failures found). !!!
[pid 2301] page-poisoning test done!
root@u10thp:~#

== Comment: #1 - Kalpana Shetty <kalsh...@in.ibm.com> - ==
The test code works fine on an x86 Ubuntu VM, so if MADV_HWPOISON is not supported on Power, the kernel should have returned a "not supported" error, as it does on the PowerKVM / RHEL 7 VM.

Intel/Ubuntu 14.04 VM: =================================> Working fine.

root@u04vm14:~# ./madv_poison -C -i 1 (shm_test case)
vm.memory_failure_early_kill = 0
[pid 7325] start page-poisoning test
[pid 7325] there are 1 shm_child
[pid 7325] have spawned 1 processes
[pid 7325] wait for Pid 7328
[pid 7328] shm dirty poisoning page 0x7f60ca8ea000
[pid 7328] writing 2
[pid 7328] signal 7 code 4 addr 0x7f60ca8ea000
[pid 7328] pass: recovered
[pid 7325] Ins 0: Pid 7328: pass - shared memory test
[pid 7325] !!! Page Poisoning Test got PASS. !!!
[pid 7325] page-poisoning test done!

PowerKVM / RHEL 7 VM:

[root@rhel7-web-VM1 ~]# ./madv_poison -C -i 1
sysctl: cannot stat /proc/sys/vm/memory_failure_early_kill: No such file or directory
[pid 11512] start page-poisoning test
[pid 11512] there are 1 shm_child
[pid 11512] have spawned 1 processes
[pid 11514] shm dirty poisoning page 0x3fff84d60000
[pid 11512] wait for Pid 11514
[pid 11514] failed: Kernel doesn't support poison injection ============================> unsupported error.
[pid 11512] Ins 0: Pid 11514: failed - shared memory test
[pid 11512] !!! Page Poisoning Test is FAILED (1 failures found). !!!
[pid 11512] page-poisoning test done!

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1370425/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
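For readers without the attachment, the core of the shared-memory case in the steps to reproduce can be sketched roughly as follows. This is not the attached madv_poison.c: the function names `poison_outcome` and `poison_one_page`, the outcome labels, and the `POISON_SKETCH_MAIN` guard are all made up for illustration. It assumes a Linux system; a successful injection needs CAP_SYS_ADMIN and a kernel built with CONFIG_MEMORY_FAILURE, and it really does take a page of memory out of service, so it should only be tried in a disposable VM.

```c
/* Hypothetical sketch of an MADV_HWPOISON injection; not the attached test. */
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100 /* value used by the Linux UAPI headers */
#endif

/* Map an madvise() result to a short label (labels are illustrative only). */
const char *poison_outcome(int rc, int err)
{
    if (rc == 0)
        return "injected";
    if (err == EINVAL)
        return "unsupported";   /* e.g. kernel without CONFIG_MEMORY_FAILURE */
    if (err == EPERM)
        return "not permitted"; /* caller lacks CAP_SYS_ADMIN */
    return "failed";
}

/* Dirty one shared anonymous page, then ask the kernel to poison it.
 * Returns the madvise() result and stores the errno value in *err. */
int poison_one_page(int *err)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    char *page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        *err = errno;
        return -1;
    }
    page[0] = 1; /* touch the page so it is populated and dirty */

    int rc = madvise(page, (size_t)page_size, MADV_HWPOISON);
    *err = (rc == 0) ? 0 : errno;
    /* The page is deliberately not touched again here: a later access is
     * expected to raise SIGBUS, as in the "signal 7 code 4" log lines. */
    return rc;
}

#ifdef POISON_SKETCH_MAIN
int main(void)
{
    int err = 0;
    int rc = poison_one_page(&err);
    printf("MADV_HWPOISON: %s (%s)\n", poison_outcome(rc, err),
           rc == 0 ? "ok" : strerror(err));
    return rc == 0 ? 0 : 1;
}
#endif
```

Building with `gcc -DPOISON_SKETCH_MAIN -o poison_sketch poison_sketch.c` (file name assumed) gives a standalone program. An "unsupported" result would correspond to the RHEL 7 output quoted above, while the failure reported in this bug occurs later, when the poisoned page is written to.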