Public bug reported:

Here's the log:
Jun 12 15:42:42 node73 kernel: [17196.908781] ------------[ cut here ]------------
Jun 12 15:42:42 node73 kernel: [17196.909789] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
Jun 12 15:42:42 node73 kernel: [17196.911210] invalid opcode: 0000 [#1] SMP
Jun 12 15:42:42 node73 kernel: [17196.912130] Modules linked in: nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache gpio_ich intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac joydev edac_core ioatdma mei_me mei lpc_ich wmi ipmi_si mac_hid lp parport raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid1 igb hid_generic mpt2sas i2c_algo_bit raid0 raid_class usbhid dca multipath ptp scsi_transport_sas ahci hid libahci linear pps_core
Jun 12 15:42:42 node73 kernel: [17196.924647] CPU: 5 PID: 25935 Comm: java Not tainted 3.13.0-29-generic #53-Ubuntu
Jun 12 15:42:42 node73 kernel: [17196.926280] Hardware name: Supermicro X9DRFF-iG+/-7G+/-iTG+/-7TG+/X9DRFF-iG+/-7G+/-iTG+/-7TG+, BIOS 2.0a 04/30/2013
Jun 12 15:42:42 node73 kernel: [17196.928566] task: ffff880c4a795fc0 ti: ffff880ce7d96000 task.ti: ffff880ce7d96000
Jun 12 15:42:42 node73 kernel: [17196.930200] RIP: 0010:[<ffffffff81179521>] [<ffffffff81179521>] handle_mm_fault+0xe61/0xf10
Jun 12 15:42:42 node73 kernel: [17196.932066] RSP: 0018:ffff880ce7d97d98 EFLAGS: 00010246
Jun 12 15:42:42 node73 kernel: [17196.933217] RAX: 0000000000000100 RBX: 000000078ddfdc38 RCX: ffff880ce7d97b00
Jun 12 15:42:42 node73 kernel: [17196.934773] RDX: ffff880c4a795fc0 RSI: 0000000000000000 RDI: 80000001a82009e6
Jun 12 15:42:42 node73 kernel: [17196.936328] RBP: ffff880ce7d97e20 R08: 0000000000000000 R09: 00000000000000a9
Jun 12 15:42:42 node73 kernel: [17196.937884] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880dee484370
Jun 12 15:42:42 node73 kernel: [17196.939440] R13: ffff881e0c4d3d40 R14: ffff88102511c280 R15: 0000000000000080
Jun 12 15:42:42 node73 kernel: [17196.940996] FS: 00007f2529340700(0000) GS:ffff88103fca0000(0000) knlGS:0000000000000000
Jun 12 15:42:42 node73 kernel: [17196.979078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 12 15:42:42 node73 kernel: [17197.017222] CR2: 0000000718184000 CR3: 0000001021ae8000 CR4: 00000000000407e0
Jun 12 15:42:42 node73 kernel: [17197.056416] Stack:
Jun 12 15:42:42 node73 kernel: [17197.094614] 0000000000000001 ffff880ce7d97db0 ffffffff8109a790 ffff880ce7d97dd0
Jun 12 15:42:42 node73 kernel: [17197.171848] ffffffff810d7b56 0000000000000001 ffffffff81f1fed0 ffff880ce7d97e78
Jun 12 15:42:42 node73 kernel: [17197.249793] ffffffff810d996d ffff880ce7d97e48 00000000000000a9 00000001ffffffff
Jun 12 15:42:42 node73 kernel: [17197.327660] Call Trace:
Jun 12 15:42:42 node73 kernel: [17197.365233] [<ffffffff8109a790>] ? wake_up_state+0x10/0x20
Jun 12 15:42:42 node73 kernel: [17197.403036] [<ffffffff810d7b56>] ? wake_futex+0x66/0x90
Jun 12 15:42:42 node73 kernel: [17197.439822] [<ffffffff810d996d>] ? futex_wake_op+0x4ed/0x620
Jun 12 15:42:42 node73 kernel: [17197.475937] [<ffffffff81726164>] __do_page_fault+0x184/0x560
Jun 12 15:42:42 node73 kernel: [17197.511226] [<ffffffff8111140c>] ? acct_account_cputime+0x1c/0x20
Jun 12 15:42:42 node73 kernel: [17197.546109] [<ffffffff8109d77b>] ? account_user_time+0x8b/0xa0
Jun 12 15:42:42 node73 kernel: [17197.580167] [<ffffffff8109dd94>] ? vtime_account_user+0x54/0x60
Jun 12 15:42:42 node73 kernel: [17197.613381] [<ffffffff8172655a>] do_page_fault+0x1a/0x70
Jun 12 15:42:42 node73 kernel: [17197.645771] [<ffffffff817229c8>] page_fault+0x28/0x30
Jun 12 15:42:42 node73 kernel: [17197.677251] Code: ff 48 89 d9 4c 89 e2 4c 89 ee 4c 89 f7 44 89 4d c8 e8 34 c1 ff ff 85 c0 0f 85 94 f5 ff ff 49 8b 3c 24 44 8b 4d c8 e9 68 f3 ff ff <0f> 0b be 8e 00 00 00 48 c7 c7 c0 3c a6 81 44 89 4d c8 e8 48 e2
Jun 12 15:42:42 node73 kernel: [17197.772738] RIP [<ffffffff81179521>] handle_mm_fault+0xe61/0xf10
Jun 12 15:42:42 node73 kernel: [17197.804166] RSP <ffff880ce7d97d98>
Jun 12 15:42:42 node73 kernel: [17197.881409] ---[ end trace b093101191f33d70 ]---
Jun 12 17:15:21 node73 kernel: [22748.792239] ------------[ cut here ]------------

Please see my mail here:
https://lkml.org/lkml/2014/6/19/462

And the response here (cc included @canonical.com):
https://lkml.org/lkml/2014/6/19/368

That response linked to this thread, which has a patch that is said to fix this:
https://lkml.org/lkml/2014/5/8/275

I applied that patch and built a kernel. It's in testing now on 2 of the 3 machines that have this problem.

We have Ubuntu 14.04 on 73 single socket machines, one of which has this problem, and on 3 dual socket machines, 2 of which have this problem.

Problem machines:
- single socket Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz, Supermicro X9DR3-F
- dual socket Intel(R) Xeon(R) CPU E5520 @ 2.27GHz, Dell PowerEdge R710
- dual socket Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz, Supermicro X9DRFF-iG+/-7G+/-iTG+/-7TG+

(The other dual socket machine, the one without the problem, is another PowerEdge R710, strangely enough. Maybe it's just not as heavily loaded as the other; running prime95 for a few hours doesn't cause it either.)

** Affects: linux-lts-trusty (Ubuntu)
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1335091

Title:
  kernel BUG - handle_mm_fault - Ubuntu 14.04 kernel 3.13.0-29-generic

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1335091/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
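
For comparing the oops across the affected machines, something like the following could pull the BUG location and the faulting symbol out of each syslog. This is only a sketch: `summarize_oops` and both regular expressions are illustrative names and patterns, not part of any kernel or Ubuntu tool. They assume the two line shapes seen in the log above ("kernel BUG at ..." and the final "RIP [<...>] symbol+0xoff/0xsize" line). The sample text is copied from this report with spacing normalised.

```python
import re

# Two lines copied from the log in this report (spacing normalised).
SAMPLE = """\
Jun 12 15:42:42 node73 kernel: [17196.909789] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
Jun 12 15:42:42 node73 kernel: [17197.772738] RIP [<ffffffff81179521>] handle_mm_fault+0xe61/0xf10
"""

# Illustrative patterns (assumptions); \s* tolerates missing spaces in mangled logs.
BUG_RE = re.compile(r"kernel BUG at\s*(\S+):(\d+)!")
RIP_RE = re.compile(
    r"RIP\s+\[<([0-9a-f]+)>\]\s*(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def summarize_oops(text):
    """Return the BUG source location and faulting symbol found in text."""
    summary = {}
    bug = BUG_RE.search(text)
    if bug:
        summary["file"] = bug.group(1)
        summary["line"] = int(bug.group(2))
    rip = RIP_RE.search(text)
    if rip:
        summary["address"] = int(rip.group(1), 16)  # faulting instruction address
        summary["symbol"] = rip.group(2)            # function containing it
        summary["offset"] = int(rip.group(3), 16)   # offset into the function
        summary["size"] = int(rip.group(4), 16)     # total size of the function
    return summary


if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Running this over the syslog from each problem machine would show whether they all hit the same file, line and offset in handle_mm_fault.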