Launchpad has imported 32 comments from the remote bug at https://bugzilla.redhat.com/show_bug.cgi?id=508120.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2009-06-25T17:25:26+00:00 Kalev wrote:

Created attachment 349429
xm dmesg

I am running an i686 rawhide domU PV machine under an x86_64 Xen host. After updating from F-11's kernel-PAE-2.6.29.4-167.fc11.i686, the new kernels no longer boot. They seem to crash very early, so that I don't even get any printk() output in the console.

Right now I can reproduce the problem with kernel-PAE-2.6.31-0.28.rc1.fc12.i686; however, the same issue started with 2.6.30-something, and I also verified that I get the exact same behaviour with x86_64 domU kernels.

"xm dmesg" reports the following:
(XEN) Unhandled page fault in domain 12 on VCPU 0 (ec=0000)
(XEN) Pagetable walk from 0000000000000014:
(XEN) L4[0x000] = 0000000081d9d027 0000000000001b48
(XEN) L3[0x000] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 12 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-3.1.2-155.el5 x86_64 debug=n Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e019:[<00000000c0a8b501>]
<snip> (full trace attached)

Examining matching vmlinux in gdb I get:
(gdb) x/i 0x00000000c0a8b501
0xc0a8b501 <xen_start_kernel+9>: mov %gs:0x14,%eax
(gdb) l *0x00000000c0a8b501
0xc0a8b501 is in xen_start_kernel (arch/x86/xen/enlighten.c:990).
985         .emergency_restart = xen_emergency_restart,
986     };
987
988     /* First C function to be called on Xen boot */
989     asmlinkage void __init xen_start_kernel(void)
990     {
991         pgd_t *pgd;
992
993         if (!xen_start_info)
994             return;

The host is running Centos 5.3 with kernel-xen-2.6.18-155.el5 and xen-3.0.3-80.el5_3.3.

Xen config:
name = "fedora-rawhide"
uuid = "1f162091-fa31-c67c-9da1-702bcd5cb40b"
maxmem = 512
memory = 512
vcpus = 1
bootloader = "/usr/bin/pygrub"
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [ ]
disk = [ "phy:/dev/vg0/xen_rawhide,xvda,w" ]
vif = [ "mac=00:16:3e:11:95:e6,bridge=xenbr0" ]

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/0

------------------------------------------------------------------------
On 2009-08-16T19:54:04+00:00 Michal wrote:

I'm seeing the same problem with an x86_64 Rawhide DomU and a RHEL5 Dom0.
I believe it's the same problem Orion Poplawski reported recently on fedora-xen mailing list:
https://www.redhat.com/archives/fedora-xen/2009-August/msg00008.html

I got the stack trace as Mark McLoughlin suggested:

michich@hammerfall ~$ sudo /usr/lib64/xen/bin/xenctx -s /tmp/System.map-2.6.31-rc5-git2 12
rip: ffffffff817290a1 xen_start_kernel+0x10
rsp: ffffffff8171df90
rax: 00000000 rbx: 00000000 rcx: 00000000 rdx: 00000000
rsi: ffffffff82fc3000 rdi: ffffffff82fc3000 rbp: ffffffff8171dff8
r8: 00000000 r9: 00000000 r10: 00000000 r11: 00000000
r12: 00000000 r13: 00000000 r14: 00000000 r15: 00000000
cs: 0000e033 ds: 00000000 fs: 00000000 gs: 00000000

Stack:
0000000000000000 0000000000000000 0000000000000000 ffffffff817290a1
000000010000e030 0000000000010096 ffffffff8171dfd8 000000000000e02b
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000

Code:
bd 93 ff c9 c3 55 48 89 e5 53 48 83 ec 18 48 8b 3d 27 c0 33 00 <65> 48 8b 04 25 28 00 00 00 48 89

Call Trace:
[<ffffffff817290a1>] xen_start_kernel+0x10 <--
[<ffffffff817290a1>] xen_start_kernel+0x10

Then I disassembled xen_start_kernel():

0000000000000000 <xen_start_kernel>:
   0: 55                      push %rbp
   1: 48 89 e5                mov %rsp,%rbp
   4: 53                      push %rbx
   5: 48 83 ec 18             sub $0x18,%rsp
   9: 48 8b 3d 00 00 00 00    mov 0x0(%rip),%rdi # 10 <xen_start_kernel+0x10>
  10: 65 48 8b 04 25 28 00    mov %gs:0x28,%rax ***CRASHES HERE***
  17: 00 00
  19: 48 89 45 e8             mov %rax,-0x18(%rbp)
  1d: 31 c0                   xor %eax,%eax
...

At first these last three instructions confused me, because they did not seem to correspond to anything in the C source, but then I realized they set up the canary for stack smashing detection. So I recompiled the kernel without CONFIG_CC_STACKPROTECTOR and I got much farther with the boot (it hung after loading some drivers, I'll investigate more).

I guess xen_start_kernel() (and possibly more of Xen DomU startup code) should be compiled with -fno-stack-protector.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/1

------------------------------------------------------------------------
On 2009-08-17T09:15:10+00:00 Chris wrote:

Wow, excellent analysis, thanks. I was just starting to see this myself, but hadn't yet had time to look into it. We'll have to take this up with upstream and see what they have to say about it.

Chris Lalancette

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/2

------------------------------------------------------------------------
On 2009-08-17T16:22:05+00:00 Kevin wrote:

Yeah, seeing this also on a machine in fedora infrastructure. ;(

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/3

------------------------------------------------------------------------
On 2009-08-17T17:49:02+00:00 Jeremy wrote:

Thanks for the report and analysis. I guess there's a keyword to prevent gcc from adding stack-smashing to particular functions or files... Erm...

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/4

------------------------------------------------------------------------
On 2009-08-17T19:31:05+00:00 Jeremy wrote:

Created attachment 357697
Make sure load_percpu_segment doesn't have stack-protector enabled

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/5

------------------------------------------------------------------------
On 2009-08-17T19:31:34+00:00 Jeremy wrote:

Created attachment 357698
Setup percpu segments before calling stack-protected functions

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/6

------------------------------------------------------------------------
On 2009-08-17T19:31:59+00:00 Jeremy wrote:

Do those two help?

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/7

------------------------------------------------------------------------
On 2009-08-18T09:32:43+00:00 Michal wrote:

Jeremy,
yes, these patches help. The kernel starts booting with them applied.

Just a suggestion: the usual way (as seen in other Makefiles) to disable the stack protection for selected source files seems to be:

nostackp := $(call cc-option, -fno-stack-protector)
CFLAGS_somefile.o := $(nostackp)

And the kernel still hangs for me later during boot, but that's a different bug.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/8

------------------------------------------------------------------------
On 2009-08-18T17:53:20+00:00 Jeremy wrote:

Ah, I couldn't find another instance of stack-protector being disabled.

Have you reported the other bug, or is it something purely local?

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/9

------------------------------------------------------------------------
On 2009-08-21T07:31:29+00:00 Mark wrote:

Ingo has these queued up in linux-2.6-tip.git/x86/urgent:

http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=ce2eef33d3
http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=5416c26635

Just need to re-test and close this when they make their way to rawhide

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/10

------------------------------------------------------------------------
On 2009-08-21T07:51:04+00:00 Jeremy wrote:

Are you sure they work? M A Young still reports crashes when they're applied.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/11

------------------------------------------------------------------------
On 2009-08-21T08:36:11+00:00 Mark wrote:

Michal says it still hangs later on, but that it's a different issue

M A Young is probably testing Dom0, maybe yet another issue?

Dunno, that's why I said we need to re-test :-)

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/12

------------------------------------------------------------------------
On 2009-08-21T10:40:19+00:00 Michal wrote:

I've described the other bug in http://lkml.org/lkml/2009/8/21/71

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/13

------------------------------------------------------------------------
On 2009-08-25T01:08:46+00:00 Chuck wrote:

Should be fixed in 2.6.31-0.173.rc7.git2, which has the two x86-tip patches plus the framebuffer fix from LKML.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/14

------------------------------------------------------------------------
On 2009-08-25T14:30:24+00:00 Michal wrote:

2.6.31-0.173.rc7.git2 boots successfully under Xen on x86_64, but i686 still fails. Probably because load_percpu_segment(0); is under #ifdef CONFIG_X86_64 in xen_start_kernel().

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/15

------------------------------------------------------------------------
On 2009-08-25T16:02:48+00:00 Kevin wrote:

Seems to work here under a x86_64 guest. Thanks.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/16

------------------------------------------------------------------------
On 2009-08-25T18:15:53+00:00 Jeremy wrote:

(In reply to comment #15)
> 2.6.31-0.173.rc7.git2 boots successfully under Xen on x86_64, but i686 still
> fails. Probably because load_percpu_segment(0); is under #ifdef CONFIG_X86_64
> in xen_start_kernel().

32 bit is trickier because it needs a specifically set-up GDT entry and its own segment register. Doing this setup properly ends up calling functions with stack-protector prologs which assume the segment register is already set up. I need to work out 1) how native does this setup, and/or 2) refactor the segment register setup so that it can avoid functions with stack-protector code.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/17

------------------------------------------------------------------------
On 2009-08-26T12:02:59+00:00 Alexander wrote:

*** Bug 519342 has been marked as a duplicate of this bug. ***

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/18

------------------------------------------------------------------------
On 2009-08-26T12:07:21+00:00 Alexander wrote:

FYI: 2.6.31-0.174.rc7.git2 still fails on i386.
My dom0 is recent RHEL5 and domU is F12-Alpha

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/19

------------------------------------------------------------------------
On 2009-08-27T17:56:55+00:00 Pasi wrote:

I also tried the latest rawhide tree (2.6.31-0.174.rc7.git2.fc12.i686) with virt-install on my F11 + Xen 3.4.1 + 2.6.31-rc6 pv_ops dom0 setup, and it still crashes.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/22

------------------------------------------------------------------------
On 2009-08-28T02:55:21+00:00 Jeremy wrote:

What compiler are people using? Using F11's gcc-4.4.1-2.fc11.x86_64, it says:

/home/jeremy/git/linux/arch/x86/Makefile:80: stack protector enabled but no compiler support

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/23

------------------------------------------------------------------------
On 2009-08-28T08:28:44+00:00 Michal wrote:

Jeremy,
Rawhide builds currently use gcc-4.4.1-6.x86_64.
You can find this information in Koji build logs, e.g.:
http://kojipkgs.fedoraproject.org/packages/kernel/2.6.31/0.185.rc7.git6.fc12/data/logs/x86_64/
root.log tells you the versions of the packages used in the build.
build.log has the build warnings. There was no such stack protector warning in this case.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/24

------------------------------------------------------------------------
On 2009-08-28T18:18:30+00:00 Jeremy wrote:

Where can I get this version of gcc? "yum update --enablerepo=rawhide gcc" doesn't get me anything more recent than gcc-4.4.1-2.fc11.x86_64.

Or does 32-bit stackprotector not work in the x86-64 version of the compiler?

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/25

------------------------------------------------------------------------
On 2009-08-29T16:29:23+00:00 Michal wrote:

> Where can I get this version of gcc? "yum update --enablerepo=rawhide gcc"
> doesn't get me anything more recent than gcc-4.4.1-2.fc11.x86_64.

Works for me, yum can see the newer version. But the gcc from Rawhide depends on newer glibc, so I do not recommend doing it.

> Or does 32-bit stackprotector not work in the x86-64 version of the compiler?

Bug in the stack protector detection for ARCH=i386 builds on x86_64. I've sent a patch to LKML and CCed you.
Koji always builds packages using native arch toolchain, so it is not affected.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/26

------------------------------------------------------------------------
On 2009-09-02T17:25:59+00:00 Jeremy wrote:

Created attachment 359557
Set up kernel GDT early to make -fstack-protector work under Xen

This patch should comprehensively fix -fstack-protector under Xen for both 32 and 64-bit. Please test.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/27

------------------------------------------------------------------------
On 2009-09-02T18:59:33+00:00 Pasi wrote:

Someone please add that bugfix patch to next rawhide kernel build so we get people to test it..

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/28

------------------------------------------------------------------------
On 2009-09-03T18:05:07+00:00 Justin wrote:

The patch has been applied and should be available in the next rawhide kernel build.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/29

------------------------------------------------------------------------
On 2009-09-05T18:52:23+00:00 Michal wrote:

2.6.31-0.203.rc8.git2.fc12 boots successfully as Xen domU. I've tested both i686.PAE and x86_64.

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/30

------------------------------------------------------------------------
On 2009-09-06T11:50:36+00:00 Pasi wrote:

Seems to boot now.
virt-install started f12/rawhide Xen domU installation OK, on F11 host with Xen 3.4.1-3 + pv_ops dom0 kernel + libvirt from F11 updates testing.

Installation went fine, and the installed domU seems to have 2.6.31-0.203.rc8.git2.fc12.i686.PAE kernel running.

There's a traceback on domU dmesg though.. the domU still runs fine.

Write protecting the kernel text: 4352k
Write protecting the kernel read-only data: 1800k

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
---------------------------------------------
init/1 is trying to acquire lock:
 (&input_pool.lock){+.+...}, at: [<c043b30e>] __wake_up+0x2b/0x61

but task is already holding lock:
 (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

other info that might help us debug this:
2 locks held by init/1:
 #0: (&p->cred_guard_mutex){+.+.+.}, at: [<c0508756>] do_execve+0xa4/0x2ee
 #1: (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

stack backtrace:
Pid: 1, comm: init Not tainted 2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
Call Trace:
 [<c08387c0>] ? printk+0x22/0x3a
 [<c0478b59>] __lock_acquire+0x7e9/0xb25
 [<c0478f4c>] lock_acquire+0xb7/0xeb
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c083b4f7>] _spin_lock_irqsave+0x45/0x89
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] __wake_up+0x2b/0x61
 [<c068e2a0>] account+0xb5/0xf0
 [<c068e3ef>] extract_entropy+0x3e/0xac
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c068e56e>] get_random_bytes+0x29/0x3e
 [<c053bbd1>] load_elf_binary+0xab9/0x106c
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c053b118>] ? load_elf_binary+0x0/0x106c
 [<c0539c76>] load_script+0x1a6/0x1c8
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406199>] ? xen_force_evtchn_callback+0x1d/0x34
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406b14>] ? check_events+0x8/0xc
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c0539ad0>] ? load_script+0x0/0x1c8
 [<c050888b>] do_execve+0x1d9/0x2ee
 [<c0408359>] sys_execve+0x39/0x6e
 [<c0409ad0>] syscall_call+0x7/0xb
 [<c04f00d8>] ? sys_swapon+0x348/0xa98
 [<c040d76b>] ? kernel_execve+0x27/0x3e
 [<c04031e0>] ? run_init_process+0x2b/0x3e
 [<c0403275>] ? init_post+0x82/0xe9
 [<c0a9b566>] ? kernel_init+0x1f6/0x211
 [<c0a9b370>] ? kernel_init+0x0/0x211
 [<c040a6bf>] ? kernel_thread_helper+0x7/0x10

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/31

------------------------------------------------------------------------
On 2009-09-07T08:24:39+00:00 Chris wrote:

(In reply to comment #29)
> Seems to boot now. virt-install started f12/rawhide Xen domU installation OK,
> on F11 host with Xen 3.4.1-3 + pv_ops dom0 kernel + libvirt from F11 updates
> testing.
>
> Installation went fine, and the installed domU seems to have
> 2.6.31-0.203.rc8.git2.fc12.i686.PAE kernel running.
>
> There's a traceback on domU dmesg though.. the domU still runs fine.

It's probably worth looking through BZ quickly to see if a bug with that trace exists already, and if not, to open a new bug about it.

Thanks for the testing,
Chris Lalancette

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/32

------------------------------------------------------------------------
On 2009-09-08T11:41:33+00:00 Pasi wrote:

OK, new bug opened:
https://bugzilla.redhat.com/show_bug.cgi?id=521800

Reply at: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/33


** Changed in: linux (Fedora)
   Importance: Unknown => High

** Bug watch added: Red Hat Bugzilla #521800
   https://bugzilla.redhat.com/show_bug.cgi?id=521800

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/419315

Title:
  2.6.31-rc1 xen domU crashes early during boot

Status in linux package in Ubuntu:
  Incomplete
Status in linux package in Fedora:
  Fix Released

Bug description:
  Booting a karmic kernel under xen crashes very early in the boot process. This is a duplicate of Red Hat bug 508120 (https://bugzilla.redhat.com/show_bug.cgi?id=508120).

  At the very least, our kernels at the moment do not have the two fixes mentioned there,

  http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=ce2eef33d3
  http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=5416c26635

  which will cause a crash.