Launchpad has imported 32 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=508120.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2009-06-25T17:25:26+00:00 Kalev wrote:

Created attachment 349429
xm dmesg

I am running an i686 rawhide domU PV machine under an x86_64 Xen host. After 
updating from F-11's kernel-PAE-2.6.29.4-167.fc11.i686 the new kernels no 
longer boot. They seem to crash very early, so that I don't even get any printk() 
output in the console.
Right now I can reproduce the problem with 
kernel-PAE-2.6.31-0.28.rc1.fc12.i686, however the same issue started with 
2.6.30-something, and I also verified that I get the exact same behaviour with 
x86_64 domU kernels.

"xm dmesg" reports the following:
(XEN) Unhandled page fault in domain 12 on VCPU 0 (ec=0000)
(XEN) Pagetable walk from 0000000000000014:
(XEN)  L4[0x000] = 0000000081d9d027 0000000000001b48
(XEN)  L3[0x000] = 0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 12 (vcpu#0) crashed on cpu#3:
(XEN) ----[ Xen-3.1.2-155.el5  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    3
(XEN) RIP:    e019:[<00000000c0a8b501>]
<snip> (full trace attached)

Examining matching vmlinux in gdb I get:
(gdb) x/i 0x00000000c0a8b501
0xc0a8b501 <xen_start_kernel+9>:        mov    %gs:0x14,%eax
(gdb) l *0x00000000c0a8b501
0xc0a8b501 is in xen_start_kernel (arch/x86/xen/enlighten.c:990).
985             .emergency_restart = xen_emergency_restart,
986     };
987
988     /* First C function to be called on Xen boot */
989     asmlinkage void __init xen_start_kernel(void)
990     {
991             pgd_t *pgd;
992
993             if (!xen_start_info)
994                     return;

The host is running Centos 5.3 with kernel-xen-2.6.18-155.el5 and
xen-3.0.3-80.el5_3.3.

Xen config:
name = "fedora-rawhide"
uuid = "1f162091-fa31-c67c-9da1-702bcd5cb40b"
maxmem = 512
memory = 512
vcpus = 1
bootloader = "/usr/bin/pygrub"
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [  ]
disk = [ "phy:/dev/vg0/xen_rawhide,xvda,w" ]
vif = [ "mac=00:16:3e:11:95:e6,bridge=xenbr0" ]

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/0

------------------------------------------------------------------------
On 2009-08-16T19:54:04+00:00 Michal wrote:

I'm seeing the same problem with an x86_64 Rawhide DomU and a RHEL5 Dom0.
I believe it's the same problem Orion Poplawski reported recently on fedora-xen 
mailing list: 
https://www.redhat.com/archives/fedora-xen/2009-August/msg00008.html

I got the stack trace as Mark McLoughlin suggested:

michich@hammerfall ~$ sudo /usr/lib64/xen/bin/xenctx -s /tmp/System.map-2.6.31-rc5-git2  12
rip: ffffffff817290a1 xen_start_kernel+0x10
rsp: ffffffff8171df90
rax: 00000000   rbx: 00000000   rcx: 00000000   rdx: 00000000
rsi: ffffffff82fc3000   rdi: ffffffff82fc3000   rbp: ffffffff8171dff8
 r8: 00000000    r9: 00000000   r10: 00000000   r11: 00000000
r12: 00000000   r13: 00000000   r14: 00000000   r15: 00000000
 cs: 0000e033    ds: 00000000    fs: 00000000    gs: 00000000

Stack:
 0000000000000000 0000000000000000 0000000000000000 ffffffff817290a1
 000000010000e030 0000000000010096 ffffffff8171dfd8 000000000000e02b
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000

Code:
bd 93 ff c9 c3 55 48 89 e5 53 48 83 ec 18 48 8b 3d 27 c0 33 00 <65> 48 8b 04 25 28 00 00 00 48 89

Call Trace:
  [<ffffffff817290a1>] xen_start_kernel+0x10 <--
  [<ffffffff817290a1>] xen_start_kernel+0x10


Then I disassembled xen_start_kernel():
0000000000000000 <xen_start_kernel>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   53                      push   %rbx
   5:   48 83 ec 18             sub    $0x18,%rsp
   9:   48 8b 3d 00 00 00 00    mov    0x0(%rip),%rdi        # 10 <xen_start_kernel+0x10>
  10:   65 48 8b 04 25 28 00    mov    %gs:0x28,%rax  ***CRASHES HERE***
  17:   00 00 
  19:   48 89 45 e8             mov    %rax,-0x18(%rbp)
  1d:   31 c0                   xor    %eax,%eax
...

At first these last three instructions confused me, because they did not seem 
to correspond to anything in the C source, but then I realized they set up the 
canary for stack smashing detection.
So I recompiled the kernel without CONFIG_CC_STACKPROTECTOR and I got much 
farther with the boot (it hung after loading some drivers, I'll investigate 
more).

I guess xen_start_kernel() (and possibly more of Xen DomU startup code)
should be compiled with -fno-stack-protector.
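
A minimal sketch of the address arithmetic behind the crash, assuming the
%gs base is still zero when xen_start_kernel() runs and that page zero is
unmapped (which the empty L3 entry in the "xm dmesg" output suggests). The
helper names are hypothetical; only the offsets 0x14 and 0x28 and the
fault address 0x14 come from the reports above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Canary offsets relative to %gs, as seen in the disassembly:
 * 0x14 in the i686 build, 0x28 in the x86_64 build. */
#define CANARY_OFF_I686   0x14u
#define CANARY_OFF_X86_64 0x28u

/* Hypothetical helper: linear address touched by a %gs-relative load. */
uint64_t gs_linear_address(uint64_t gs_base, uint32_t offset)
{
    return gs_base + offset;
}

/* Hypothetical model: the load faults if it lands in page zero, which is
 * assumed to be unmapped this early in boot. */
bool canary_load_would_fault(uint64_t gs_base, uint32_t offset)
{
    const uint64_t page_size = 4096;
    return gs_linear_address(gs_base, offset) < page_size;
}

/* Reproduces the i686 report: base 0 plus offset 0x14 gives the address
 * Xen printed in "Pagetable walk from 0000000000000014". */
void check_reported_fault(void)
{
    assert(gs_linear_address(0, CANARY_OFF_I686) == 0x14);
    assert(canary_load_would_fault(0, CANARY_OFF_I686));
}
```

Under this model, any function carrying a stack-protector prologue faults
until the per-CPU segment base points at mapped memory, which is why either
setting up the segment earlier or building the early code without the
protector avoids the crash.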

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/1

------------------------------------------------------------------------
On 2009-08-17T09:15:10+00:00 Chris wrote:

Wow, excellent analysis, thanks.  I was just starting to see this
myself, but hadn't yet had time to look into it.  We'll have to take
this up with upstream and see what they have to say about it.

Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/2

------------------------------------------------------------------------
On 2009-08-17T16:22:05+00:00 Kevin wrote:

Yeah, seeing this also on a machine in fedora infrastructure. ;(

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/3

------------------------------------------------------------------------
On 2009-08-17T17:49:02+00:00 Jeremy wrote:

Thanks for the report and analysis.  I guess there's a keyword to
prevent gcc from adding stack-smashing protection to particular
functions or files...  Erm...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/4

------------------------------------------------------------------------
On 2009-08-17T19:31:05+00:00 Jeremy wrote:

Created attachment 357697
Make sure load_percpu_segment doesn't have stack-protector enabled

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/5

------------------------------------------------------------------------
On 2009-08-17T19:31:34+00:00 Jeremy wrote:

Created attachment 357698
Setup percpu segments before calling stack-protected functions

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/6

------------------------------------------------------------------------
On 2009-08-17T19:31:59+00:00 Jeremy wrote:

Do those two help?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/7

------------------------------------------------------------------------
On 2009-08-18T09:32:43+00:00 Michal wrote:

Jeremy,
yes, these patches help. The kernel starts booting with them applied.

Just a suggestion: the usual way (as seen in other Makefiles) to disable the 
stack protection for selected source files seems to be:
nostackp := $(call cc-option, -fno-stack-protector)
CFLAGS_somefile.o := $(nostackp)
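
For comparison with the per-file Makefile flag above, a per-function
opt-out can be sketched in C. This is an illustration only, not what the
attached patches do: the NO_STACKP macro and early_copy() are hypothetical
names, and which attribute is honored depends on the compiler version
(newer GCC and clang accept no_stack_protector; older GCC may accept the
optimize attribute instead).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick whichever per-function opt-out the compiler offers. */
#if defined(__has_attribute)
#  if __has_attribute(no_stack_protector)
#    define NO_STACKP __attribute__((no_stack_protector))
#  endif
#endif
#if !defined(NO_STACKP) && defined(__GNUC__) && !defined(__clang__)
#  define NO_STACKP __attribute__((optimize("no-stack-protector")))
#endif
#ifndef NO_STACKP
#  define NO_STACKP /* no per-function opt-out known; use per-file flags */
#endif

/* Hypothetical early-boot style helper. The local array is the kind of
 * object that normally makes the compiler emit a canary check. */
NO_STACKP
size_t early_copy(char *dst, size_t dst_len, const char *src)
{
    char scratch[32];
    size_t n;

    assert(dst != NULL && src != NULL);
    if (dst_len == 0)
        return 0;

    /* Clamp to the scratch buffer, then to the destination. */
    n = strlen(src);
    if (n >= sizeof(scratch))
        n = sizeof(scratch) - 1;
    memcpy(scratch, src, n);
    if (n >= dst_len)
        n = dst_len - 1;
    memcpy(dst, scratch, n);
    dst[n] = '\0';
    return n;
}
```

The per-file flag is the more conservative choice for early boot code,
since it also covers any helpers the compiler decides not to inline.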

And the kernel still hangs for me later during boot, but that's a
different bug.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/8

------------------------------------------------------------------------
On 2009-08-18T17:53:20+00:00 Jeremy wrote:

Ah, I couldn't find another instance of stack-protector being disabled.

Have you reported the other bug, or is it something purely local?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/9

------------------------------------------------------------------------
On 2009-08-21T07:31:29+00:00 Mark wrote:

Ingo has these queued up in linux-2.6-tip.git/x86/urgent:

http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=ce2eef33d3
http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=5416c26635

Just need to re-test and close this when they make their way to rawhide

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/10

------------------------------------------------------------------------
On 2009-08-21T07:51:04+00:00 Jeremy wrote:

Are you sure they work?  M A Young still reports crashes when they're
applied.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/11

------------------------------------------------------------------------
On 2009-08-21T08:36:11+00:00 Mark wrote:

Michal says it still hangs later on, but that it's a different issue

M A Young is probably testing Dom0, maybe yet another issue?

Dunno, that's why I said we need to re-test :-)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/12

------------------------------------------------------------------------
On 2009-08-21T10:40:19+00:00 Michal wrote:

I've described the other bug in http://lkml.org/lkml/2009/8/21/71

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/13

------------------------------------------------------------------------
On 2009-08-25T01:08:46+00:00 Chuck wrote:

Should be fixed in 2.6.31-0.173.rc7.git2, which has the two x86-tip
patches plus the framebuffer fix from LKML.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/14

------------------------------------------------------------------------
On 2009-08-25T14:30:24+00:00 Michal wrote:

2.6.31-0.173.rc7.git2 boots successfully under Xen on x86_64, but i686
still fails. Probably because load_percpu_segment(0); is under #ifdef
CONFIG_X86_64 in xen_start_kernel().

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/15

------------------------------------------------------------------------
On 2009-08-25T16:02:48+00:00 Kevin wrote:

Seems to work here under a x86_64 guest. Thanks.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/16

------------------------------------------------------------------------
On 2009-08-25T18:15:53+00:00 Jeremy wrote:

(In reply to comment #15)
> 2.6.31-0.173.rc7.git2 boots successfully under Xen on x86_64, but i686 still
> fails. Probably because load_percpu_segment(0); is under #ifdef CONFIG_X86_64
> in xen_start_kernel().  

32 bit is trickier because it needs a specifically set-up GDT entry and
its own segment register.  Doing this setup properly ends up calling
functions with stack-protector prologs which assume the segment register
is already set up.  I need to work out 1) how native does this setup,
and/or 2) refactor the segment register setup so that it can avoid
functions with stack-protector code.
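
To make the "specifically set-up GDT entry" concrete, here is a minimal
sketch of how a standard x86 segment descriptor packs a base and limit
into 8 bytes. The function names and the example values are hypothetical
and this is not the kernel's own code; only the bit layout is the
architectural one.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a legacy x86 segment descriptor (8-byte GDT entry).
 * base:   32-bit linear base address of the segment
 * limit:  20-bit segment limit
 * access: access byte (present bit, DPL, type)
 * flags:  upper four flag bits (granularity, default size) */
uint64_t gdt_pack(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;

    assert(limit <= 0xFFFFFu && flags <= 0xFu);

    d |= (uint64_t)(limit & 0xFFFFu);             /* limit[15:0]  -> bits 0-15  */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;      /* base[23:0]   -> bits 16-39 */
    d |= (uint64_t)access << 40;                  /* access byte  -> bits 40-47 */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48;  /* limit[19:16] -> bits 48-51 */
    d |= (uint64_t)(flags & 0xFu) << 52;          /* flags        -> bits 52-55 */
    d |= (uint64_t)(base >> 24) << 56;            /* base[31:24]  -> bits 56-63 */
    return d;
}

/* Recover the 32-bit base from a packed descriptor. */
uint32_t gdt_base(uint64_t d)
{
    return (uint32_t)((d >> 16) & 0xFFFFFFu) | ((uint32_t)(d >> 56) << 24);
}
```

Until an entry like this exists with a base pointing at mapped memory, and
its selector has been loaded into the segment register, a %gs-relative
canary load resolves against whatever base the register currently has.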

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/17

------------------------------------------------------------------------
On 2009-08-26T12:02:59+00:00 Alexander wrote:

*** Bug 519342 has been marked as a duplicate of this bug. ***

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/18

------------------------------------------------------------------------
On 2009-08-26T12:07:21+00:00 Alexander wrote:

FYI: 2.6.31-0.174.rc7.git2 still fails on i386. My dom0 is recent RHEL5
and domU is F12-Alpha

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/19

------------------------------------------------------------------------
On 2009-08-27T17:56:55+00:00 Pasi wrote:

I also tried the latest rawhide tree (2.6.31-0.174.rc7.git2.fc12.i686)
with virt-install on my F11 + Xen 3.4.1 + 2.6.31-rc6 pv_ops dom0 setup,
and it still crashes.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/22

------------------------------------------------------------------------
On 2009-08-28T02:55:21+00:00 Jeremy wrote:

What compiler are people using?  Using F11's gcc-4.4.1-2.fc11.x86_64, it
says:

/home/jeremy/git/linux/arch/x86/Makefile:80: stack protector enabled but
no compiler support

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/23

------------------------------------------------------------------------
On 2009-08-28T08:28:44+00:00 Michal wrote:

Jeremy,

Rawhide builds currently use gcc-4.4.1-6.x86_64. You can find this information 
in Koji build logs, e.g.: 
http://kojipkgs.fedoraproject.org/packages/kernel/2.6.31/0.185.rc7.git6.fc12/data/logs/x86_64/
root.log tells you the versions of the packages used in the build.
build.log has the build warnings. There was no such stack protector warning in 
this case.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/24

------------------------------------------------------------------------
On 2009-08-28T18:18:30+00:00 Jeremy wrote:

Where can I get this version of gcc?  "yum update --enablerepo=rawhide
gcc" doesn't get me anything more recent than gcc-4.4.1-2.fc11.x86_64.
Or does 32-bit stackprotector not work in the x86-64 version of the
compiler?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/25

------------------------------------------------------------------------
On 2009-08-29T16:29:23+00:00 Michal wrote:

> Where can I get this version of gcc?  "yum update --enablerepo=rawhide gcc"
> doesn't get me anything more recent than gcc-4.4.1-2.fc11.x86_64.

Works for me, yum can see the newer version. But the gcc from Rawhide
depends on newer glibc, so I do not recommend doing it.

> Or does 32-bit stackprotector not work in the x86-64 version of the
> compiler?

Bug in the stack protector detection for ARCH=i386 builds on x86_64.
I've sent a patch to LKML and CCed you.

Koji always builds packages using native arch toolchain, so it is not
affected.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/26

------------------------------------------------------------------------
On 2009-09-02T17:25:59+00:00 Jeremy wrote:

Created attachment 359557
Set up kernel GDT early to make -fstack-protector work under Xen

This patch should comprehensively fix -fstack-protector under Xen for
both 32 and 64-bit.  Please test.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/27

------------------------------------------------------------------------
On 2009-09-02T18:59:33+00:00 Pasi wrote:

Someone please add that bugfix patch to next rawhide kernel build so we
get people to test it..

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/28

------------------------------------------------------------------------
On 2009-09-03T18:05:07+00:00 Justin wrote:

The patch has been applied and should be available in the next rawhide
kernel build.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/29

------------------------------------------------------------------------
On 2009-09-05T18:52:23+00:00 Michal wrote:

2.6.31-0.203.rc8.git2.fc12 boots successfully as Xen domU. I've tested
both i686.PAE and x86_64.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/30

------------------------------------------------------------------------
On 2009-09-06T11:50:36+00:00 Pasi wrote:

Seems to boot now. virt-install started f12/rawhide Xen domU
installation OK, on F11 host with Xen 3.4.1-3 + pv_ops dom0 kernel +
libvirt from F11 updates testing.

Installation went fine, and the installed domU seems to have
2.6.31-0.203.rc8.git2.fc12.i686.PAE kernel running.

There's a traceback on domU dmesg though.. the domU still runs fine.

Write protecting the kernel text: 4352k
Write protecting the kernel read-only data: 1800k

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
---------------------------------------------
init/1 is trying to acquire lock:
 (&input_pool.lock){+.+...}, at: [<c043b30e>] __wake_up+0x2b/0x61

but task is already holding lock:
 (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

other info that might help us debug this:
2 locks held by init/1:
 #0:  (&p->cred_guard_mutex){+.+.+.}, at: [<c0508756>] do_execve+0xa4/0x2ee
 #1:  (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

stack backtrace:
Pid: 1, comm: init Not tainted 2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
Call Trace:
 [<c08387c0>] ? printk+0x22/0x3a
 [<c0478b59>] __lock_acquire+0x7e9/0xb25
 [<c0478f4c>] lock_acquire+0xb7/0xeb
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c083b4f7>] _spin_lock_irqsave+0x45/0x89
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] __wake_up+0x2b/0x61
 [<c068e2a0>] account+0xb5/0xf0
 [<c068e3ef>] extract_entropy+0x3e/0xac
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c068e56e>] get_random_bytes+0x29/0x3e
 [<c053bbd1>] load_elf_binary+0xab9/0x106c
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c053b118>] ? load_elf_binary+0x0/0x106c
 [<c0539c76>] load_script+0x1a6/0x1c8
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406199>] ? xen_force_evtchn_callback+0x1d/0x34
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406b14>] ? check_events+0x8/0xc
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c0539ad0>] ? load_script+0x0/0x1c8
 [<c050888b>] do_execve+0x1d9/0x2ee
 [<c0408359>] sys_execve+0x39/0x6e
 [<c0409ad0>] syscall_call+0x7/0xb
 [<c04f00d8>] ? sys_swapon+0x348/0xa98
 [<c040d76b>] ? kernel_execve+0x27/0x3e
 [<c04031e0>] ? run_init_process+0x2b/0x3e
 [<c0403275>] ? init_post+0x82/0xe9
 [<c0a9b566>] ? kernel_init+0x1f6/0x211
 [<c0a9b370>] ? kernel_init+0x0/0x211
 [<c040a6bf>] ? kernel_thread_helper+0x7/0x10

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/31

------------------------------------------------------------------------
On 2009-09-07T08:24:39+00:00 Chris wrote:

(In reply to comment #29)
> Seems to boot now. virt-install started f12/rawhide Xen domU installation OK,
> on F11 host with Xen 3.4.1-3 + pv_ops dom0 kernel + libvirt from F11 updates
> testing.
> 
> Installation went fine, and the installed domU seems to have
> 2.6.31-0.203.rc8.git2.fc12.i686.PAE kernel running.
> 
> There's a traceback on domU dmesg though.. the domU still runs fine.

It's probably worth looking through BZ quickly to see if a bug with that
trace exists already, and if not, to open a new bug about it.

Thanks for the testing,
Chris Lalancette

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/32

------------------------------------------------------------------------
On 2009-09-08T11:41:33+00:00 Pasi wrote:

OK, new bug opened: https://bugzilla.redhat.com/show_bug.cgi?id=521800

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/comments/33


** Changed in: linux (Fedora)
   Importance: Unknown => High

** Bug watch added: Red Hat Bugzilla #521800
   https://bugzilla.redhat.com/show_bug.cgi?id=521800

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/419315

Title:
   2.6.31-rc1 xen domU crashes early during boot

Status in linux package in Ubuntu:
  Incomplete
Status in linux package in Fedora:
  Fix Released

Bug description:
  Booting a karmic kernel under xen crashes very early in the boot
  process.

  This is a duplicate of Red Hat bug  508120
  (https://bugzilla.redhat.com/show_bug.cgi?id=508120).

  At the very least, our kernels at the moment do not have the two fixes
  mentioned there:

  http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=ce2eef33d3
  http://git.kernel.org/?p=linux/kernel/git/tip/linux-2.6-tip.git;a=commitdiff;h=5416c26635

  Without them the kernel will crash.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/419315/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
