Subject: linux-2.6: Kernel crash when Xen hot-unplug network device
Package: linux-2.6
Version: 2.6.39-bpo.2-686-pae
Severity: important

I encountered a kernel crash when using XenServer to hot-unplug a network device from a Debian PV guest. The bug occurs on both 2.6.39-bpo.2-686-pae and linux-image-3.2.0-0.bpo.1-686-pae. A similar bug, http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=544525, was reported as fixed, but the problem still appears to be present. The XenServer version is 5.6. I can confirm that 2.6.32-5-686-bigmem does not have this issue.

Trace attached:

root@r-4-TEST:~# [ 1021.224115] BUG: unable to handle kernel paging request at 06a69000
[ 1021.224128] IP: [<c121ec7a>] linkwatch_do_dev+0x96/0x9c
[ 1021.224140] *pdpt = 00000000066fe027 *pde = 0000000000000000
[ 1021.224150] Oops: 0002 [#1] SMP
[ 1021.224158] last sysfs file: /sys/devices/vif-2/net/eth2/broadcast
[ 1021.224165] Modules linked in: xt_comment xt_mark xt_CHECKSUM xt_connmark iptable_mangle xt_tcpudp xt_state iptable_filter iptable_nat ip_tables x_tables nf_nat_ftp nf_nat nf_conntrack_ftp deflate zlib_deflate ctr twofish_generic twofish_i586 twofish_common camellia serpent blowfish cast5 des_generic cbc aes_i586 aes_generic xcbc rmd160 sha512_generic sha256_generic sha1_generic hmac crypto_null af_key isofs xenfs nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_blkfront xen_netfront
[ 1021.224287]
[ 1021.224292] Pid: 17, comm: kworker/0:1 Not tainted 2.6.39-bpo.2-686-pae #1
[ 1021.224302] EIP: 0061:[<c121ec7a>] EFLAGS: 00010246 CPU: 0
[ 1021.224309] EIP is at linkwatch_do_dev+0x96/0x9c
[ 1021.224315] EAX: 00000000 EBX: c29c0000 ECX: c29c0000 EDX: c787df02
[ 1021.224323] ESI: c787df5c EDI: c787df5c EBP: c7860550 ESP: c787df4c
[ 1021.224329] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[ 1021.224336] Process kworker/0:1 (pid: 17, ti=c787c000 task=c786d140 task.ti=c787c000)
[ 1021.224343] Stack:
[ 1021.224347] 00000002 00000000 c121ef36 c787df5c c787df5c c787df5c c1412c88 c7860540
[ 1021.224368] c7ef2d00 c7860550 c121ef95 c10483c4 c7ef8000 00000000 c121ef7d c7ef8005
[ 1021.224389] c7860540 c7ef2d00 c7ef2d00 c7860550 c1049d50 c148d700 c2801b28 c7ef2d0
[ 1021.224410] Call Trace:
[ 1021.224417] [<c121ef36>] ? __linkwatch_run_queue+0x130/0x177
[ 1021.224425] [<c121ef95>] ? linkwatch_event+0x18/0x1d
[ 1021.224434] [<c10483c4>] ? process_one_work+0x181/0x25e
[ 1021.224442] [<c121ef7d>] ? __linkwatch_run_queue+0x177/0x177
[ 1021.224450] [<c1049d50>] ? worker_thread+0xf3/0x1f4
[ 1021.224458] [<c1049c5d>] ? manage_workers+0x15a/0x15a
[ 1021.224465] [<c104c520>] ? kthread+0x63/0x68
[ 1021.224472] [<c104c4bd>] ? kthread_worker_fn+0x114/0x114
[ 1021.224481] [<c12b7c3e>] ? kernel_thread_helper+0x6/0x10
[ 1021.224486] Code: f0 00 00 00 01 74 1e 8b 43 48 a8 04 75 09 89 d8 e8 9d 7b 00 00 eb 07 89 d8 e8 1b 77 00 00 89 d8 e8 d9 4b ff ff 8b 83 58 02 00 00
[ 1021.224569] ff 08 58 5b c3 57 56 31 f6 53 89 c3 b8 54 b2 51 c1 e8 5f 32
[ 1021.224611] EIP: [<c121ec7a>] linkwatch_do_dev+0x96/0x9c SS:ESP 0069:c787df4c
[ 1021.224623] CR2: 0000000006a69000
[ 1021.224631] ---[ end trace 2297bb3830396d7d ]---
[ 1021.224658] BUG: unable to handle kernel paging request at fffffffc
[ 1021.224668] IP: [<c104c246>] kthread_data+0x6/0xa
[ 1021.224676] *pdpt = 00000000028cd027 *pde = 0000000000000000
[ 1021.224686] Oops: 0000 [#2] SMP
[ 1021.224694] last sysfs file: /sys/devices/vif-2/net/eth2/broadcast
[ 1021.224700] Modules linked in: xt_comment xt_mark xt_CHECKSUM xt_connmark iptable_mangle xt_tcpudp xt_state iptable_filter iptable_nat ip_tables x_tables nf_nat_ftp nf_nat nf_conntrack_ftp deflate zlib_deflate ctr twofish_generic twofish_i586 twofish_common camellia serpent blowfish cast5 des_generic cbc aes_i586 aes_generic xcbc rmd160 sha512_generic sha256_generic sha1_generic hmac crypto_null af_key isofs xenfs nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack evdev snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache xen_blkfront xen_netfront
[ 1021.224822]
[ 1021.224827] Pid: 17, comm: kworker/0:1 Tainted: G D 2.6.39-bpo.2-686-pae #1
[ 1021.224838] EIP: 0061:[<c104c246>] EFLAGS: 00010002 CPU: 0
[ 1021.224844] EIP is at kthread_data+0x6/0xa
[ 1021.224850] EAX: 00000000 EBX: 00000000 ECX: c786d2e4 EDX: 00000000
[ 1021.224856] ESI: 00000000 EDI: 00000000 EBP: c786d140 ESP: c787dd90
[ 1021.228008] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[ 1021.228008] Process kworker/0:1 (pid: 17, ti=c787c000 task=c786d140 task.ti=c787c000)
[ 1021.228008] Stack:
[ 1021.228008] c10498fe c7ef61c0 00000000 c12b09cf c7819980 c10da73a c787ddeb 00000000
[ 1021.228008] c782cfc0 c148d1c0 c786d2e4 0000000d c148d1c0 c786d140 c148d1c0 c148d1c0
[ 1021.228008] c1005d14 c1005d14 c786d510 c7ef30d0 c13f4f00 c1006364 0002c055 c13f4f00
[ 1021.228008] Call Trace:
[ 1021.228008] [<c10498fe>] ? wq_worker_sleeping+0x9/0x6d
[ 1021.228008] [<c12b09cf>] ? schedule+0x105/0x5be
[ 1021.228008] [<c10da73a>] ? d_lookup+0x1f/0x2f
[ 1021.228008] [<c1005d14>] ? xen_force_evtchn_callback+0xc/0x10
[ 1021.228008] [<c1005d14>] ? xen_force_evtchn_callback+0xc/0x10
[ 1021.228008] [<c1006364>] ? check_events+0x8/0xc
[ 1021.228008] [<c100635b>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 1021.228008] [<c1079e75>] ? __call_rcu+0xd6/0xdb
[ 1021.228008] [<c1038dfa>] ? release_task+0x375/0x384
[ 1021.228008] [<c103a34a>] ? do_exit+0x655/0x663
[ 1021.228008] [<c12b36e5>] ? oops_end+0x95/0x99
[ 1021.228008] [<c1022de5>] ? no_context+0x13b/0x144
[ 1021.228008] [<c1022ef2>] ? bad_area_nosemaphore+0xa/0xc
[ 1021.228008] [<c12b4e7e>] ? do_page_fault+0x1b0/0x361
[ 1021.228008] [<c100635b>] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 1021.228008] [<c100429f>] ? xen_mc_flush+0x11d/0x13b
[ 1021.228008] [<c1003889>] ? xen_end_context_switch+0xb/0x14
[ 1021.228008] [<c1007c32>] ? __switch_to+0xc4/0xe2
[ 1021.228008] [<c102c030>] ? dequeue_task+0x97/0xa6
[ 1021.228008] [<c12b4cce>] ? spurious_fault+0xfc/0xfc
[ 1021.228008] [<c12b2def>] ? error_code+0x67/0x6c
[ 1021.228008] [<c121ec7a>] ? linkwatch_do_dev+0x96/0x9c
[ 1021.228008] [<c121ef36>] ? __linkwatch_run_queue+0x130/0x177
[ 1021.228008] [<c121ef95>] ? linkwatch_event+0x18/0x1d
[ 1021.228008] [<c10483c4>] ? process_one_work+0x181/0x25e
[ 1021.228008] [<c121ef7d>] ? __linkwatch_run_queue+0x177/0x177
[ 1021.228008] [<c1049d50>] ? worker_thread+0xf3/0x1f4
[ 1021.228008] [<c1049c5d>] ? manage_workers+0x15a/0x15a
[ 1021.228008] [<c104c520>] ? kthread+0x63/0x68
[ 1021.228008] [<c104c4bd>] ? kthread_worker_fn+0x114/0x114
[ 1021.228008] [<c12b7c3e>] ? kernel_thread_helper+0x6/0x10
[ 1021.228008] Code: 01 46 10 8b 14 24 8d 43 08 e8 0d 5d 26 00 59 5b 5e 5f 5d c3 90 64 a1 84 81 48 c1 8b 80 78 01 00 00 8b 40 f8 c3 8b 80 78 01 00 00 <8b> 40 fc c3 31 c0 c3 8d 50 04 c7 00 00 00 00 00 89 50 04 89 50
[ 1021.228008] EIP: [<c104c246>] kthread_data+0x6/0xa SS:ESP 0069:c787dd90
[ 1021.228008] CR2: 00000000fffffffc
[ 1021.228008] ---[ end trace 2297bb3830396d7e ]---
[ 1021.228008] Fixing recursive fault but reboot is needed!

-- System Information:
Debian Release: 6.0.3
  APT prefers stable
  APT policy: (990, 'stable'), (400, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.39-bpo.2-686-pae (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)
Shell: /bin/sh linked to /bin/dash