You have been subscribed to a public bug:

After running stress tests for 30 hours, an Ubuntu 16.10 KVM guest crashed
and dropped into xmon.

Guest Build:
--------------
4.8.0-16-generic

Tests started on guest:
--------------
BASE: LTP base tests.
IO: admndisk, aio, fstest tests (on a btrfs file system over 6 partitions of 2
disks).
TCP: TCP commands: telnet, ssh, rlogin, ping, etc.

XMON traces:
------------------
4:mon> t
[c00000017ffcf800] c00000000024385c end_page_writeback+0x7c/0x120
[c00000017ffcf830] d0000000022d8298 ext4_finish_bio+0x1f0/0x2e0 [ext4]
[c00000017ffcf910] d0000000022d8928 ext4_end_bio+0x70/0x170 [ext4]
[c00000017ffcf9a0] c0000000004c96cc bio_endio+0xfc/0x120
[c00000017ffcf9d0] c0000000004d5f50 blk_update_request+0xf0/0x4d0
[c00000017ffcfa60] c0000000006df2dc scsi_end_request+0x6c/0x260
[c00000017ffcfad0] c0000000006e32a4 scsi_io_completion+0x2d4/0x740
[c00000017ffcfba0] c0000000006d6714 scsi_finish_command+0x144/0x200
[c00000017ffcfc20] c0000000006e25a8 scsi_softirq_done+0x198/0x200
[c00000017ffcfca0] c0000000004e2e98 __blk_mq_complete_request_remote+0x38/0x50
[c00000017ffcfcd0] c000000000183e80 flush_smp_call_function_queue+0xd0/0x220
[c00000017ffcfd50] c000000000047aac smp_ipi_demux+0xac/0x110
[c00000017ffcfd90] c0000000000738e4 icp_hv_ipi_action+0x64/0xd0
[c00000017ffcfe00] c0000000001466d0 __handle_irq_event_percpu+0x90/0x340
[c00000017ffcfec0] c0000000001469bc handle_irq_event_percpu+0x3c/0x90
[c00000017ffcff00] c00000000014ced4 handle_percpu_irq+0x84/0xd0
[c00000017ffcff30] c000000000145664 generic_handle_irq+0x54/0x80
[c00000017ffcff60] c000000000015b20 __do_irq+0x80/0x230
[c00000017ffcff90] c00000000002a2e0 call_do_irq+0x14/0x24
[c00000013073b210] c000000000015d68 do_IRQ+0x98/0x140
[c00000013073b260] c0000000000026d8 hardware_interrupt_common+0x158/0x180
--- Exception: 501 (Hardware Interrupt) at c00000000008fe4c plpar_hcall_norets+0x1c/0x28
[link register   ] c00000000006c094 __spin_yield+0xa4/0xb0
[c00000013073b550] c00000017fe28b00 (unreliable)
[c00000013073b5c0] c000000000949758 _raw_spin_lock_irqsave+0x128/0x130
[c00000013073b600] d0000000017222cc ibmvscsi_queuecommand+0x54/0x4b0 [ibmvscsi]
[c00000013073b6b0] c0000000006dfc80 scsi_dispatch_cmd+0x140/0x370
[c00000013073b730] c0000000006e1ad0 scsi_queue_rq+0x770/0x920
[c00000013073b800] c0000000004e62f4 __blk_mq_run_hw_queue+0x2e4/0x570
[c00000013073b910] c0000000004e5fc8 blk_mq_run_hw_queue+0xf8/0x140
[c00000013073b940] c0000000004e8f90 blk_mq_flush_plug_list+0x160/0x1b0
[c00000013073b9c0] c0000000004d7fbc blk_flush_plug_list+0xfc/0x2b0
[c00000013073ba30] c0000000004d8708 blk_finish_plug+0x58/0x80
[c00000013073ba60] d0000000022d270c ext4_writepages+0x6c4/0xe60 [ext4]
[c00000013073bbf0] c00000000025ae80 do_writepages+0x60/0xc0
[c00000013073bc20] c000000000246c18 __filemap_fdatawrite_range+0x108/0x190
[c00000013073bcc0] c000000000246f20 filemap_write_and_wait_range+0x70/0xf0
[c00000013073bd00] d0000000022c5944 ext4_sync_file+0x24c/0x5a0 [ext4]
[c00000013073bd60] c000000000365a28 vfs_fsync_range+0x78/0x130
[c00000013073bdb0] c000000000365b90 do_fsync+0x60/0xb0
[c00000013073be00] c000000000366000 SyS_fsync+0x30/0x50
[c00000013073be30] c0000000000095e0 system_call+0x38/0x108
--- Exception: c00 (System Call) at 00003fff7b26cc98
SP (3fffc42b5280) is in userspace
4:mon> e
cpu 0x4: Vector: 300 (Data Access) at [c00000017ffcf520]
    pc: c00000000025b4ec: test_clear_page_writeback+0x1ec/0x300
    lr: c00000000025b4c0: test_clear_page_writeback+0x1c0/0x300
    sp: c00000017ffcf7a0
   msr: 8000000000009033
   dar: 2d0
 dsisr: 40000000
  current = 0xc000000036f59880
  paca    = 0xc000000007b82400   softe: 0        irq_happened: 0x09
    pid   = 1102, comm = create_datafile
Linux version 4.8.0-16-generic (buildd@bos01-ppc64el-007) (gcc version 6.2.0 20160914 (Ubuntu 6.2.0-3ubuntu15) ) #17-Ubuntu SMP Thu Sep 22 22:45:44 UTC 2016 (Ubuntu 4.8.0-16.17-generic 4.8.0-rc7)
4:mon>
4:mon>
4:mon> r
R00 = c00000000025b4c0   R16 = 0000000005500000
R01 = c00000017ffcf7a0   R17 = 7fffffffffffffff
R02 = c0000000010af400   R18 = 0000000000000000
R03 = 0000000000000000   R19 = c000000172e35b00
R04 = 0000000000000000   R20 = c000000035f831b0
R05 = ffffffffffffffe0   R21 = 0000000000010000
R06 = fffffffffffffffe   R22 = 0000000000000002
R07 = fffffffff8000000   R23 = 0000000004000000
R08 = 000000017f180000   R24 = 0000000000000000
R09 = 0000000000000000   R25 = c0000000010f00c4
R10 = 000000017f180000   R26 = 0000000000000000
R11 = 0000000000000230   R27 = c000000072873518
R12 = 0000000000000001   R28 = c0000000037fcdf8
R13 = c000000007b82400   R29 = 0000000000000001
R14 = 0000000000000000   R30 = c000000072873500
R15 = 0000000000000000   R31 = f0000000002423c0
pc  = c00000000025b4ec test_clear_page_writeback+0x1ec/0x300
cfar= c0000000009497c4 _raw_spin_unlock_irqrestore+0x64/0xb0
lr  = c00000000025b4c0 test_clear_page_writeback+0x1c0/0x300
msr = 8000000000009033   cr  = 28022228
ctr = c0000000002437e0   xer = 0000000020000000   trap =  300
dar = 00000000000002d0   dsisr = 40000000
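
For anyone comparing this trace against the other guest crashes, the xmon "t" output above can be reduced to a list of frames mechanically. The following is only a rough sketch: the function names (parse_frames, modules_seen) and the regular expression are made up for illustration and assume the frame layout seen in this report, "[stack pointer] address symbol+0xoffset/0xsize [module]". Lines that do not fit that layout, such as the "(unreliable)" entry or the exception markers, are skipped.

```python
import re

# Assumed frame layout, taken from the xmon "t" output in this report:
#   [c00000017ffcf830] d0000000022d8298 ext4_finish_bio+0x1f0/0x2e0 [ext4]
# The trailing "[module]" part is optional.
FRAME_RE = re.compile(
    r"^\[(?P<sp>[0-9a-f]{16})\]\s+"
    r"(?P<addr>[0-9a-f]{16})\s+"
    r"(?P<sym>[A-Za-z_.][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?\s*$"
)


def parse_frames(text):
    """Return one dict per backtrace line that matches the assumed layout."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None:
            # Not a symbolized frame (prompt, exception marker, "(unreliable)").
            continue
        frames.append({
            "sp": int(m["sp"], 16),
            "addr": int(m["addr"], 16),
            "symbol": m["sym"],
            "offset": int(m["off"], 16),
            "size": int(m["size"], 16),
            "module": m["module"],  # None for built-in kernel symbols
        })
    return frames


def modules_seen(frames):
    """List module names in order of first appearance in the trace."""
    seen = []
    for frame in frames:
        if frame["module"] and frame["module"] not in seen:
            seen.append(frame["module"])
    return seen


# A few lines copied from the trace above.
SAMPLE = """\
[c00000017ffcf800] c00000000024385c end_page_writeback+0x7c/0x120
[c00000017ffcf830] d0000000022d8298 ext4_finish_bio+0x1f0/0x2e0 [ext4]
[c00000013073b550] c00000017fe28b00 (unreliable)
[c00000013073b600] d0000000017222cc ibmvscsi_queuecommand+0x54/0x4b0 [ibmvscsi]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["symbol"], hex(frame["offset"]), frame["module"] or "-")
    print(modules_seen(parse_frames(SAMPLE)))
```

Run over the full trace above, modules_seen would list ext4 and ibmvscsi, the only module-tagged frames in this backtrace.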

The host kernel was updated to 4.8.0-17-generic because three other guests
were also crashing (it is not clear whether with the same failure), and a
recreate is again in progress, so I am sending this over to Canonical as a
heads-up. I looked through the upstream ext4 and page I/O commits but did
not spot anything that could be related.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Taco Screen team (taco-screen-team)
         Status: New


** Tags: architecture-ppc64le bugnameltc-146916 severity-high 
targetmilestone-inin1610
-- 
ISST-LTE:Ubuntu1610: UbuntuKVM 16.10 guest crashed after 30 hours of stress 
testing
https://bugs.launchpad.net/bugs/1628988
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to linux in Ubuntu.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
