Same problem here. The host freezes every month or under heavy load. KVM is running on a software RAID. It happens every month because there is a monthly cron job that checks the software RAID: /usr/share/mdadm/checkarray. Checking the partition that holds the KVM images triggers the bug. The KVM guest processes then take 100% CPU, and only a reboot stops it:
syslog:
May 24 06:02:47 localhost kernel: [1682547.453843] INFO: task kdmflush:465 blocked for more than 120 seconds.
May 24 06:02:47 localhost kernel: [1682547.453845] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 06:02:47 localhost kernel: [1682547.453848] kdmflush D 00000000ffffffff 0 465 2 0x00000000
May 24 06:02:47 localhost kernel: [1682547.453852] ffff88032e0b99d0 0000000000000046 0000000000015dc0 0000000000015dc0
May 24 06:02:47 localhost kernel: [1682547.453856] ffff88032ec8dfd0 ffff88032e0b9fd8 0000000000015dc0 ffff88032ec8dc00
May 24 06:02:47 localhost kernel: [1682547.453860] 0000000000015dc0 ffff88032e0b9fd8 0000000000015dc0 ffff88032ec8dfd0
May 24 06:02:47 localhost kernel: [1682547.453864] Call Trace:
May 24 06:02:47 localhost kernel: [1682547.453879] [<ffffffffa0074685>] wait_barrier+0xf5/0x140 [raid1]
May 24 06:02:47 localhost kernel: [1682547.453885] [<ffffffff8105ded0>] ? default_wake_function+0x0/0x20
May 24 06:02:47 localhost kernel: [1682547.453890] [<ffffffffa0077651>] make_request+0x51/0x750 [raid1]
May 24 06:02:47 localhost kernel: [1682547.453894] [<ffffffff81064304>] ? check_preempt_wakeup+0x1c4/0x3c0
May 24 06:02:47 localhost kernel: [1682547.453897] [<ffffffff8105f10b>] ? enqueue_task_fair+0x9b/0xa0
May 24 06:02:47 localhost kernel: [1682547.453902] [<ffffffff8142b6b0>] md_make_request+0xc0/0x130
May 24 06:02:47 localhost kernel: [1682547.453907] [<ffffffff812a1d01>] generic_make_request+0x1b1/0x4f0
May 24 06:02:47 localhost kernel: [1682547.453911] [<ffffffff810f8475>] ? mempool_alloc_slab+0x15/0x20
May 24 06:02:47 localhost kernel: [1682547.453915] [<ffffffff810f860d>] ? mempool_alloc+0x5d/0x130
May 24 06:02:47 localhost kernel: [1682547.453919] [<ffffffff814382ad>] __map_bio+0xad/0x130
May 24 06:02:47 localhost kernel: [1682547.453922] [<ffffffff814387dd>] __clone_and_map+0x4ad/0x4c0
May 24 06:02:47 localhost kernel: [1682547.453925] [<ffffffff810f860d>] ? mempool_alloc+0x5d/0x130
May 24 06:02:47 localhost kernel: [1682547.453929] [<ffffffff814398b8>] __split_and_process_bio+0x108/0x190
May 24 06:02:47 localhost kernel: [1682547.453932] [<ffffffff81439996>] dm_flush+0x56/0x70
May 24 06:02:47 localhost kernel: [1682547.453935] [<ffffffff814399fc>] dm_wq_work+0x4c/0x1c0
May 24 06:02:47 localhost kernel: [1682547.453938] [<ffffffff814399b0>] ? dm_wq_work+0x0/0x1c0
May 24 06:02:47 localhost kernel: [1682547.453942] [<ffffffff81081457>] run_workqueue+0xc7/0x1a0
May 24 06:02:47 localhost kernel: [1682547.453946] [<ffffffff810815d3>] worker_thread+0xa3/0x110
May 24 06:02:47 localhost kernel: [1682547.453950] [<ffffffff81085ff0>] ? autoremove_wake_function+0x0/0x40
May 24 06:02:47 localhost kernel: [1682547.453954] [<ffffffff81081530>] ? worker_thread+0x0/0x110
May 24 06:02:47 localhost kernel: [1682547.453957] [<ffffffff81085c76>] kthread+0x96/0xa0
May 24 06:02:47 localhost kernel: [1682547.453961] [<ffffffff810141ea>] child_rip+0xa/0x20
May 24 06:02:47 localhost kernel: [1682547.453964] [<ffffffff81085be0>] ? kthread+0x0/0xa0
May 24 06:02:47 localhost kernel: [1682547.453967] [<ffffffff810141e0>] ? child_rip+0x0/0x20
May 24 06:02:47 localhost kernel: [1682547.453971] INFO: task jbd2/dm-0-8:610 blocked for more than 120 seconds.
May 24 06:02:47 localhost kernel: [1682547.453973] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 06:02:47 localhost kernel: [1682547.453975] jbd2/dm-0-8 D 00000000ffffffff 0 610 2 0x00000000
May 24 06:02:47 localhost kernel: [1682547.453979] ffff880325db1d20 0000000000000046 0000000000015dc0 0000000000015dc0
May 24 06:02:47 localhost kernel: [1682547.453983] ffff8803265703d0 ffff880325db1fd8 0000000000015dc0 ffff880326570000
May 24 06:02:47 localhost kernel: [1682547.453986] 0000000000015dc0 ffff880325db1fd8 0000000000015dc0 ffff8803265703d0
May 24 06:02:47 localhost kernel: [1682547.453990] Call Trace:
May 24 06:02:47 localhost kernel: [1682547.453995] [<ffffffff8121e741>] jbd2_journal_commit_transaction+0x1c1/0x1280
May 24 06:02:47 localhost kernel: [1682547.453999] [<ffffffff81077bbc>] ? lock_timer_base+0x3c/0x70
May 24 06:02:47 localhost kernel: [1682547.454002] [<ffffffff81085ff0>] ? autoremove_wake_function+0x0/0x40
May 24 06:02:47 localhost kernel: [1682547.454006] [<ffffffff81225d7d>] kjournald2+0xbd/0x220
May 24 06:02:47 localhost kernel: [1682547.454010] [<ffffffff81085ff0>] ? autoremove_wake_function+0x0/0x40
May 24 06:02:47 localhost kernel: [1682547.454013] [<ffffffff81225cc0>] ? kjournald2+0x0/0x220
May 24 06:02:47 localhost kernel: [1682547.454016] [<ffffffff81085c76>] kthread+0x96/0xa0
May 24 06:02:47 localhost kernel: [1682547.454019] [<ffffffff810141ea>] child_rip+0xa/0x20
May 24 06:02:47 localhost kernel: [1682547.454023] [<ffffffff81085be0>] ? kthread+0x0/0xa0
May 24 06:02:47 localhost kernel: [1682547.454025] [<ffffffff810141e0>] ? child_rip+0x0/0x20

I think it is a more general bug, maybe hardware related? If you search the bug database for "blocked for more than 120 seconds", several bugs with similar descriptions show up. It looks like it happens with several kernels.
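
For anyone who wants to compare their own logs against those reports, here is a minimal sketch (not something I run in production) that pulls the "blocked for more than N seconds" reports out of a syslog and counts them per task. The function names hung_tasks and count_hung_tasks are made up for this example, and the pattern assumes the message format shown in the syslog above.

```python
import re
from collections import Counter

# Matches the kernel hung-task report as it appears in the syslog above, e.g.
# "INFO: task kdmflush:465 blocked for more than 120 seconds."
# The task name may itself contain '/' or ':' (e.g. "jbd2/dm-0-8"), so it is
# matched lazily up to the last ":<pid>" that precedes " blocked".
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(lines):
    """Yield (task_name, pid, seconds) for each hung-task report in lines."""
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            yield match.group("name"), int(match.group("pid")), int(match.group("secs"))


def count_hung_tasks(lines):
    """Return a Counter mapping task name to the number of reports seen."""
    return Counter(name for name, _pid, _secs in hung_tasks(lines))
```

To use it, open the log file (on my system that is /var/log/syslog) and pass the file object to count_hung_tasks. If the reports cluster around the time of the monthly checkarray run, that would support the RAID check being the trigger.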
Some short info about my system:

lsb_release -rd
Description: Ubuntu 10.04.2 LTS
Release: 10.04

~ # uname -a
Linux tsu 2.6.32-31-server #61-Ubuntu SMP Fri Apr 8 19:44:42 UTC 2011 x86_64 GNU/Linux

Thanks,
Fabian

** Attachment added: "syslog_bug.txt"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/745785/+attachment/2143342/+files/syslog_bug.txt

--
https://bugs.launchpad.net/bugs/745785
Title: KVM HOST: Freeze every ~2 weeks