SRU justification:

Impact: mdadm RAID 5 arrays get stuck in uninterruptible sleep under heavy I/O load. Copying data to a RAID 5 XFS partition leaves several of the processes involved permanently blocked in the D(+) state. This occurs when large quantities of data (10-40 GB) are copied. The blocked processes cannot be killed and the system cannot reboot, so the server has to be power cycled.
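
For anyone trying to confirm the symptom, here is a minimal sketch of how the blocked processes could be listed by reading the state field from /proc/<pid>/stat. This is an illustration only and not part of the patch or of the original report; the helper names parse_state and find_d_state are hypothetical, and it assumes a Linux /proc layout.

```python
import os


def parse_state(stat_line):
    """Return (pid, comm, state) parsed from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST closing parenthesis.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    if not os.path.isdir(proc_root):
        return []
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_state(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the line was unparsable
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in find_d_state():
        print(pid, comm)
```

A process that shows up in this list briefly is normal under I/O load; the bug described here is the case where the same processes stay in the list indefinitely.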

Fix: The patch from commit 6ed3003c19a96fe18edf8179c4be6fe14abbebbc. Quoting the commit message: The fix is to not make any generic_make_request() calls in raid5 make_request until all waiting has been done. We do this by simply setting STRIPE_HANDLE instead of calling handle_stripe(). This causes a performance hit, so this patch also only calls raid5_activate_delayed() at unplug time, never in raid5. This seems to bring back the performance numbers.

Testing: Without the patch, RAID 5 using md on an XFS filesystem locks up under heavy data copying; this is repeatable. With the patch, the lockup does not occur. The patch was tested from my PPA build by Andrew Cholakian (see previous message).

** Changed in: linux (Ubuntu Hardy)
       Status: Won't Fix => Fix Committed
       Target: None => ubuntu-8.04.2

** Summary changed:

- mdadm, Raid5 and XFS stuck in uninterruptable sleep
+ mdadm with Raid5 stuck in uninterruptable sleep

-- 
mdadm with Raid5 stuck in uninterruptable sleep
https://bugs.launchpad.net/bugs/208551
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs