Package: mdadm
Version: 3.1.2-2
Severity: wishlist
Tags: upstream

Hi,
Especially in the case of RAID5 arrays, it would often be life-saving to be able to activate a hot-spare and prepare to replace a live drive with it, without marking that drive as failed first.

Consider the following scenario. Let's say we have a RAID5 array composed of sdb, sdc and sdd, with sde added as a spare (i.e. 3 active drives). sdc starts to noticeably fail. Unknown to the user, sdd has also developed a bad sector.

The user marks sdc as failed and waits for sde to be synced; however, during the resync, the system hits the bad sector on sdd, causing sdd to also be marked as failed, the resync to fail and the array to become unusable. (The same can happen if an intermittent bit error occurs during the resync operation.)

The algorithm I'd like to see implemented would work as follows:

 1. sdc starts to noticeably fail. The user marks it for replacement.
 2. sde is activated and the system copies everything from sdc to sde, using the redundancy provided by the other drives if/when necessary.
 3. Temporarily, while this operation is in progress, sdc and sde are both active and in the same slot; any writes that hit the array get committed to both.
 4. When sde is completely up to date, sdc gets deactivated and marked as failed.

The bad sector on sdd doesn't compromise our ability to sync the hot-spare. At this point, another spare could be added, sdd marked for replacement, and so on.

I realise this also requires changes to the kernel. Apologies if it's already planned; I haven't seen it discussed anywhere.

Best regards,
Andras

-- 
Andras Korn <korn at elan.rulez.org> - <http://chardonnay.math.bme.hu/~korn/>
All that glitters may not be gold, but it sure has a high refractive index.
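To illustrate the difference between the two approaches described in the report, here is a minimal toy sketch in Python. It is not mdadm or kernel code. It assumes a simplified three-drive model with one XOR parity block per stripe, parity always stored on sdd (real RAID5 rotates parity), and an unreadable sector represented as `None`. All function names (`make_array`, `reconstruct`, `rebuild_after_fail`, `hot_replace`) are hypothetical, and the mirroring of concurrent writes to both sdc and sde is left out.

```python
from functools import reduce
from operator import xor

BAD = None  # marks an unreadable sector in this toy model


class UnrecoverableStripe(Exception):
    """Raised when a block can be neither read nor reconstructed."""


def make_array(data_rows):
    """Build three toy drives from rows of two data blocks plus XOR parity.

    Simplification: parity always lives on sdd instead of being rotated.
    """
    drives = {"sdb": [], "sdc": [], "sdd": []}
    for a, b in data_rows:
        drives["sdb"].append(a)
        drives["sdc"].append(b)
        drives["sdd"].append(a ^ b)
    return drives


def reconstruct(drives, target, stripe):
    """Recompute target's block in one stripe by XOR-ing the other members."""
    others = [blocks[stripe] for name, blocks in drives.items() if name != target]
    if any(block is BAD for block in others):
        raise UnrecoverableStripe(f"stripe {stripe}: a peer of {target} is unreadable")
    return reduce(xor, others)


def rebuild_after_fail(drives, failed):
    """Fail-first resync: the failed drive is never read, so every stripe
    needs all remaining drives to be readable."""
    return [reconstruct(drives, failed, s) for s in range(len(drives[failed]))]


def hot_replace(drives, old):
    """Proposed behaviour: copy from the old drive and fall back to parity
    only for the sectors the old drive cannot supply."""
    spare = []
    for stripe, block in enumerate(drives[old]):
        if block is BAD:
            block = reconstruct(drives, old, stripe)
        spare.append(block)
    return spare


if __name__ == "__main__":
    array = make_array([(1, 2), (3, 4), (5, 6), (7, 8)])
    array["sdc"][1] = BAD  # sdc is noticeably failing
    array["sdd"][3] = BAD  # sdd has an unnoticed bad sector

    print("hot replace onto sde:", hot_replace(array, "sdc"))
    try:
        rebuild_after_fail(array, "sdc")
    except UnrecoverableStripe as err:
        print("fail-first resync aborted:", err)
```

In this model the fail-first resync aborts at the stripe where sdd is unreadable, while the copy-based replacement completes because sdc can still supply that stripe and the other two drives can supply the one sector sdc has lost. A stripe that is unreadable on both sdc and sdd would still be unrecoverable under either approach.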