hi - i'm about to set up my 1st RAID, and i'd appreciate it if any of you would volunteer some time to share your valuable experience on this subject.
my scenario
-----------
0. i don't boot from the RAID.
1. read is as important as write. i don't have any application-specific
   scenario that makes me favor one over the other, so RAIDs that speed up
   reads (or writes) while significantly harming writes (or reads) are not
   welcome.
2. replacing failed disks may take a week or two, so i guess that several
   disks may fail one after another within those 1-2 weeks (especially if
   they were bought at about the same time).
3. i would like to be able to grow the RAID's total space as needed, and to
   increase its reliability (i.e. duplicates/parities) as needed. e.g.
   suppose that i've got a 2TB RAID that tolerates 1 disk failure. i'd like,
   at some point, to have the following options:
   * only increase the total space (e.g. make it 3TB), without increasing
     the failure tolerance (so a 2-disk failure would still result in data
     loss).
   * or, only increase the failure tolerance (e.g. such that a 2-disk
     failure would not lead to data loss), without increasing the total
     space (i.e. it remains 2TB).
   * or, increase both the space and the failure tolerance at the same time.
4. only interested in software RAID.

my thought
----------
i think these are not suitable:
* RAID 0: fails to satisfy point (3).
* RAID 1: fails to satisfy points (1) and (3).
* RAIDs 4 to 6: fail to satisfy point (3), since they are stuck with a
  fixed tolerance towards failing disks (i.e. RAIDs 4 and 5 tolerate only
  1 disk failure, and RAID 6 tolerates only 2).

this leaves me with md's RAID 10, with the "near" layout. e.g. --layout=n2
keeps 2 copies of every chunk (so it should tolerate at least 1 failed
disk), --layout=n3 keeps 3 copies (at least 2 failed disks), etc. or is it?
(i'm not sure. a concrete create command is in the p.s. below.)

my questions
------------
Q1: which RAID setup would you recommend?

Q2: how does the total number of disks in a RAID10 setup affect the
tolerance towards failing disks? if the total number of disks is even, then
it is easy to see how this is equivalent to the classical RAID 1+0 as shown
in md(4), where any disk failure is tolerated as long as each RAID1 group
has at most 1 failed disk. so, with 4 disks, we get the following
combinations of disk failures that we would survive without losing any
data:

             RAID0
        ------^------
        RAID1   RAID1
        --^--   --^--
        F .     . .     < cases with
        . F     . .     < single disk
        . .     F .     < failures
        . .     . F     <
        F .     . F     < cases with
        . F     F .     < two disk
        . F     . F     < failures
        F .     F .     <

this gives us 4+4=8 possible disk-failure scenarios that we can survive
without any data loss. but when the number of disks is odd, the written
chunks and their copies start to wrap around, and it is difficult for me to
see intuitively how this affects the total number of survivable scenarios.
(a brute-force counting sketch is in the p.p.s. below.)

Q3: what are the future growth/shrinkage options for a RAID10 setup? e.g.
with respect to these:
1. read/write speed.
2. tolerance guarantee towards failing disks.
3. total available space.
(a grow sketch is in the p.p.p.s. below.)

rgrds,
cm.
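
p.s. to make "my thought" concrete, this is the kind of create command i
have in mind - a minimal sketch only, where /dev/md0 and /dev/sd[b-e] are
placeholder names for the array and 4 example disks:

    # RAID10, "near" layout, 2 copies of every chunk, across 4 disks
    mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=4 \
          /dev/sdb /dev/sdc /dev/sdd /dev/sde
    # watch the initial sync
    cat /proc/mdstat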
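
p.p.s. for Q2, here is a brute-force sketch that counts survivable failure
sets. it assumes my reading of md(4)'s near layout with 2 copies: the
copies of chunk c land on disks (2c) mod k and (2c+1) mod k. if that model
is wrong, so is the count:

    #!/bin/bash
    # count survivable failure sets for md RAID10, near layout, 2 copies,
    # assuming copies of chunk c sit on disks (2c) mod k and (2c+1) mod k
    k=5                        # total disks; k=4 reproduces the table above
    pairs=()
    for ((c=0; c<k; c++)); do  # every disk pair that mirrors some chunk
        pairs+=("$(( (2*c) % k )),$(( (2*c+1) % k ))")
    done
    survivable=0
    for ((mask=1; mask<(1<<k); mask++)); do  # every non-empty failure set
        ok=1
        for p in "${pairs[@]}"; do
            a=${p%,*} b=${p#*,}
            # data is lost iff both halves of some mirror pair have failed
            if (( (mask>>a & 1) && (mask>>b & 1) )); then ok=0; break; fi
        done
        (( survivable += ok ))
    done
    echo "disks=$k survivable failure sets=$survivable"

if i got the model right, k=4 prints 8 (matching the table above) and k=5
prints 10: the 5 single failures, plus the 5 two-disk failures where the
two dead disks never mirror each other.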
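
p.p.p.s. for Q3, the grow path i imagine looks like the sketch below. i'm
assuming a recent kernel and mdadm where RAID10 reshape is supported, an
ext4 filesystem on top, and /dev/sdf as a placeholder for the new disk -
md(4) and mdadm(8) for your versions are the authority here, not me:

    mdadm --add /dev/md0 /dev/sdf             # new disk joins as a spare
    mdadm --grow /dev/md0 --raid-devices=5    # reshape onto the extra disk
    cat /proc/mdstat                          # watch the reshape progress
    resize2fs /dev/md0                        # then grow the filesystem

increasing the failure tolerance instead (e.g. n2 -> n3) would be a layout
change, and i don't know whether --grow supports that in place - which is
partly why i'm asking Q3.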