Bill Rankin wrote:

Let me offer up a somewhat concrete example of a problem with hardware raid.

A local group around here kept some Very Important Data on a hardware raid array. Due to several factors, a backup was not made of certain data. The device lost a drive and started an automagic rebuild on one

Let me state the obvious here. And yes, I know I am likely "preaching to the choir."

RAID is not a backup solution. Again, RAID is not a backup solution. If you run without a backup, we can pretty much guarantee that you are going to lose data at some point in time. Again, RAID is not a backup solution.

I don't know if I mentioned it, but RAID is not a backup solution.

Anyone who believes otherwise is begging for trouble. RAID is not a backup solution.

Backing up your data is *ALWAYS* important, RAID or not, even if the backup is just a mirror of the data.

of the hot spares. The sudden beating that the other drives took (because of the rebuild) caused a second hard drive to fail (always a concern with RAID5).

[... anecdote elided ...]

RAID is not a backup solution. Anyone mistakenly using it as such *will* be burned.

Now while this is kind of a "perfect storm" in terms of hardware and data failure, it does illustrate the extent of control that you give up when going with a hardware raid solution. I think that the higher end

Er... with all due respect, this wasn't a hardware issue. This was a policy issue.

If your data is important, back it up. It doesn't matter if it is on a hardware or software RAID: you absolutely, positively must do a cost-benefit analysis of the value of the data and the time/effort/money it would cost to recover when (not if) something goes bump in the night.
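A back-of-the-envelope version of that cost-benefit analysis can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and every figure in the example call is a made-up placeholder that you would replace with your own estimates of loss probability and recovery cost.

```python
def expected_annual_loss(p_loss_per_year, recovery_cost):
    """Expected yearly cost of data loss: probability times cost to recover."""
    return p_loss_per_year * recovery_cost


def backup_net_benefit(p_loss_per_year, recovery_cost,
                       backup_cost_per_year, restore_cost):
    """Expected yearly saving from keeping a backup.

    A negative result means the backup costs more than it is expected
    to save under the assumptions you fed in.
    """
    without_backup = expected_annual_loss(p_loss_per_year, recovery_cost)
    with_backup = backup_cost_per_year + expected_annual_loss(
        p_loss_per_year, restore_cost
    )
    return without_backup - with_backup


if __name__ == "__main__":
    # All figures below are placeholders for illustration, not measurements.
    benefit = backup_net_benefit(
        p_loss_per_year=0.05,        # assumed chance of losing the array in a year
        recovery_cost=500_000,       # assumed cost to recreate the data from scratch
        backup_cost_per_year=5_000,  # assumed yearly cost of running backups
        restore_cost=2_000,          # assumed cost of restoring from a good backup
    )
    print(f"expected net benefit per year: {benefit:.2f}")
```

The model is deliberately crude (a single yearly probability, no partial losses, and it assumes the backup actually restores), but it is enough to force the numbers onto the table.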

RAID is not a backup solution.  Not sure I mentioned this.

All hardware has failure modes. All software has bugs. Your choice is which set of problems is easier to deal with. We have seen crappy hardware, and abominable software. Bugs in the Linux kernel (no, there couldn't be any, nah... impossible ...) could just as easily wreck your day as a misguided firmware/hardware bug.

Backups are a risk mitigation strategy. If you have important data, you need to back it up. Moreover, I argue that you need multiple modalities of backup/restore. Call this 20+ years of experience in losing data and thinking (naively) that the backup that I have will actually restore... properly.
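One way to stop assuming that a backup will restore properly is to do a test restore into a scratch directory and compare it against the original. The following is a minimal sketch of that check, assuming both trees are reachable as local directories; the function names are hypothetical and not part of any backup tool.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(original_root, restored_root):
    """Compare an original tree against a test restore.

    Returns (missing, corrupted): files absent from the restore, and
    files present in both trees whose contents differ.
    """
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    missing = sorted(set(original) - set(restored))
    corrupted = sorted(
        name for name in original
        if name in restored and original[name] != restored[name]
    )
    return missing, corrupted
```

Calling compare_restore("/data", "/scratch/restore-test") (both paths are placeholders) and getting two empty lists back is weak evidence that the restore path works; it says nothing about files that changed after the backup was taken, which will show up as differences.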

vendors (ie. NetApp, EMC, et al) have their reliability up to the point where this is much less of a risk. But for the low-end beer budget

Er... ah... ok. All of them have similar issues. I occasionally hear how vendor X's (make the appropriate substitution for X) item, such as a network card, or disk drive is *obviously* much better than what is available in the mass market, which is why they charge so much more for it. The last time a customer noted that about one of the above named vendors (network card as it turned out), I asked them to pull back the label on the card and see what was underneath it. Turns out it was a plain old mass market card with a (vendor X) label slapped on it. I am sorry to report that for the vast majority of cases of which I am aware, they (the above named vendors X) use generally the same mass market stuff you and I do.

Don't mistake this: EMC, NetApp and others *do* offer value. It just isn't in slapping a new label on something, charging 10x for it, and somehow convincing the people paying for it that it is magically special (that is, unless their label maker has some serious undocumented mojo in that label ...). Their value is in hyperactive support.

cluster, software raid is probably still the way to go. As for the "mid-tier" vendors, I would be very cautious and pay close attention to the worst case data loss scenario.

What we tell all our customers (aside from "RAID is not a backup solution") is that they want to minimize risk. Where is the risk? Well, you can trace it out. There are many ways to mitigate risk and reduce downtime. RAIN is a great example.

But you can build RAIN out of software RAID as easily as hardware RAID. Remember, all have bugs; your job is to figure out (or work with someone who does this for you) how to reduce the impact of potential bugs. RAID is not a backup, and if you run without one, well, ...


Good luck,

... yeah.


-bill

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615