On Wed, May 28, 2014 at 11:26 AM, Bob Sanders <rsand...@sgi.com> wrote:
> Marc Joliet, mused, then expounded:
>> On Tue, 27 May 2014 15:39:38 -0700
>> Bob Sanders <rsand...@sgi.com> wrote:
>>
>> While I am far from a filesystem/storage expert (I see myself as a mere user),
>> the cited threads lead me to believe that this is most likely an
>> overhyped/misunderstood class of errors (e.g., posts [1] and [2]), so I would
>> suggest reading them in their entirety.
>>
>> [0] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31832
>> [1] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31871
>> [2] http://permalink.gmane.org/gmane.comp.file-systems.btrfs/31877
>> [3] http://comments.gmane.org/gmane.comp.file-systems.btrfs/31821
>>
>
> FWIW - here's the FreeNAS ZFS ECC discussion on what happens with a bad
> memory bit and no ECC memory:
>
> http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/
>

I don't think that anybody debates that if you use btrfs/zfs with
non-ECC RAM you can potentially lose some of the protection afforded
by the checksumming.

What I'd question is whether this concern is unique to btrfs/zfs.
I'd think the same failure modes would apply to any other filesystem.

So, the message should be that ECC RAM is better than non-ECC RAM, not
that those who use non-ECC RAM are better off using ext4 instead of
zfs/btrfs.  I'd think that any RAM-related issue that would impact
zfs/btrfs would affect ext4 just as badly, and with ext4 you're also
vulnerable to all the non-RAM-related errors that checksumming was
created to solve.
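
To make the distinction concrete, here is a minimal sketch in Python. It
is a toy model, not how btrfs or zfs actually work: the helper names
(flip_bit, write_block, read_block_ok) are made up for illustration, and
zlib's CRC32 stands in for whatever checksum the filesystem really uses.
It assumes the checksum is computed from the buffer as it sits in RAM at
write time.

```python
import zlib
from typing import Tuple


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with a single bit inverted (a simulated fault)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_block(ram_buffer: bytes) -> Tuple[bytes, int]:
    """Toy 'checksumming filesystem' write: store the buffer and its CRC32.

    The checksum is computed from whatever is in RAM at this moment.
    """
    return ram_buffer, zlib.crc32(ram_buffer)


def read_block_ok(stored: bytes, checksum: int) -> bool:
    """Toy read-side verification: does the stored data match its checksum?"""
    return zlib.crc32(stored) == checksum


original = b"important user data"

# Case 1: the data is written correctly, then a bit flips on disk.
stored, csum = write_block(original)
stored_after_disk_fault = flip_bit(stored, 3)
disk_error_detected = not read_block_ok(stored_after_disk_fault, csum)

# Case 2: a bit flips in RAM *before* the checksum is computed.
# The filesystem faithfully checksums the already-corrupted buffer.
ram_after_fault = flip_bit(original, 3)
stored, csum = write_block(ram_after_fault)
ram_error_detected = not read_block_ok(stored, csum)

print("disk fault detected:", disk_error_detected)
print("RAM fault detected: ", ram_error_detected)
print("data on disk matches original:", stored == original)
```

In this model the on-disk fault is caught and the RAM fault is not. A
filesystem with no data checksums would catch neither, which is the
point: the RAM fault costs you the same either way, and only the
checksumming filesystem helps with the disk fault.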

If your RAM is bad then all kinds of stuff can go wrong.  Ditto for
your cache memory in the CPU, logic circuitry in the CPU, your busses,
etc.  Most systems are not fault-tolerant of these system components
and the cost to make them fault-tolerant tends to be fairly high.  On
the other hand, the good news is that you're far more likely to have
problems with data stored on a disk than in RAM, which is probably why
we haven't bothered to improve the other components.

Rich
