On Fri, Oct 12, 2012 at 3:07 PM, Michael Stapleton <[email protected]> wrote:

> It is easy to understand that zfs scrubs can be useful, but how often do
> we scrub, or do the equivalent, on any other file system? UFS? VXFS?
> NTFS? ...
If your data has checksums, it is standard practice to periodically verify
them and correct if necessary. ECC memory does a "scrub" every once in a
while :-). The filesystems you named don't have checksums, so scrubbing
them would do no good.

> For example, data deduplication uses digests on data to detect
> duplication. Most dedup systems assume that if the digest is the same
> for two pieces of data, then the data must be the same.
> This assumption is not actually true. Two differing pieces of data can
> have the same digest, but the chance of this happening is so low that
> the risk is accepted.

"So low" is an understatement. Have you ever taken 2 to the power of 256?
(ZFS currently requires sha256 checksums if you want to do dedup.) The
chance of two blocks being different but having the same sha256 digest is
1 in 2^256, i.e. 1 in
115792089237316195423570985008687907853269984665640564039457584007913129639936.

Just for fun, let's see what those odds give you. Say you were writing all
human information ever produced (2.56e+20 bytes) [1] to one ZFS filesystem
(with a 1-byte blocksize), and say you wrote that much data every second
for the age of the known universe (4.3e+17 s). Your odds of a single false
positive with this amount of data are still only about 1 in 1e+39.

[1] http://www.wired.co.uk/news/archive/2011-02/14/256-exabytes-of-human-information

> I'm only writing this because I get the feeling some people think scrubs
> are a need. Maybe people associate doing scrubs with something like
> doing NTFS defrags?

All scrubbing does is put stress on the drives and verify that the data can
still be read from them. If a hard drive ever fails on you and you need to
replace it (how often does that happen?), then you know: "hey, just last
week all the other drives were able to read their data under stress, so
they are less likely to fail on me".

Jan

_______________________________________________
OpenIndiana-discuss mailing list
[email protected]
http://openindiana.org/mailman/listinfo/openindiana-discuss
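[Editor's note: the collision arithmetic in the post above can be checked with a few lines of Python. This is an illustrative sketch only; the block count and the simple blocks/2^256 probability model are taken directly from the figures quoted in the post, not from any ZFS internals.]

```python
from math import log10

# Number of distinct sha256 digests (a 78-digit number).
digests = 2 ** 256

# All human information ever produced (~2.56e+20 bytes, per the Wired
# article cited in the post), written as 1-byte blocks, once per second
# for the age of the known universe (~4.3e+17 s).
blocks = int(2.56e20) * int(4.3e17)

# Chance of at least one digest collision, using the post's simple
# blocks / 2^256 estimate.
p = blocks / digests

print(f"blocks written: ~1e{log10(blocks):.0f}")       # ~1e38
print(f"odds of a collision: ~1 in 1e{-log10(p):.0f}") # ~1 in 1e39
```

Running this reproduces the post's figure of roughly 1 in 1e+39.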
