Lux, James P wrote:
Isn’t de-dupe just another flavor, conceptually, of a journaling file system, in the sense that in many systems only a small part of the file actually changes each time, so saving “diffs” allows one to reconstruct any arbitrary version with much smaller file space?

It's really more conceptually like RLE (run length encoding) or simple compression, where you start with a pattern and a dictionary, and point out where in the file that pattern repeats.

I guess the de-dupe is a bit more aggressive than that, in that it theoretically can look for common “stuff” between unrelated files, so

It only looks at raw blocks. If two blocks have the same hash signature (think MD5 or SHA ... hopefully something with fewer collisions), then they are treated as duplicates.
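
In rough terms the whole mechanism is something like the toy sketch below. To be clear, this is my own illustration, not any vendor's code: the 4k block size and the file names are placeholders, and a real system keeps its block index on disk, not in an in-memory Python dict.

import hashlib

BLOCK_SIZE = 4096  # common granularities are 512 bytes or 4k

def dedup_file(path, store):
    """Read a file block by block; store only blocks we haven't seen before."""
    recipe = []                              # ordered hashes needed to rebuild the file
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            if digest not in store:          # new data: keep one copy
                store[digest] = block
            recipe.append(digest)            # already seen: just reference it
    return recipe

store = {}
recipe_a = dedup_file("file_a.bin", store)   # placeholder file names
recipe_b = dedup_file("file_b.bin", store)
# Any blocks the two files share are stored exactly once in 'store'.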

maybe a better model is a “data compression” algorithm on the fly.

Yup, this is it, but "on the fly" is the hard part. Doing this comparison is computationally very expensive. The hash calculations are not cheap by any measure. You most decidedly do not wish to do this on the fly ...
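
If you want a feel for just the hash cost, a quick-and-dirty measurement like the one below will do. It is purely illustrative; the number you get depends entirely on your CPU, your hash choice, and the implementation.

import hashlib, os, time

data = os.urandom(256 * 1024 * 1024)          # 256 MB of random data
t0 = time.time()
hashlib.sha256(data).hexdigest()              # one pass of the hash, nothing else
elapsed = time.time() - t0
print("SHA-256 over 256 MB: %.2f s (%.0f MB/s)" % (elapsed, 256 / elapsed))

Compare that rate to what your disks or your wire can deliver and you see why doing it inline takes real horsepower.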

And for that, it’s all about trading between cost of storage space, retrieval time, and computational effort to run the algorithm.

Exactly.

(Reliability factors into it a bit. Compression removes redundancy, after all, but the de facto redundancy provided by having previous versions around isn’t a good “system” solution, even if it’s the one people use.)

:)

You get a direct CBA (cost/benefit analysis) comparison between buying the N+1th disk and the time/effort/money to perform this computation. In the end, the former wins.
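
To put rough numbers on it, the back-of-envelope comparison looks something like this. Every figure below is a made-up placeholder, not a quote from anyone's price list; plug in your own.

# Back-of-envelope CBA with hypothetical numbers -- plug in your own.
disk_cost_per_tb = 100.0    # $/TB of raw disk (assumption)
dedup_ratio      = 3.0      # assume dedup shrinks the data 3:1 (assumption)
accelerator_cost = 5000.0   # $ for the dedup box/accelerator (assumption)
data_tb          = 20.0     # size of the data set (assumption)

cost_more_disk = data_tb * disk_cost_per_tb
cost_dedup     = (data_tb / dedup_ratio) * disk_cost_per_tb + accelerator_cost
print("just buy disk: $%.0f    dedup: $%.0f" % (cost_more_disk, cost_dedup))
# With these particular numbers dedup doesn't break even until roughly 75 TB.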

I think one can make the argument that computation is always getting cheaper, at a faster rate than storage density or speed (because of the physics limits on the storage...), so the “span” over which you can do compression can be arbitrarily increased over time. TIFF and fax do compression over a few bits. Zip and its ilk do compression over kilobits or megabits (depending on whether they build a custom symbol table). Dedupe is doing compression over gigabits and terabits, presumably (although I assume that there’s a granularity at some point: a dedupe system looks at symbols that are, say, 512 bytes long, as opposed to ZIP looking at 8-bit symbols, or Group 4 fax looking at 1-bit symbols).

Most dedup implementations work over blocks, and I think most use 512-byte or 4k blocks.

The point is that even if computation is theoretically getting cheaper, calculating the hash (one without collisions, as collision-prone hashes are ... um ... not good for dedup) is still a significant bit of computation.

One well suited for an accelerator, which is why the dedup market seems to be "flooded" with accelerators (which I think are little more than FPGAs implementing some hash computation algorithm).

The hierarchical storage is really optimizing along a different axis than compression. It’s more like cache than compression: make the “average time to get to the next bit you need” smaller rather than “make a smaller number of bits.”

Basically yes ... though HSM is all about driving the cost of the large pool as low as possible. Tape is still used, and lots of people make arguments for tape. But as John pointed out, Spectra Logic is marketing a SATA-eating robot, so I think the days of tape are likely more numbered than before.

A brief anecdote. In 1989, a fellow graduate student was leaving for another school and taking his data with him. He spooled up a tape on a VAX 8650 unit. I asked him why this over other media. His response was: you can read a VAX tape anywhere.

In 2009, twenty years later, I think he might have a different take on this. I put all my bits onto floppies when I left there, and moved the important ones to spinning rust. I can still read the floppies. I doubt he can still read the tapes.

The point is that the tape folks talk about longevity. But that claim makes a number of important assumptions about the media, the drives, and the availability of replacement drives, which, as my advisor in graduate school discovered after her drive died, do not necessarily hold.

Granted, for a lot of systems, “time to get a bit” is proportional to “number of bits.”

Yup.  But that initial latency can be huge.

While the cost of computation is decreasing rapidly, I'll argue that the cost of storage is decreasing as fast, if not faster. This has implications for which mode is preferable ... n-plication onto decreasing-cost media, or computation to minimize the footprint on that already cheap media. The CBA doesn't favor dedup in the long term, though it does favor HSM ... even cloud storage.

The issues there are bandwidth, bandwidth, and, you guessed it, bandwidth.
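
One line of arithmetic makes the point; the data set size and link speed below are arbitrary assumptions, so substitute your own.

data_tb   = 10.0                        # data set size (assumption)
link_mbps = 100.0                       # WAN link to the cloud (assumption)
megabits  = data_tb * 8e6               # 1 TB = 8e12 bits = 8e6 megabits
days      = megabits / link_mbps / 86400
print("%.1f days to push %g TB over a %g Mb/s link" % (days, data_tb, link_mbps))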


On 6/5/09 8:00 AM, "Joe Landman" <land...@scalableinformatics.com> wrote:

    John Hearns wrote:
    >  2009/6/5 Mark Hahn <h...@mcmaster.ca>:
    > > I'm not sure - is there some clear indication that one level of
    > > storage is not good enough?

    I hope I pointed this out before, but Dedup is all about reducing the
    need for the less expensive 'tier'.  Tiered storage has some merits,
    especially in the 'infinite size' storage realm.  Take some things
    offline, leave things you need online until they go dormant.  Define
    dormant on your own terms.



--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics,
email: land...@scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
