> MD5 alone can be somewhat dangerous even in benevolent environments: if the
> data sets are large enough or you are just unlucky, you are going to hit a
> collision and corrupt-or-lose-data-on-dedup sooner or later.

it doesn't seem worried about this. Admittedly, they use SHA-1 rather than MD5, so they have 160 bits instead of 128 bits, with a correspondingly lower probability of collisions, but I'd be interested to know about cases where MD5 led to accidental collisions.

        Stefan
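
P.S. For a rough sense of scale, here is a small sketch of the usual birthday-bound approximation. It assumes the hash behaves like a uniformly random function and that nobody is constructing collisions on purpose, so it says nothing about deliberate attacks. The function name and the example object count are my own choices for illustration, not taken from any particular tool.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate chance of at least one accidental collision.

    Birthday-bound approximation: p ~= 1 - exp(-n*(n-1) / 2^(bits+1)).
    Assumes hash outputs are uniformly distributed and independent.
    """
    exponent = num_items * (num_items - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    n = 10 ** 12  # hypothetical number of distinct blocks being deduplicated
    for name, bits in (("MD5", 128), ("SHA-1", 160)):
        print(f"{name:6s} ({bits} bits): p ~= {collision_probability(n, bits):.3e}")
```

Under these assumptions the extra 32 bits of SHA-1 reduce the accidental-collision probability by a factor of about 2^32 for the same number of items, and a 128-bit hash only approaches even odds of a collision at around 2^64 items.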