Ashley Pittman <ash...@pittman.co.uk> writes:

> If you relied on the md5 sum alone there would be collisions and those
> collisions would result in you losing data.

The question is whether the probability of collisions is high compared with other causes of data loss -- presumably hardware, assuming no-one puts figures on the software reliability. As far as I remember, the calculation for SHA-1 for Plan 9's Venti¹, which no-one seems to have mentioned, says collisions can be ignored for petabyte filesystems.

Ob-Beowulf: You can run Venti on GNU/Linux,² but I don't know how the current implementation performs. Also, GlusterFS has a `data de-duplication translator' on its roadmap, which I didn't see mentioned.

--
1. http://plan9.bell-labs.com/sys/doc/venti/venti.html
2. http://swtch.com/plan9port/

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
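
For anyone who wants to redo the sort of collision calculation referred to above, here is a rough sketch using the standard birthday (union) bound: with n blocks and a b-bit hash, the chance of any accidental collision is at most n(n-1)/2 divided by 2^b. The 1 PB store size and 8 KiB block size are assumptions chosen for illustration, not figures from this thread, and the function name is made up. The bound treats hash values as uniformly random, so it says nothing about deliberately constructed collisions.

```python
from fractions import Fraction


def collision_bound(num_blocks: int, hash_bits: int) -> Fraction:
    """Upper bound on the probability of at least one accidental collision.

    Union bound over all pairs of blocks: n(n-1)/2 pairs, each colliding
    with probability 2**-hash_bits if hashes behave as uniform random values.
    """
    return Fraction(num_blocks * (num_blocks - 1), 2 * 2**hash_bits)


# Assumed parameters (illustrative only): 1 PB of data in 8 KiB blocks.
PETABYTE = 10**15
BLOCK_SIZE = 8 * 1024
num_blocks = PETABYTE // BLOCK_SIZE

for name, bits in (("MD5", 128), ("SHA-1", 160)):
    p = collision_bound(num_blocks, bits)
    print(f"{name} ({bits} bits): p <= {float(p):.3e}")
```

Under these assumptions the bound comes out at roughly 1e-17 for a 128-bit hash and 1e-27 for a 160-bit one. Those are the numbers to set against whatever error rate you believe for the hardware.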