On Tue, 8 Jun 2010, David Mathog wrote:

> This is off topic so I will try to keep it short: is there an
> "archival" format for large binary files which contains enough error
> correction so that all original data may be recovered even if there is a
> little data loss in the storage media?
>
> For my purposes these are disk images, sometimes .tar.gz, other times
> gunzip -c of dd dumps of whole partitions which have been "cleared" by
> filling the empty space with one big file full of zeros, and then that
> file deleted. I'm thinking of putting this information on DVDs (only
> need to keep it for a few years at a time) but I don't trust that media
> not to lose a sector here or there - having watched far too many
> scratched DVD movies with playback problems.
>
> Unlike an SDLT with a bad section, the good parts of a DVD are still
> readable when there is a bad block (using dd or ddrescue) but of course
> even a single missing chunk makes it impossible to decompress a .gz file
> correctly. So what I'm looking for is some sort of .img.gz.ecc format,
> where the .ecc puts in enough redundant information to recover the
> underlying img.gz even when sectors or data are missing. If no such
> tool/format exists then two copies should be enough to recover all of an
> .img.gz so long as the same data wasn't lost on both media, and if bad
> DVD sectors always come back as "failed read", never ever showing up as
> a good read but actually containing bad data. Perhaps the frame
> checksum on a DVD is enough to guarantee that?

I use tar, gzip/bzip2, and split to create a number of files of more or
less similar lengths (50 or 100 MB, but usually 50). After that, I make
par2 recovery files with the par2cmdline tools (they use Reed-Solomon
error correction):

http://en.wikipedia.org/wiki/Parchive
http://parchive.sourceforge.net/

I am unable to find par2cmdline via Google ATM, but it should be out
there somewhere.

And last but not least, I burn it all (data + pars).

HTH.

Regards,
Tomasz Rola

--
** A C programmer asked whether computer had Buddha's nature.      **
** As the answer, master did "rm -rif" on the programmer's home    **
** directory. And then the C programmer became enlightened...      **
**                                                                 **
** Tomasz Rola          mailto:tomasz_r...@bigfoot.com             **
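
For anyone who wants to try the tar / gzip / split / par2 workflow described above, here is a rough sketch rather than the exact commands used in the post. The directory names, the sample file, the tiny 100 KB piece size (the post uses about 50 MB) and the 10% redundancy level are all assumptions made for illustration. The par2 step runs only if a `par2` binary from par2cmdline is on the PATH.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area; nothing here comes from the original post.
work=$(mktemp -d)
mkdir "$work/data" "$work/burn"

# Sample payload standing in for a disk image or directory tree.
dd if=/dev/urandom of="$work/data/image.bin" bs=1000 count=300 2>/dev/null

# 1. Archive and compress.
tar -C "$work" -cf - data | gzip -c > "$work/backup.tar.gz"

# 2. Split into fixed-size pieces (100 KB here, purely to keep the demo small).
split -b 100000 "$work/backup.tar.gz" "$work/burn/backup.tar.gz.part-"

# 3. Create recovery files next to the pieces, if par2cmdline is installed.
#    -r10 asks for roughly 10% redundancy (an assumed value, tune to taste).
if command -v par2 >/dev/null 2>&1; then
    (cd "$work/burn" && par2 create -r10 backup.par2 backup.tar.gz.part-* >/dev/null) \
        || echo "par2 create failed" >&2
else
    echo "par2 not found, skipping recovery files" >&2
fi

# 4. Restore check: concatenating the pieces in name order must
#    reproduce the original archive byte for byte.
cat "$work/burn"/backup.tar.gz.part-* > "$work/restored.tar.gz"
cmp "$work/backup.tar.gz" "$work/restored.tar.gz" && echo "restore OK"

# The scratch directory is left in place for inspection; remove it with rm -r "$work".
```

The contents of the `burn` directory are what would go onto the disc. After reading a damaged disc back, running `par2 verify backup.par2` and then `par2 repair backup.par2` in that directory should rebuild missing or corrupt pieces, provided the damage does not exceed the redundancy that was generated.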