On Wed, 06 Dec 2006 15:40:33 +0100, hendrik wrote:
>
> If you want to be able to recover data despite damage, it is in general
> not wise to compress it, since different parts will be damaged
> independently, and the undamaged parts will still be readable.
> Squeezing out redundancy makes different parts of the data dependent on
> one another for interpretation.
No, you _should_ compress it, and then use some of the space you saved to add carefully chosen redundancy that lets you reconstruct everything, not just some things, in case of failure (e.g., using par2).

Scenario A: Compression

Suppose you have 100 megabytes of files, uncompressed. You create a tar archive and compress it down to 75M. A failure occurs, and 2M of data is lost. The archive becomes impossible to decompress, and you lose everything. You are very sad.

Scenario B: No compression

Suppose you have 100 megabytes of files, uncompressed. A failure occurs, and 2M of data is lost. All files intersecting the broken region are destroyed (modulo any Herculean effort one is willing to put into reconstruction). You are sad, but not as sad as in Scenario A.

Scenario C: Compression plus redundancy

Suppose you have 100 megabytes of files, uncompressed. You create a tar archive and compress it down to 75M. You then create 10M of redundancy using (e.g.) par2, for a total of 85M. A failure occurs, and 2M of data is lost. You use par2 to reconstruct the archive, and nothing is lost. (You can do this regardless of whether data, redundancy, or both are destroyed.) You are happy.

HTH,
Reid
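The scenarios can be sketched in miniature with Python's zlib. A single XOR parity block stands in here as the simplest possible erasure code, able to recover exactly one lost block; par2 itself uses Reed-Solomon coding over many blocks and can recover as many blocks as you created in redundancy. The block size and sample data are illustrative assumptions:

```python
import zlib

original = b"important data " * 1000          # stand-in for the "100M of files"
compressed = zlib.compress(original)          # the tar.gz of Scenario A/C

# Scenario A: corrupt one byte in the middle of the compressed stream.
# The damage makes the whole archive undecodable, not just one region.
corrupted = bytearray(compressed)
corrupted[len(corrupted) // 2] ^= 0xFF
try:
    zlib.decompress(bytes(corrupted))
    scenario_a_ok = True
except zlib.error:
    scenario_a_ok = False                     # decompression fails outright
print("Scenario A recovered:", scenario_a_ok)

# Scenario C: split the compressed stream into fixed-size blocks and keep
# one extra XOR parity block (a toy analogue of par2's redundancy files).
BLOCK = 16
blocks = [compressed[i:i + BLOCK].ljust(BLOCK, b"\0")
          for i in range(0, len(compressed), BLOCK)]
parity = bytes(BLOCK)
for b in blocks:
    parity = bytes(x ^ y for x, y in zip(parity, b))

lost = len(blocks) // 2
damaged = list(blocks)
damaged[lost] = None                          # simulate a destroyed block

# XOR of the parity with every surviving block yields the lost block.
rebuilt = parity
for b in damaged:
    if b is not None:
        rebuilt = bytes(x ^ y for x, y in zip(rebuilt, b))
damaged[lost] = rebuilt

restored = b"".join(damaged)[:len(compressed)]
print("Scenario C recovered:", zlib.decompress(restored) == original)
```

With real files you would do the same thing at the command line: create the par2 files next to the archive after compressing, and run par2's repair step after any damage.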