Tim Kientzle <t...@kientzle.com> wrote:

> > If GNU tar archives sparse files, it creates archives that violate the
> > POSIX structuring conventions for TAR archives.
>
> The newer GNU tar --posix support addresses this, though
> it's not (yet?) the default format for GNU tar.  I think the
> current "1.0" variant is pretty well thought out (though I do have
> a couple of small quibbles. ;-)
>
> Libarchive now supports the GNU tar --posix "1.0" variant when
> writing sparse files.

I am not sure what you mean by POSIX version "1.0".

The first GNU tar implementation that moved the hole description data
into the POSIX extended headers did not have the problem that huge amounts
of xheader data need to be allocated for parsing, but it conflicted with
the POSIX rules for xheaders because it _repeated_ line pairs like:

	16 GNU.xxx.hole=123456
	17 GNU.xxx.data=1234567

whereas the POSIX standard says that in case of repeated entries, the last
one is valid.

I assume that the current variant therefore cannot be called "1.0". It is
different and, IIRC, it contains a very long line of hole/data pairs.
This is neither easy to read (star would need to malloc space for the
maximum size of the xheader, as the data is not block oriented), nor does
it allow larger sparse files to be archived.

Note that the maximum size of an xheader is 8 GB, and that a 32-bit tar
program could not handle even that maximum, as a 32-bit process cannot
grow to even 4 GB. A file with maximum sparseness can thus currently grow
only to approx. 3 TB before it can no longer be archived by GNU tar, even
with a 64-bit binary.

Comparing the currently available methods for handling sparse data, the
method currently used by star still seems to be the best:

-	The data is block oriented and thus can be read on the fly without
	a need to malloc() space for the whole ASCII data to be parsed.

-	The base 256 format I introduced in the mid 1990s is more compact
	than archiving the numbers as decimal strings.

-	The base 256 format still allows 95 bits for the file size, which
	is sufficient for any local storage in a single universe, as this
	would take approx. 1 MegaMol of active storage (net) mass if one
	bit takes one atom.

-	It is located in the file data space and is thus unlimited in size.

I am not sure whether the current GNU tar sparse format will last for
long, which is why I am not sure whether I should implement support for it.

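To make two of the points above concrete, here is a rough Python sketch. It is not taken from the star or GNU tar sources. The helper names are made up for illustration, the keywords are the placeholders from the example above, and the base 256 layout (only the top bit of the first byte used as a marker, leaving 95 bits in a 12-byte field) is an assumption derived from the 95-bit figure; real implementations may treat the first byte differently.

```python
# Illustrative sketch only; names and field layout are assumptions.

FIELD_BYTES = 12  # width of the size field in a ustar header


def pax_record(keyword: str, value: str) -> bytes:
    """Build one xheader record of the form "<len> <keyword>=<value>\\n".

    <len> counts the whole record including its own digits, so it is
    computed by iterating until the value stops changing.
    """
    body = f" {keyword}={value}\n".encode("utf-8")
    length = len(body)
    while len(str(length)) + len(body) != length:
        length = len(str(length)) + len(body)
    return str(length).encode("ascii") + body


def parse_pax_records(data: bytes) -> dict:
    """Parse xheader records; a repeated keyword replaces the earlier value."""
    result = {}
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space])
        record = data[space + 1:pos + length - 1]  # strip trailing newline
        keyword, _, value = record.partition(b"=")
        result[keyword.decode("utf-8")] = value.decode("utf-8")
        pos += length
    return result


def encode_base256(value: int, width: int = FIELD_BYTES) -> bytes:
    """Store value as a big-endian binary number, top bit set as marker."""
    bits = width * 8 - 1  # 95 usable bits for a 12-byte field
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit into the field")
    return (value | (1 << bits)).to_bytes(width, "big")


def decode_base256(field: bytes) -> int:
    """Inverse of encode_base256; rejects fields without the marker bit."""
    if not field[0] & 0x80:
        raise ValueError("not a base 256 field")
    mask = (1 << (len(field) * 8 - 1)) - 1
    return int.from_bytes(field, "big") & mask


if __name__ == "__main__":
    # Two hole/data pairs written with repeated keywords.
    xheader = b"".join([
        pax_record("GNU.xxx.hole", "0"),
        pax_record("GNU.xxx.data", "4096"),
        pax_record("GNU.xxx.hole", "123456"),
        pax_record("GNU.xxx.data", "1234567"),
    ])
    print(parse_pax_records(xheader))
    print(encode_base256(2**95 - 1).hex())
```

A reader that applies the "last one is valid" rule literally keeps only the final hole/data pair, which is why repeating keywords does not work as a list. The largest 95-bit value fits into the 12 bytes of the field, while the same number written as a decimal string needs 29 digits.
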
> > In future, the tar file format could be updated to allow sparse files to
> > be archived in a single pass, but it would require ...
>
> I've considered approaches like this for libarchive, but
> I haven't found the time to experiment with them.
>
> Specifically, this could be done without seeking
> (and without completely ignoring the standards)
> by recording a complete tar entry for each "packet" of a file

If the file is archived in chunks, it could be done without seeks.

BTW: star has implemented a FIFO for more than 20 years and, because of
this FIFO, cannot easily be upgraded to support seeking.

Jörg

-- 
 EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       j...@cs.tu-berlin.de                (uni)
       joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily