Hello Dennis,
Dennis Katsonis wrote:
> I was wondering whether it would be difficult or not, to add
> functionality to plzip, or create a variant of it, which had tarball
> indexing capabilities like pixz.
I am in the process of implementing something like that, and more, but
in tarlz, not in plzip: http://www.nongnu.org/lzip/tarlz.html
> Pixz allows a more random access to the compressed tarball. Listing is
> very quick, and even extracting a file at the end of a large tarball is
> quite fast, not too much slower than extracting it from an uncompressed,
> indexed tarball. A major advantage when extracting select files from an
> archived compressed tarball.
Tarlz is not complete yet, but it can already list pretty quickly if the
archive is created with the right options[1]. Parallel extraction should
be similarly quick once it is implemented.
[1] http://www.nongnu.org/lzip/manual/tarlz_manual.html#Multi_002dthreaded-tar
If the files in the archive are large, multi-threaded '--list' on a
regular (seekable) tar.lz archive can be hundreds of times faster than
sequential '--list' because, in addition to using several processors, it
only needs to decompress part of each lzip member. See the following
example listing the Silesia corpus on a dual core machine:
  tarlz -9 --no-solid -cf silesia.tar.lz silesia
  time lzip -cd silesia.tar.lz | tar -tf -     (5.032s)
  time plzip -cd silesia.tar.lz | tar -tf -    (3.256s)
  time tarlz -tf silesia.tar.lz                (0.020s)
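To illustrate why a seekable multimember archive can be listed without
decompressing everything, here is a minimal Python sketch that locates
the lzip members of a file by walking their trailers backwards from the
end. It is not tarlz's code. It assumes the lzip member layout (6-byte
header starting with "LZIP", and a 20-byte little-endian trailer holding
CRC32, data size and member size), and it assumes there is no trailing
data after the last member. The name index_members is hypothetical.

```python
import struct

LZIP_MAGIC = b"LZIP"
HEADER_SIZE = 6    # magic (4) + version (1) + coded dictionary size (1)
TRAILER_SIZE = 20  # CRC32 (4) + data size (8) + member size (8)


def index_members(f):
    """Return [(offset, member_size, data_size), ...] in file order.

    f must be a seekable binary file object. Only headers and trailers
    are read; no compressed data is decoded.
    """
    f.seek(0, 2)               # jump to end of file
    pos = f.tell()
    members = []
    while pos > 0:
        if pos < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("truncated member")
        # The trailer is the last 20 bytes of the member ending at pos.
        f.seek(pos - TRAILER_SIZE)
        _crc, data_size, member_size = struct.unpack(
            "<IQQ", f.read(TRAILER_SIZE))
        start = pos - member_size
        if member_size < HEADER_SIZE + TRAILER_SIZE or start < 0:
            raise ValueError("bad member size")
        # Check that the computed start really is a member header.
        f.seek(start)
        if f.read(4) != LZIP_MAGIC:
            raise ValueError("bad magic")
        members.append((start, member_size, data_size))
        pos = start            # continue with the previous member
    members.reverse()
    return members
```

With such an index, a lister could hand each member offset to a separate
worker, which then needs to decode only enough of the member to read the
tar header at its beginning. For example:

  with open("silesia.tar.lz", "rb") as f:
      for offset, msize, dsize in index_members(f):
          print(offset, msize, dsize)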
I expect that tarlz, or something based on the same principles, will
obsolete conventionally compressed tar archives.
Best regards,
Antonio.