Mikołaj Izdebski wrote:
> Do you consider alternative bzip2 implementations available in Debian
> (lbzip2, pbzip2, p7zip-full, libcommons-compress-java) as "commonly
> available implementations"? They all produce different compressed
> files for the same input file. Moreover, lbzip2-0.23 from stable
> produces different files than lbzip2-2.1 from unstable.

As far as I know, I have not yet seen files produced by any of these,
but it's nice to have a list to try when someone turns up with a weird
file; I did not know about some of those!

Even bzip2 itself changed its output after 0.9.5d. I have a program that
uses the compressor from the old version, since some files needed it.

> I believe that pristine-tar generates "binary diffs" for gzip files it
> fails to reproduce, but doesn't do the same for bzip2 files. Maybe
> implementing such feature for bzip2 files is the solution?

I'll add it if I see a bz2 file that can be reproduced almost exactly
and only needs a delta to get the rest of the way. I haven't yet.

> My point was that block size isn't the only factor the resulting file
> depends on. There is also a "work factor", as described in bzip2
> documentation. Even the same version of bzip2, with the same block
> size given, for the same input can produce different outputs, given
> that work factors are different. A proof of concept is available in
> lbzip2 git repo:

I have yet to see a bz2 file in the wild that uses a nonstandard
block size, so pristine-bz2 doesn't bother to try nonstandard block
sizes by default yet. Since bzip2 --exponential is not documented, I will
worry about it when I find a file using it in the wild.
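
To illustrate what trying only the standard block sizes looks like, here
is a minimal sketch in Python. It is not pristine-bz2's actual code. The
function name find_standard_level is made up for this example, and it
assumes the file was written by a libbz2-based compressor with the
default work factor, since Python's bz2 module does not expose that
setting.

```python
import bz2


def find_standard_level(original: bytes):
    """Return the level (1-9) that reproduces `original` byte for byte, or None.

    Hypothetical helper for illustration only; it tries just the nine
    standard block sizes with libbz2's default work factor.
    """
    # A bzip2 stream starts with "BZh" followed by the block-size digit 1-9.
    if original[:3] != b"BZh" or not (b"1" <= original[3:4] <= b"9"):
        raise ValueError("not a bzip2 stream with a standard block size")

    data = bz2.decompress(original)

    # Try the level recorded in the header first, then the remaining ones.
    header_level = original[3] - ord("0")
    candidates = [header_level] + [n for n in range(1, 10) if n != header_level]

    for level in candidates:
        if bz2.compress(data, compresslevel=level) == original:
            return level
    return None  # produced by some other implementation or setting
```

A None result is the case discussed above: the file came from another
implementation, another bzip2 version, or a nonstandard setting, and
this simple search cannot reproduce it.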

-- 
see shy jo
