On Wed, 2009-01-14 at 21:28 -0500, Alec Berryman wrote:
> That's the error when it's decompressing a chunk.  How far in the file
> did it get?

One chunk, I think.  For the original example which had been compressed
with --best the chunk size appears to be 900M.  I have tried some
further testing letting rzip default the compression level and this
seems to result in a chunk size of 600M.

> If you can keep it around in the short term, that would be helpful.  Be
> careful copying it, since cp won't preserve the sparseness.

I think I promised more than I could deliver here.  I realised after
submitting the report that the only copy I have of that particular file
is the compressed one, so it may not be recoverable anyway.

On the other hand the file was a snapshot of a disk at a particular
moment in time and I have other snapshots of the same disk.  One of
these other snapshots demonstrates the same problem.

I used dd to create a shorter version of that other snapshot to see
whether the problem still occurs, and I have managed to reproduce it
with a file size of 1GB rather than the original 8GB.
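
For anyone wanting to try the same thing, here is a minimal sketch of
taking the leading part of an image with dd.  The helper name
take_prefix and the source file name in the example are made up for
illustration, and the 1 MiB block size is just a convenient choice, not
necessarily what I typed at the time:

```shell
# take_prefix SRC DST MIB: copy the first MIB mebibytes of SRC to DST.
# (Hypothetical helper; dd writes every block it reads, zeros included.)
take_prefix() {
    dd if="$1" of="$2" bs=1048576 count="$3" 2>/dev/null
}

# Example with an assumed source name:
#   take_prefix disk-snapshot.img test1G.img 1024
```

Comparing "du -k" with the size reported by "ls -l" on the result is one
way to check how much of the file is actually allocated on disk.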

> What happens if you copy the file to de-sparsify it and repeat the
> experiment?  If that still fails, is there a problem with the
> decompression if you pass it through tar first?

I think the sparseness is a red herring.  Creating the 1GB subset of the
test file with dd had the effect of making the subset file non-sparse,
and it still demonstrates the problem.

> Does it compress/decompress fine with another compression program - say,
> gzip?

Yes, fine.

> I glanced through the code and didn't see anything obviously broken with
> respect to sparse file handling.  I won't have a chance before the
> weekend to look into it, though; even then, without an example bad file,
> I'm not sure I can track it down.  I do understand that you don't want
> to send the disk image, and I'm not sure I want to figure out a way to
> receive it, either!

I have been doing a little testing and have discovered something
interesting.  If I run rzip more than once on the same input file the
compressed file is different from one run to the next.  In some cases it
is not even the same size, whereas in others the size is consistent but
md5sum reveals that the files are not the same.  Sometimes the
resulting compressed file can be decompressed, other times not.

Having created a test file called test1G.img, which is the initial 1GB
of the disk image, I tried running the following:

for i in 0 1 2 3 4 5 6 7 8 9
do
    rm -f test1G.rz unzip1G.img
    rzip -k -o test1G.rz test1G.img > run$i-zip.log 2>&1
    (ls -l test1G.rz; md5sum test1G.rz) > run$i.sums
    rzip -d -o unzip1G.img test1G.rz > run$i-unzip.log 2>&1
done

The resulting sizes/checksums are:

-r--r--r-- 1 user user 633543543 2009-01-14 23:58 test1G.rz
3ae61f55e706f35fabb787da42eb8c18  test1G.rz
-r--r--r-- 1 user user 633543731 2009-01-14 23:58 test1G.rz
b12db7103693679a905bb41949564fcc  test1G.rz
-r--r--r-- 1 user user 633537477 2009-01-14 23:58 test1G.rz
96557d5a1d544c92cd61f7004231c180  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
c9e32c849b590e2d101d242d71869e01  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
c9e32c849b590e2d101d242d71869e01  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
d919c0fe6b50ee621d74d0f8603264ef  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
8540d596ff877ca33db40233a45f1cc5  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
1362ad5f2d0f4d58c87b71106efe4d19  test1G.rz
-r--r--r-- 1 user user 633537493 2009-01-14 23:58 test1G.rz
3f147e32b41420c687f969f062330406  test1G.rz
-r--r--r-- 1 user user 633541208 2009-01-14 23:58 test1G.rz
3abd6061d11b967ba3442fce55035dfa  test1G.rz

In each case no errors were logged by the compression, but the
decompression was sometimes successful and sometimes not, as shown by
the sizes of the decompression logs:

-rw-rw-r-- 1 user user   0 2009-01-15 00:53 run0-unzip.log
-rw-rw-r-- 1 user user 107 2009-01-15 01:11 run1-unzip.log
-rw-rw-r-- 1 user user 107 2009-01-15 01:25 run2-unzip.log
-rw-rw-r-- 1 user user   0 2009-01-15 01:38 run3-unzip.log
-rw-rw-r-- 1 user user   0 2009-01-15 01:54 run4-unzip.log
-rw-rw-r-- 1 user user 107 2009-01-15 02:11 run5-unzip.log
-rw-rw-r-- 1 user user 146 2009-01-15 02:25 run6-unzip.log
-rw-rw-r-- 1 user user 107 2009-01-15 02:39 run7-unzip.log
-rw-rw-r-- 1 user user 146 2009-01-15 02:53 run8-unzip.log
-rw-rw-r-- 1 user user 107 2009-01-15 03:07 run9-unzip.log

So, according to that, runs 0, 3 and 4 were successful (their logs are
empty); the rest were not.
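
An empty log only shows that rzip did not complain, so a stricter check
would also compare the decompressed output with the original.  A rough
sketch of such a check follows; the function name summarise_run is
invented here, and it assumes the run$i-unzip.log naming from the loop
above:

```shell
# summarise_run I ORIG OUT: report on run number I, given the original
# image ORIG and the decompressed image OUT.  (Hypothetical helper.)
summarise_run() {
    i=$1 orig=$2 out=$3
    log="run$i-unzip.log"
    if [ -s "$log" ]; then
        # A non-empty log means the decompression printed something.
        echo "run $i: decompression logged an error"
    elif cmp -s "$orig" "$out"; then
        echo "run $i: ok, output identical to input"
    else
        echo "run $i: no error logged but output differs"
    fi
}
```

Since the loop removes unzip1G.img at the start of each iteration, this
would have to be called as the last line of the loop body, for example
"summarise_run $i test1G.img unzip1G.img".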

To me this could mean either faulty memory or a kernel bug, or that
there is some deliberate randomness in the rzip algorithm.  I have seen
no other evidence of a memory fault, and there does seem to be a
pseudo-random number generator used in rzip.

Regards,
Steve.



