[issue10040] GZipFile failure on large files

2010-10-07 Thread Robert Rohde

New submission from Robert Rohde:

I attempted to use GZipFile to process a 1.93 GB file that expands to 18.8 GB.

This consistently produces the same corrupted output file, whose size is 
approximately, but not exactly, the size of the expected output.

I bypassed GZipFile by calling the 7-Zip executable to open the compressed 
file.  This works correctly and consistently.

I haven't tried to figure out how GZipFile works, but I assume that this 
failure is probably related to the very large size of the files I am working 
with.  I've used GZipFile before on much smaller files with no apparent 
problems.  I have no idea what precisely goes wrong, or how to fix it, but I 
felt it was important to note that GZipFile isn't working for at least some 
very large files.
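
For anyone trying to reproduce this, a minimal sketch of chunked decompression 
that also tracks the output size and CRC-32 may help pin down where the output 
diverges.  The helper name decompress_in_chunks and the chunk size are 
illustrative assumptions, not code from the report; the sketch is written for 
Python 3 (the report concerns 2.7), and the class is spelled gzip.GzipFile in 
the standard library.

```python
import gzip
import io
import zlib


def decompress_in_chunks(src, dst, chunk_size=1024 * 1024):
    """Copy decompressed data from a gzip stream `src` to `dst`.

    Returns (total_bytes_written, crc32_of_output) so the result can be
    compared against another tool's output. Hypothetical helper.
    """
    total = 0
    crc = 0
    with gzip.GzipFile(fileobj=src, mode="rb") as gz:
        while True:
            chunk = gz.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            total += len(chunk)
            # Running CRC-32 over the decompressed bytes.
            crc = zlib.crc32(chunk, crc)
    return total, crc & 0xFFFFFFFF


# Small in-memory demonstration; a real test would open files in binary mode.
payload = b"example data " * 1000
compressed = gzip.compress(payload)
out = io.BytesIO()
size, crc = decompress_in_chunks(io.BytesIO(compressed), out)
```

Comparing the size and CRC reported here with those of a file extracted by 
another tool would show whether the two outputs actually differ in content or 
only in length.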

--
components: Library (Lib)
messages: 118091
nosy: Robert.Rohde
priority: normal
severity: normal
status: open
title: GZipFile failure on large files
type: behavior
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue10040>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue10040] GZipFile failure on large files

2010-10-08 Thread Robert Rohde

Robert Rohde added the comment:

It's Windows 7 Ultimate (64-bit) on a very high-end system.

I don't think it would be very practical to distribute a 2 GB test file, 
though I might be able to get it to a couple of people if someone really 
wanted to study the issue.

However, if it is an integer overflow (or something like that), then I would 
suspect that GZipFile would show corruption most of the time once the files got 
large enough.  For example, it might occur for all files expanding to more 
than 2^32 bytes (4 GB).  (That's just speculation; I haven't tested it except 
to note that it failed the very first time I tried to use a file this large.)
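
One place where the 2^32 boundary appears in the format itself is the gzip 
trailer, which per RFC 1952 stores the CRC-32 and the uncompressed size modulo 
2^32 in its last 8 bytes.  The sketch below reads that trailer from a small 
in-memory stream and shows how a size near 18.8 GB would wrap in a 32-bit 
field.  The helper name gzip_trailer and the exact byte count are illustrative 
assumptions, and nothing here establishes that this field is the cause of the 
corruption.

```python
import gzip
import struct
import zlib


def gzip_trailer(compressed):
    """Return (crc32, isize) from the last 8 bytes of a single-member gzip stream.

    Both fields are little-endian unsigned 32-bit integers (RFC 1952).
    Hypothetical helper for illustration.
    """
    crc32, isize = struct.unpack("<II", compressed[-8:])
    return crc32, isize


payload = b"x" * 100000
compressed = gzip.compress(payload)
stored_crc, stored_isize = gzip_trailer(compressed)

# Illustrative only: treat "18.8 GB" as exactly 18.8 * 10**9 bytes (assumption).
actual_size = 188 * 10**8
wrapped_size = actual_size % 2**32  # what a 32-bit size field would hold
```

Any code that compares a running byte count against the stored size without 
masking to 32 bits would be a candidate for the kind of overflow suspected 
above.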

Perhaps someone familiar with the code could look for places where integers 
might overflow?

--
