[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-13 Thread Antoine Pitrou
Changes by Antoine Pitrou <[EMAIL PROTECTED]>: resolution: -> rejected; status: open -> closed

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-11 Thread Andrew Dalke
Andrew Dalke <[EMAIL PROTECTED]> added the comment: I'm still undecided on whether this is a bug or not. The problem occurs even when I'm not reading "data from a file of an unknown size." My example causes a MemoryError on my machine even though the file I'm reading contains 0 bytes. The proble

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-11 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: On Thursday 11 September 2008 at 16:01 +0200, Anthon van der Neut wrote: > The thing however was resolved by reading multiple smaller chunks indeed > 1Mb if the filesize exceeds 1Mb (in the latter case the original read() > is done. It's too c

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-11 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Andrew, as for memory reallocation issues, you may take a look at #3526 where someone has similar problems on SunOS. If nobody objects, I will close the present bug as invalid.

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-11 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: > My code regularly calculates the sha1 sum of 10.000 files and because > in another reuse of the code had to deal with files too big to fit in > memory I set a limit of 256Mb. Why don't you use a sensible buffer size, e.g. 1MB? Reading data
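
The suggestion above (read with a modest buffer instead of a 256 MB size argument) can be sketched roughly as follows. This is only an illustration, not code from the thread: `sha1_of_file` and `CHUNK_SIZE` are hypothetical names, and the 1 MB figure is simply the value proposed in the comment.

```python
import hashlib

# Assumption: 1 MB, the buffer size suggested in the comment above.
CHUNK_SIZE = 1024 * 1024


def sha1_of_file(path, chunk_size=CHUNK_SIZE):
    """Return the hex SHA-1 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # never asks for more than chunk_size bytes
            if not chunk:               # empty bytes object means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

Because the hash object is updated incrementally, peak memory stays near `chunk_size` regardless of file size, so the same function should serve both the many small files and the occasional file too big to fit in memory.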

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-09-11 Thread Anthon van der Neut
Anthon van der Neut <[EMAIL PROTECTED]> added the comment: FWIW: I have performance problems on Windows XP (SP2) with Python 2.5.1 that could be caused by this behaviour. My code regularly calculates the sha1 sum of 10.000 files and because in another reuse of the code had to deal with files too

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Andrew Dalke
Andrew Dalke <[EMAIL PROTECTED]> added the comment: FreeBSD is what my hosting provider uses. Freebsd.org calls 2.6 "legacy" but the latest update was earlier this year. There is shared history with Macs. I don't know the details though. I just point out that the problem isn't only on Darwin

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: On Saturday 09 August 2008 at 11:26 +, Andrew Dalke wrote: > Mind you, I also get the problem on FreeBSD 2.6 so it isn't Darwin > specific. Darwin and the BSDs supposedly share a lot of common stuff. But FreeBSD 2.6 is a bit old, isn't it

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Andrew Dalke
Andrew Dalke <[EMAIL PROTECTED]> added the comment: You're right. I mistook the string implementation for the list one which does keep a preallocated section in case of growth. Strings of course don't grow so there's no need for that. I tracked the memory allocation all the way down to obma

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Perhaps. I'm under Linux. However, at the end of the file_read() implementation in fileobject.c, you can find the following lines: if (bytesread != buffersize) _PyString_Resize(&v, bytesread); Which means that the string *is* resized at

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Andrew Dalke
Andrew Dalke <[EMAIL PROTECTED]> added the comment: I tested it with Python 2.5 on a Mac, Python 2.5 on FreeBSD, and Python 2.6b2+ (from SVN as of this morning) on a Mac. Perhaps the memory allocator on your machine is making a promise it can't keep?

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-09 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: I can't reproduce, your code snippet works fine. What Python version is it? -- nosy: +pitrou

[issue3531] file read preallocs 'size' bytes which can cause memory problems

2008-08-08 Thread Andrew Dalke
New submission from Andrew Dalke <[EMAIL PROTECTED]>: I wrote a buggy PNG parser which ended up doing several file.read(large value). It causes a MemoryError, which was strange because the file was only a few KB long. I tracked it down to the implementation of read(). When given a size hint
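
One way a parser could avoid passing an untrusted length straight to read() is to clamp the request to the bytes actually left in the file. The following is a rough sketch under stated assumptions, not the reporter's code: `read_at_most` is a hypothetical helper, and it assumes a seekable binary file backed by a real file descriptor. Whether a large size argument still triggers a large allocation depends on the Python version and the platform allocator.

```python
import os


def read_at_most(f, size):
    """Read up to `size` bytes, never requesting more than remains in the file.

    Assumes `f` is a seekable binary file object with a real file descriptor.
    """
    remaining = os.fstat(f.fileno()).st_size - f.tell()
    # max(0, ...) guards against a negative argument, which read() would
    # interpret as "read everything".
    return f.read(max(0, min(size, remaining)))
```

With this helper, a corrupt length field of several gigabytes in a file of a few KB results in a request for only the few KB that remain.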