[issue2523] binary buffered reading is quadratic

2008-07-28 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Following our discussion and Guido's answer on python-3000 (*), I committed a modified fix in r65264. (*) http://mail.python.org/pipermail/python-3000/2008-July/014466.html -- resolution: -> fixed status: open -> closed

[issue2523] binary buffered reading is quadratic

2008-07-23 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: > When I revised the patch I had a weak understanding of nonblocking I/O. > I thought the "exponential" reads were for nonblocking I/O, but I see > now that is nonsense. Fine, so it will make the patch simpler. As for non-blocking IO, I thi

[issue2523] binary buffered reading is quadratic

2008-07-22 Thread Alexandre Vassalotti
Alexandre Vassalotti <[EMAIL PROTECTED]> added the comment: Antoine wrote: > On Monday 21 July 2008 at 21:18 +, Martin v. Löwis wrote: > > IIUC, a read of the full requested size would achieve exactly that: on a > > non-blocking stream (IIUC), a read will always return > > min(bytes_avai

[issue2523] binary buffered reading is quadratic

2008-07-21 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: On Monday 21 July 2008 at 21:18 +, Martin v. Löwis wrote: > IIUC, a read of the full requested size would achieve exactly that: on a > non-blocking stream (IIUC), a read will always return > min(bytes_available, bytes_requested). Hmm,

[issue2523] binary buffered reading is quadratic

2008-07-21 Thread Martin v. Löwis
Martin v. Löwis <[EMAIL PROTECTED]> added the comment: >> max(buffer_size, n-avail) > > I mimicked the original logic rather than rethink the algorithm. I'm not > totally > sure what motivates the original logic but the purpose seems to be that > non-blocking streams can return at least a few

[issue2523] binary buffered reading is quadratic

2008-07-21 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: According to "Martin v. Löwis" <[EMAIL PROTECTED]>: > > Martin v. Löwis <[EMAIL PROTECTED]> added the comment: > > I don't understand the second loop (where n is given). If n is given, > there should be only a single read operation, using > > max

[issue2523] binary buffered reading is quadratic

2008-07-19 Thread Martin v. Löwis
Martin v. Löwis <[EMAIL PROTECTED]> added the comment: I don't understand the second loop (where n is given). If n is given, there should be only a single read operation, using max(buffer_size, n-avail) (i.e. the way it is in patch 2). In particular, if the stream is unbuffered, it shouldn't
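The single-read idea in the message above can be sketched roughly as follows. This is only an illustration, not the code from any of the patches: the helper name `read_n`, its arguments, and the way the leftover buffer is returned are all assumptions, and it assumes a blocking raw stream.

```python
import io

def read_n(raw, buffered, n, buffer_size=8192):
    """Hypothetical helper: serve n bytes using at most one raw read.

    `buffered` holds bytes already read from `raw` but not yet consumed.
    Returns (data, leftover), where leftover is the new buffer content.
    """
    avail = len(buffered)
    if n <= avail:
        # The request can be served from the buffer alone.
        return buffered[:n], buffered[n:]
    # One raw read of max(buffer_size, n - avail) bytes, as suggested above.
    chunk = raw.read(max(buffer_size, n - avail)) or b""
    data = buffered + chunk
    return data[:n], data[n:]

raw = io.BytesIO(b"abcdefghij")
first, leftover = read_n(raw, b"", 4, buffer_size=8)
second, leftover = read_n(raw, leftover, 6, buffer_size=8)
```

In this sketch a short raw read simply yields fewer than n bytes; whether a buffered reader should loop in that case is what the surrounding messages discuss.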

[issue2523] binary buffered reading is quadratic

2008-07-17 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: If nobody objects I'll commit Alexandre's patch in a few days (after beta 2 though).

[issue2523] binary buffered reading is quadratic

2008-06-10 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Yup. However, if you try it, you'll probably notice that it decreases performance of normal (blocking) reads as well :-) Anyway, non-blocking file objects are pretty much second-class citizens in Py3k right now, so my remark was theoretical.

[issue2523] binary buffered reading is quadratic

2008-06-09 Thread Alexandre Vassalotti
Alexandre Vassalotti <[EMAIL PROTECTED]> added the comment: Oh, that is simple to fix. You can round the value 2*avail down to a multiple of the block size by doing something like (2*avail) & ~(bksize-1) where bksize is a power of 2, or the less magic (2*avail//bksize) * bksize.
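As a quick check of the two expressions in the message above, here is a minimal sketch. The function names and sample values are made up for illustration. Note that both forms round down to a block boundary, and that the mask form only works when the block size is a power of two.

```python
def round_down_mask(value, bksize):
    # Bit-mask form; assumes bksize is a power of two.
    return value & ~(bksize - 1)

def round_down_div(value, bksize):
    # Division form; works for any positive bksize.
    return (value // bksize) * bksize

avail = 3000      # illustrative number of bytes already available
bksize = 4096     # illustrative block size (a power of two)

print(round_down_mask(2 * avail, bksize))  # 4096
print(round_down_div(2 * avail, bksize))   # 4096
```

When 2*avail is smaller than the block size, both forms give 0, so a caller would presumably need a lower bound on the read size.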

[issue2523] binary buffered reading is quadratic

2008-06-08 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Thanks for the fixes. By the way, I don't know much about non-blocking streams, but it seems to me that "optimal" non-blocking read() would require that the chunks we request from the OS are block-aligned, which is not the case currently (we use 2*

[issue2523] binary buffered reading is quadratic

2008-06-07 Thread Alexandre Vassalotti
Alexandre Vassalotti <[EMAIL PROTECTED]> added the comment: I reviewed the patch and I found a few bugs -- i.e., peek() was replacing the buffer content, read() wasn't written in consideration of non-blocking streams, the removal of the None check in BufferedRandom.read() was wrong. Here's an up

[issue2523] binary buffered reading is quadratic

2008-06-07 Thread Alexandre Vassalotti
Alexandre Vassalotti <[EMAIL PROTECTED]> added the comment: I am going to go through your patch as soon as I get the time -- i.e., later today or tomorrow morning.

[issue2523] binary buffered reading is quadratic

2008-06-07 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: I recommend not letting this issue rot too much :) Eating 20+ seconds to read the contents of a 10MB binary file in one pass is not very good marketing-wise, and the betas are coming soon...

[issue2523] binary buffered reading is quadratic

2008-05-25 Thread Gregory P. Smith
Changes by Gregory P. Smith <[EMAIL PROTECTED]>: -- nosy: +gregory.p.smith priority: -> high

[issue2523] binary buffered reading is quadratic

2008-05-08 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Some code relies on -1 being usable as the default value for read() (instead of None); this new patch conforms to this expectation. It fixes some failures in test_mailbox. Added file: http://bugs.python.org/file10222/binaryio2.patch

[issue2523] binary buffered reading is quadratic

2008-05-07 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Hi Alexandre, I first tried to use a (non-preallocated) bytearray object and, after trying several optimization schemes, I found out that the best one worked as well with an immutable bytes object :) I also found out that the bytes <-> bytear

[issue2523] binary buffered reading is quadratic

2008-05-07 Thread Alexandre Vassalotti
Alexandre Vassalotti <[EMAIL PROTECTED]> added the comment: I see that the code is still using the immutable bytes object for its buffer (which forces Python to create a new buffer every time it's modified). Also, I think it is worthwhile to check if using a pre-allocated bytearray object (i.e., byte

[issue2523] binary buffered reading is quadratic

2008-05-07 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: Here is a pure Python patch removing the quadratic behaviour and trying to make read operations generally faster. Here are some numbers: ./python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0)" "while f.read(11): pass" -> py3k without pat

[issue2523] binary buffered reading is quadratic

2008-05-06 Thread Alexandre Vassalotti
Changes by Alexandre Vassalotti <[EMAIL PROTECTED]>: -- nosy: +alexandre.vassalotti

[issue2523] binary buffered reading is quadratic

2008-04-09 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: By the way, a simple way to fix it would be to use a native BytesIO object (as provided by Alexandre's patch in #1751) rather than a str object for the underlying buffer.
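To illustrate the suggestion above, here is a rough sketch contrasting repeated concatenation onto an immutable object with accumulation into a BytesIO. It uses today's `io.BytesIO` and `bytes` rather than the 2008 py3k internals, and the function names and chunk size are assumptions made for the example, not code from the patch.

```python
import io

def read_all_concat(raw, chunk_size=8192):
    # Accumulate by concatenating immutable bytes objects. Each += can copy
    # everything read so far, which is where quadratic behaviour can come from.
    data = b""
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            return data
        data += chunk

def read_all_bytesio(raw, chunk_size=8192):
    # Accumulate into a growable in-memory buffer instead.
    buf = io.BytesIO()
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            return buf.getvalue()
        buf.write(chunk)

payload = b"x" * 100000
same = read_all_concat(io.BytesIO(payload)) == read_all_bytesio(io.BytesIO(payload))
```

Both functions return the same bytes; they differ only in how much copying happens while the data is being collected.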

[issue2523] binary buffered reading is quadratic

2008-03-31 Thread Antoine Pitrou
New submission from Antoine Pitrou <[EMAIL PROTECTED]>: In py3k, buffered binary IO can be quadratic when e.g. reading a whole file. This is a small test on 50KB, 100KB and 200KB files: -> py3k with buffering: ./python -m timeit -s "f = open('50KB', 'rb')" "f.seek(0); f.read()" 1000 loops, best