On Tue, Jul 17, 2012 at 2:57 PM, John O'Connor <jxo6...@rit.edu> wrote:
>>
>> The second approach is consistently 10-20% faster than the first one
>> (depending on input) for trunk Python 3.3
>>
>
> I think the difference is that StringIO spends extra time reallocating
> memory during the write loop as it grows, whereas bytes.join computes
> the allocation size first since it already knows the final length.

BytesIO is actually missing an optimisation that is already used in
StringIO: the StringIO C implementation uses a fragment accumulator
internally, and collapses that into a single string object when
getvalue() is called. BytesIO is still using the old
"resize-the-buffer-as-you-go" strategy, and thus ends up repeatedly
reallocating the buffer as the data sequence grows incrementally.

It should be optimised to work the same way StringIO does (which is
effectively the same way that the monkeypatched version works).
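
As a rough illustration of the two strategies (a pure-Python sketch only,
not the actual C implementation; FragmentWriter and both helper functions
are hypothetical names made up for this example), a fragment accumulator
keeps the written chunks in a list and joins them only when the value is
requested, so the final size is known before the single allocation:

```python
import io


class FragmentWriter:
    """Hypothetical write-only accumulator: collect chunks, join on demand."""

    def __init__(self):
        self._fragments = []

    def write(self, data):
        # Store the chunk; no buffer is resized at this point.
        self._fragments.append(bytes(data))
        return len(data)

    def getvalue(self):
        # Collapse the fragments into one bytes object, then cache the
        # result so repeated getvalue() calls do not join again.
        if len(self._fragments) > 1:
            self._fragments = [b"".join(self._fragments)]
        return self._fragments[0] if self._fragments else b""


def build_with_bytesio(chunks):
    # Incremental writes into an in-memory byte buffer.
    buf = io.BytesIO()
    for chunk in chunks:
        buf.write(chunk)
    return buf.getvalue()


def build_with_fragments(chunks):
    # Same interface, but the data is only concatenated once at the end.
    writer = FragmentWriter()
    for chunk in chunks:
        writer.write(chunk)
    return writer.getvalue()


chunks = [b"spam", b"eggs", b"ham"] * 1000
assert build_with_bytesio(chunks) == build_with_fragments(chunks)
```

Both helpers produce identical bytes; any speed difference depends on the
interpreter version and the input, so it is worth timing with timeit
rather than assuming a particular figure.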

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev