Re: [Python-Dev] PEP 393 decode() oddity

2012-03-26 Thread Serhiy Storchaka
25.03.12 23:55, mar...@v.loewis.de wrote: The results are fairly stable (±0.1 µsec) from run to run. It looks like a funny thing. This is not surprising. Thank you. Indeed, it is logical. I looked at the code and do not see how to speed it up.

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-26 Thread Serhiy Storchaka
27.03.12 01:04, Serhiy Storchaka wrote: 26.03.12 01:28, Victor Stinner wrote: loop in cascade of independent loops which fall back onto each other (as you have already done in utf8_scanner). Sorry. Not you. Antoine Pitrou.
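
The "cascade of independent loops which fall back onto each other" can be illustrated with a minimal sketch. It is not the code of utf8_scanner or of any CPython decoder; the function name `scan_width` and the use of a list of already-decoded code points are assumptions made for illustration only. Each loop handles one storage width and hands over to the next loop as soon as it meets a code point it cannot represent, so the cheap case is never re-checked.

```python
def scan_width(code_points):
    """Return the storage width in bytes (1, 2 or 4) needed for code_points.

    Hypothetical helper: the input is a sequence of integer code points,
    standing in for what a real decoder would produce from encoded bytes.
    """
    n = len(code_points)
    i = 0

    # Loop 1: stay here as long as every code point fits in one byte.
    while i < n and code_points[i] < 0x100:
        i += 1
    if i == n:
        return 1

    # Loop 2: entered only once loop 1 has given up; continues from the
    # same position and checks only the 2-byte limit.
    while i < n and code_points[i] < 0x10000:
        i += 1
    if i == n:
        return 2

    # Loop 3: the remaining data needs 4 bytes; only validate the range.
    while i < n:
        if code_points[i] > 0x10FFFF:
            raise ValueError("code point out of Unicode range")
        i += 1
    return 4
```

For example, `scan_width([0x20, 0x20AC])` leaves the first loop at the euro sign and finishes in the second loop.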

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-26 Thread Serhiy Storchaka
26.03.12 01:28, Victor Stinner wrote: Cool, Python 3.3 is *much* faster to decode pure ASCII :-) It is even faster on large data. 1000 characters is not enough to completely neutralize the constant costs of the function calls. Python 3.3 is really cool. encoding string

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread Victor Stinner
Cool, Python 3.3 is *much* faster to decode pure ASCII :-)

> encoding  string                 2.7   3.2   3.3
>
> ascii     " " * 1000             5.4   5.3   1.2

4.5x faster than Python 2 here.

> utf-8     " " * 1000             6.7   2.4   2.1

3.2x faster

It's cool because in practice, a lot

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread martin
Anyone can test.

$ ./python -m timeit -s 'enc = "latin1"; import codecs; d = codecs.getdecoder(enc); x = ("\u0020" * 10).encode(enc)' 'd(x)'
1 loops, best of 3: 59.4 usec per loop
$ ./python -m timeit -s 'enc = "latin1"; import codecs; d = codecs.getdecoder(enc); x = ("\u0080" * 1000
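
The same kind of measurement can be scripted with the standard `timeit` module instead of the command line. This is a sketch, not the script used in the thread: the helper name `time_decode`, the input lengths, and the repeat counts are assumptions chosen for illustration. The lambda wrapper adds a small constant call overhead to every measurement.

```python
import codecs
import timeit


def time_decode(encoding, char, length, number=10000):
    """Return the best observed time per decode call, in microseconds.

    Hypothetical helper mirroring `python -m timeit`: the data is built
    once, then only the decoder call is timed.
    """
    decoder = codecs.getdecoder(encoding)
    data = (char * length).encode(encoding)
    timer = timeit.Timer(lambda: decoder(data))
    # Take the minimum of three runs, as timeit's "best of 3" does.
    best = min(timer.repeat(repeat=3, number=number))
    return best / number * 1e6


if __name__ == "__main__":
    # Lengths are arbitrary illustration values, not the thread's exact inputs.
    for char in ("\u0020", "\u0080"):
        for length in (10, 1000):
            usec = time_decode("latin1", char, length)
            print(f"latin1 U+{ord(char):04X} * {length}: {usec:.3f} usec per call")
```

Comparing a short and a long input for the same character gives a rough idea of how much of the time is constant call overhead rather than per-character work.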

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread Paul Moore
On 25 March 2012 19:51, Serhiy Storchaka wrote:
> Anyone can test.
>
> $ ./python -m timeit -s 'enc = "latin1"; import codecs; d = codecs.getdecoder(enc); x = ("\u0020" * 10).encode(enc)' 'd(x)'
> 1 loops, best of 3: 59.4 usec per loop
> $ ./python -m timeit -s 'enc = "latin1"; import co

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread Serhiy Storchaka
25.03.12 20:01, Antoine Pitrou wrote: The general problem with decoding is that you don't know up front what width (1, 2 or 4 bytes) is required for the result. The solution is either to compute the width in a first pass (and decode in a second pass), or decode in a single pass and enlarge
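
The two strategies named here can be sketched side by side. This is a simplified illustration, not CPython's implementation: the function names are hypothetical, the input is a list of integer code points rather than encoded bytes, and the result buffer is a plain `bytearray` with little-endian fixed-width units.

```python
def width_for(cp):
    """Smallest unit width in bytes (1, 2 or 4) that can hold code point cp."""
    if cp < 0x100:
        return 1
    if cp < 0x10000:
        return 2
    return 4


def decode_two_pass(code_points):
    """First pass finds the width, second pass fills a buffer of that width."""
    width = max((width_for(cp) for cp in code_points), default=1)
    buf = bytearray(width * len(code_points))
    for i, cp in enumerate(code_points):
        buf[i * width:(i + 1) * width] = cp.to_bytes(width, "little")
    return width, bytes(buf)


def widen(buf, old, new):
    """Copy a buffer of old-width units into a buffer of new-width units."""
    out = bytearray()
    for i in range(0, len(buf), old):
        unit = int.from_bytes(buf[i:i + old], "little")
        out += unit.to_bytes(new, "little")
    return out


def decode_single_pass(code_points):
    """Start narrow and enlarge the buffer when a wider code point appears."""
    width = 1
    buf = bytearray()
    for cp in code_points:
        needed = width_for(cp)
        if needed > width:
            # Everything written so far has to be copied to the wider layout.
            buf = widen(buf, width, needed)
            width = needed
        buf += cp.to_bytes(width, "little")
    return width, bytes(buf)
```

Both functions return the same `(width, buffer)` pair. The two-pass version reads the input twice but never copies the output, while the single-pass version reads once and pays for a copy each time it has to enlarge.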

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread Martin v. Löwis
> How serious a problem is this for the Python 3.3 release? I could do the
> optimization, if someone is not working on this already.

I think the people who did the original implementation (Torsten, Victor, and myself) are done with optimizations. So: contributions are welcome. I'm not aware of an

Re: [Python-Dev] PEP 393 decode() oddity

2012-03-25 Thread Antoine Pitrou
Hi,

On Sun, 25 Mar 2012 19:25:10 +0300 Serhiy Storchaka wrote:
> But decoding is not so good.

The general problem with decoding is that you don't know up front what width (1, 2 or 4 bytes) is required for the result. The solution is either to compute the width in a first pass (and decode in