Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Walter Dörwald
On 05.10.2005 at 00:08, Martin v. Löwis wrote: > Walter Dörwald wrote: > >>> This array would have to be sparse, of course. >>> >> For encoding yes, for decoding no. >> > [...] > >> For decoding it should be sufficient to use a unicode string of >> length 256. u"\ufffd" could be used for "maps

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread M.-A. Lemburg
Martin v. Löwis wrote: >>Another option would be to generate a big switch statement in C >>and let the compiler decide about the best data structure. > > I would try to avoid generating C code at all costs. Maintaining the > build processes will just be a nightmare. We could automate this usi

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread jepler
The function in the module below, xlate.xlate, doesn't quite do what "".decode does (mostly in that characters that don't exist are always mapped to U+FFFD, instead of offering the various behaviors available to "".decode). It builds the fast decoding structure once per call, but when decoding 53kb of dat

[Python-Dev] Python 2.5 and ast-branch

2005-10-05 Thread Nick Coghlan
Guido van Rossum wrote: > On 10/4/05, Nick Coghlan <[EMAIL PROTECTED]> wrote: > >>I was planning on looking at your patch too, but I was waiting for an answer >>from Guido about the fate of the ast-branch for Python 2.5. Given that we have >>patches for PEP 342 and PEP 343 against the trunk, but a

Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-05 Thread Guido van Rossum
On 10/5/05, Nick Coghlan <[EMAIL PROTECTED]> wrote: > Anyway, the question is: What do we want to do with ast-branch? Finish > bringing it up to Python 2.4 equivalence, make it the HEAD, and only then > implement the approved PEPs (308, 342, 343) that affect the compiler? Or > implement the approv

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Hye-Shik Chang
On 10/5/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > Of course, a C version could use the same approach as > the unicodedatabase module: that of compressed lookup > tables... > > http://aggregate.org/TechPub/lcpc2002.pdf > > genccodec.py anyone ? > I had written a test codec for single b
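The "compressed lookup tables" idea mentioned above can be illustrated with a generic two-stage table for the sparse encoding direction (code point to byte). This is only a sketch of the general technique, written in Python 3 for readability; it is not the layout used by the unicodedatabase module or by any codec discussed in the thread, and the names `build_two_stage`, `lookup`, `BLOCK_BITS` and `UNMAPPED` are hypothetical.

```python
BLOCK_BITS = 8                 # assumed block size: 256 code points per block
BLOCK_SIZE = 1 << BLOCK_BITS
UNMAPPED = -1                  # hypothetical marker for "no byte for this code point"


def build_two_stage(encoding_map, max_code_point=0xFFFF):
    """Build (index, blocks) from a dict mapping code point -> byte value.

    Identical blocks are stored only once, so the many completely
    unmapped ranges share a single all-UNMAPPED block.
    """
    blocks = []        # unique blocks, each a tuple of BLOCK_SIZE entries
    block_ids = {}     # block tuple -> position in `blocks`
    index = []         # first stage: block number -> position in `blocks`
    for start in range(0, max_code_point + 1, BLOCK_SIZE):
        block = tuple(
            encoding_map.get(cp, UNMAPPED)
            for cp in range(start, start + BLOCK_SIZE)
        )
        if block not in block_ids:
            block_ids[block] = len(blocks)
            blocks.append(block)
        index.append(block_ids[block])
    return index, blocks


def lookup(cp, index, blocks):
    """Return the byte for code point `cp`, or UNMAPPED."""
    hi = cp >> BLOCK_BITS
    if hi >= len(index):
        return UNMAPPED
    return blocks[index[hi]][cp & (BLOCK_SIZE - 1)]


# Tiny illustrative mapping (not a real codec): 'A' -> 0x41, EURO SIGN -> 0x80
index, blocks = build_two_stage({0x41: 0x41, 0x20AC: 0x80})
```

With this toy mapping only three distinct blocks are stored for the whole range up to U+FFFF, which is the space saving the approach is after; a C version would use flat arrays instead of Python tuples.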

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Walter Dörwald
Martin v. Löwis wrote: > Tony Nelson wrote: > >>> For decoding it should be sufficient to use a unicode string of >>> length 256. u"\ufffd" could be used for "maps to undefined". Or the >>> string might be shorter and byte values greater than the length of >>> the string are treated as "maps to u
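The decoding proposal quoted above (a string of up to 256 characters indexed by byte value, with u"\ufffd" standing for an undefined mapping) can be sketched in pure Python as follows. This is a Python 3 illustration under stated assumptions, not the code of the patch discussed in the thread; `build_decoding_table` and `decode_with_table` are hypothetical names, and the choice of U+FFFD as the marker simply follows the quoted suggestion.

```python
UNDEFINED = "\ufffd"   # marker suggested in the quoted message


def build_decoding_table(decoding_map, size=256):
    """Turn a dict (byte value -> code point) into a lookup string.

    Byte values missing from the dict get the UNDEFINED marker.
    """
    return "".join(
        chr(decoding_map[b]) if b in decoding_map else UNDEFINED
        for b in range(size)
    )


def decode_with_table(data, table):
    """Decode `data` (bytes) by indexing `table` with each byte value.

    Bytes that fall past the end of a shorter table are treated as
    undefined, as the quoted proposal allows.
    """
    return "".join(
        table[b] if b < len(table) else UNDEFINED
        for b in data
    )


# Illustrative mapping only: 'A' and LATIN SMALL LETTER E WITH ACUTE
table = build_decoding_table({0x41: 0x41, 0xE9: 0xE9})
text = decode_with_table(b"A\xe9\x00", table)
```

The point of the design is that one string index replaces one dictionary lookup per byte, and no sparse structure is needed because a byte can only take 256 values.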

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread M.-A. Lemburg
Hye-Shik Chang wrote: > On 10/5/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > >>Of course, a C version could use the same approach as >>the unicodedatabase module: that of compressed lookup >>tables... >> >>http://aggregate.org/TechPub/lcpc2002.pdf >> >>genccodec.py anyone ? >> > > > I

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Martin v. Löwis
M.-A. Lemburg wrote: >>I would try to avoid generating C code at all costs. Maintaining the >>build processes will just be a nightmare. > > > We could automate this using distutils; however I'm not sure > whether this would then also work on Windows. It wouldn't. Regards, Martin _

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Martin v. Löwis
Walter Dörwald wrote: > OK, here's a patch that implements this enhancement to > PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 Looks nice! > Creating the decoding_map as a string should probably be done by > gencodec.py directly. This way the first import of the codec would be >

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread M.-A. Lemburg
Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> I would try to avoid generating C code at all costs. Maintaining the >>> build processes will just be a nightmare. >> >> >> >> We could automate this using distutils; however I'm not sure >> whether this would then also work on Windows. > > > I

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread M.-A. Lemburg
Martin v. Löwis wrote: > Walter Dörwald wrote: > >>OK, here's a patch that implements this enhancement to >>PyUnicode_DecodeCharmap(): http://www.python.org/sf/1313939 > > Looks nice! Indeed (except for the choice of the "map this character to undefined" code point). Hye-Shik, could you please

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Martin v. Löwis
M.-A. Lemburg wrote: >>It wouldn't. > > > Could you elaborate why not ? Using distutils on Windows is really > easy... The current build process for Windows simply doesn't provide it. You expect to select "Build/All" from the menu (or some such), and expect all code to be compiled. The VC build

Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-05 Thread Brett Cannon
To answer Nick's email here, I didn't respond to that initial email because it seemed specifically directed at Guido and not me. On 10/5/05, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On 10/5/05, Nick Coghlan <[EMAIL PROTECTED]> wrote: > > Anyway, the question is: What do we want to do with ast

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread M.-A. Lemburg
Martin v. Löwis wrote: > M.-A. Lemburg wrote: > >>> It wouldn't. >> >> >> >> Could you elaborate why not ? Using distutils on Windows is really >> easy... > > > The current build process for Windows simply doesn't provide it. > You expect to select "Build/All" from the menu (or some such), > and

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Trent Mick
[Martin v. Loewis wrote] > Maybe it is possible to hack up a project file to invoke distutils > as the build process, but no such project file is currently available, > nor is it known whether it is possible to create one. This is essentially what the "_ssl" project does, no? It defers to "build_

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Martin v. Löwis
Trent Mick wrote: > [Martin v. Loewis wrote] > >>Maybe it is possible to hack up a project file to invoke distutils >>as the build process, but no such project file is currently available, >>nor is it known whether it is possible to create one. > > > This is essentially what the "_ssl" project

Re: [Python-Dev] Unicode charmap decoders slow

2005-10-05 Thread Hye-Shik Chang
On 10/6/05, M.-A. Lemburg <[EMAIL PROTECTED]> wrote: > Hye-Shik, could you please provide some timeit figures for > the fastmap encoding ? > (before applying Walter's patch, charmap decoder) % ./python Lib/timeit.py -s "s='a'*53*1024; e='iso8859_10'; u=unicode(s, e)" "s.decode(e)" 100 loops, best
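A measurement of the same shape as the command line above can also be run from a script with the standard `timeit` module. The sketch below is a Python 3 adaptation (the original command uses Python 2's `unicode(s, e)`), so its timings are not comparable with the figures quoted in the thread; the variable names are illustrative.

```python
import timeit

# Same workload as the quoted command: 53 KiB of b'a' decoded as iso8859_10.
setup = "s = b'a' * 53 * 1024; e = 'iso8859_10'"
timer = timeit.Timer("s.decode(e)", setup=setup)

# Take the best of three runs of 100 loops each.
best = min(timer.repeat(repeat=3, number=100))
print("100 loops, best of 3: %.1f usec per loop" % (best / 100 * 1e6))
```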

[Python-Dev] Removing the block stack (was Re: PEP 343 and __with__)

2005-10-05 Thread Phillip J. Eby
At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote: >(anyone still thinking about removing the block stack?). I'm not any more. My thought was that it would be good for performance, by reducing the memory allocation overhead for frames enough to allow pymalloc to be used instead of the platform

Re: [Python-Dev] Removing the block stack (was Re: PEP 343 and __with__)

2005-10-05 Thread Neal Norwitz
On 10/5/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote: > At 09:50 AM 10/4/2005 +0100, Michael Hudson wrote: > >(anyone still thinking about removing the block stack?). > > I'm not any more. My thought was that it would be good for performance, by > reducing the memory allocation overhead for frames