[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-10 Thread STINNER Victor
STINNER Victor added the comment: On Thursday 10 June 2010 at 14:02:34, you wrote: > Committed in 3.2 (r81869), blocked in 3.1 (r81870). This commit introduced a regression: ISO-8859-15 was seen as an alias of ISO-8859-1 because the normalized string was truncated. Fixed in r81871 (blocked in
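
To illustrate the kind of regression described above, here is a minimal Python sketch of how truncating a normalized encoding name can make ISO-8859-15 compare equal to ISO-8859-1. It is not the CPython code: the function names, the buffer size, and the "reject names that do not fit" fix are all assumptions made for illustration, since the comment only says that the normalized string was truncated.

```python
# Hypothetical model of a fixed-size C buffer; not the actual CPython code.
SHORTCUT = "iso-8859-1"
BUFSIZE = len(SHORTCUT) + 1  # assumed size: room for the name plus a NUL byte


def normalize_truncating(name: str, bufsize: int = BUFSIZE) -> str:
    """Lowercase, map '_' to '-', and silently drop what does not fit."""
    out = []
    for ch in name:
        if len(out) >= bufsize - 1:
            break  # the bug: extra characters are discarded without notice
        out.append("-" if ch == "_" else ch.lower())
    return "".join(out)


def normalize_checked(name: str, bufsize: int = BUFSIZE):
    """Same normalization, but refuse names that do not fit (one possible fix)."""
    if len(name) > bufsize - 1:
        return None  # caller should fall back to the regular codec lookup
    return normalize_truncating(name, bufsize)


# With truncation, the trailing "5" is lost and the shortcut matches wrongly.
buggy_match = normalize_truncating("ISO-8859-15") == SHORTCUT
# With the length check, the name is rejected and no shortcut is taken.
fixed_match = normalize_checked("ISO-8859-15") == SHORTCUT
```

In this sketch `buggy_match` is true and `fixed_match` is false, while a name that really is ISO-8859-1 (for example `ISO_8859_1`) still matches the shortcut in both variants.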

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-10 Thread STINNER Victor
STINNER Victor added the comment: Committed in 3.2 (r81869), blocked in 3.1 (r81870). -- Oops, I don't know why I wrote utf-16 and utf-32. I don't want to add them to the shortcuts. -- resolution: -> fixed status: open -> closed

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-10 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > STINNER Victor added the comment: > >> Note that these shortcuts bypass the codec registry logic. > > Yes, but it's already the case without my patch. I don't think that it's > really useful to override latin1, utf-8, utf-16, ut

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-07 Thread STINNER Victor
STINNER Victor added the comment: > Note that these shortcuts bypass the codec registry logic. Yes, but it's already the case without my patch. I don't think that it's really useful to override latin1, utf-8, utf-16, utf-32 or mbcs. I prefer a faster Python :-) > we have to be careful about

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > STINNER Victor added the comment: > >> the shortcuts were meant for Python internal use only > > str.encode() calls PyUnicode_AsEncodedString() and bytes.decode() calls > PyUnicode_Decode(), so it is not for internal use only.

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-07 Thread STINNER Victor
STINNER Victor added the comment: > the shortcuts were meant for Python internal use only str.encode() calls PyUnicode_AsEncodedString() and bytes.decode() calls PyUnicode_Decode(), so it is not for internal use only. E.g. "text".encode("ASCII") doesn't use the fastpath.
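
As a small illustration of the point about spelling, the snippet below shows that the regular codec lookup accepts different spellings of the same encoding name and resolves them to one codec. It only exercises the public `codecs` module; whether a given spelling takes the C-level fast path is an implementation detail that cannot be observed this way, so nothing here should be read as a statement about which path is used.

```python
import codecs

# Different spellings of the same encoding resolve to the same codec
# through the regular (slow path) codec lookup.
spellings = ("ascii", "ASCII", "Ascii")
resolved = {spelling: codecs.lookup(spelling).name for spelling in spellings}

# The encoded result is identical regardless of how the name is spelled;
# only the route taken inside the interpreter may differ.
encoded = ["text".encode(spelling) for spelling in spellings]
```

All three spellings resolve to the codec named `ascii`, and every entry of `encoded` is `b"text"`.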

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-07 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: STINNER Victor wrote: > > New submission from STINNER Victor : > > PyUnicode_Decode() and PyUnicode_AsEncodedString() directly call the built-in > decoders/encoders for some known encodings (e.g. "utf-8"), instead of using > the slow path (call PyCodec_Decode

[issue8922] Improve encoding shortcuts in PyUnicode_AsEncodedString()

2010-06-06 Thread STINNER Victor
New submission from STINNER Victor : PyUnicode_Decode() and PyUnicode_AsEncodedString() directly call the built-in decoders/encoders for some known encodings (e.g. "utf-8"), instead of using the slow path (call PyCodec_Decode() / PyCodec_Encode()). PyUnicode_Decode() does normalize the encoding na
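
The fast path / slow path split described in this submission can be sketched in Python as follows. This is an illustrative model only, not the C implementation: the shortcut table, the `normalize()` rule (lowercase, underscores to hyphens) and the function name `encode_with_shortcuts` are assumptions chosen for the example.

```python
import codecs

# Hypothetical shortcut table: normalized name -> built-in encoder.
# Each encoder returns a (bytes, length_consumed) tuple.
_SHORTCUTS = {
    "utf-8": codecs.utf_8_encode,
    "latin-1": codecs.latin_1_encode,
    "ascii": codecs.ascii_encode,
}


def normalize(name: str) -> str:
    """Assumed normalization: lowercase and treat '_' like '-'."""
    return name.lower().replace("_", "-")


def encode_with_shortcuts(text: str, encoding: str = "utf-8",
                          errors: str = "strict") -> bytes:
    """Use a built-in encoder for known names, else the codec registry."""
    fast = _SHORTCUTS.get(normalize(encoding))
    if fast is not None:
        return fast(text, errors)[0]  # fast path: no registry lookup
    return codecs.encode(text, encoding, errors)  # slow path: full lookup
```

For example, `encode_with_shortcuts("abc", "UTF_8")` is served from the table after normalization, while `encode_with_shortcuts("abc", "cp1252")` falls through to `codecs.encode()`. The trade-off discussed in the thread is visible here: names served from the table never reach the codec registry.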