[issue12100] Incremental encoders of CJK codecs reset the codec at each call to encode()

2012-01-31 Thread Kang-Hao (Kenny) Lu
Changes by Kang-Hao (Kenny) Lu : -- nosy: +kennyluck ___ Python tracker <http://bugs.python.org/issue12100> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue13913] utf-8 or utf8 or utf-8 (codec display name inconsistency)

2012-01-31 Thread Kang-Hao (Kenny) Lu
Kang-Hao (Kenny) Lu added the comment: > and the table on the documentation of the codec module suggests *utf_8* > as the name of the codec, which I believe to be equivalent to "utf_8" > because '-' is not a valid character of an identifier. typo: equivalent to &

[issue12892] UTF-16 and UTF-32 codecs should reject (lone) surrogates

2012-01-31 Thread Kang-Hao (Kenny) Lu
Kang-Hao (Kenny) Lu added the comment: > The followings are on my TODO list, although this patch doesn't depend > on any of these and can be reviewed and landed separately: > * make the surrogatepass error handler work for utf-16 and utf-32. (I > should be able to fin

[issue13916] disallow the "surrogatepass" handler for non utf-* encodings

2012-01-31 Thread Kang-Hao (Kenny) Lu
New submission from Kang-Hao (Kenny) Lu : Currently the "surrogatepass" handler always encodes the surrogates in UTF-8 and hence the behavior for, say, "\udc80".encode("latin-1", "surrogatepass").decode("latin-1") might be unexpected and
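The round trip described in this report can be tried with a short script. This is only a sketch: the variable names are illustrative, and the result for latin-1 depends on the Python version (releases that still have the reported behavior return a mangled string, while later releases may raise instead), so the script handles both outcomes rather than assuming one.

```python
# A lone surrogate, as in the report.
lone = "\udc80"

# With UTF-8, "surrogatepass" encodes the surrogate as a 3-byte sequence
# and decodes it back to the same code point.
utf8_bytes = lone.encode("utf-8", "surrogatepass")
round_trip = utf8_bytes.decode("utf-8", "surrogatepass")
print(utf8_bytes, round_trip == lone)

# With a non-UTF encoding the outcome is version dependent (assumption:
# either a mangled string comes back or a UnicodeError is raised).
try:
    latin1_result = lone.encode("latin-1", "surrogatepass").decode("latin-1")
except UnicodeError as exc:
    latin1_result = None
    print("latin-1 refused the surrogate:", exc)
else:
    print("latin-1 round trip gave:", ascii(latin1_result))
```

In neither case does the latin-1 round trip return the original string, which is the surprise the report points at.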

[issue13913] utf-8 or utf8 or utf-8 (codec display name inconsistency)

2012-01-31 Thread Kang-Hao (Kenny) Lu
Changes by Kang-Hao (Kenny) Lu : -- type: -> behavior ___ Python tracker <http://bugs.python.org/issue13913> ___

[issue13913] utf-8 or utf8 or utf-8 (codec display name inconsistency)

2012-01-31 Thread Kang-Hao (Kenny) Lu
New submission from Kang-Hao (Kenny) Lu : Since Python 3.2.2 (I don't have earlier version to test with), >>> "\udc80".encode("utf-8") UnicodeEncodeError: *utf-8* codec can't encode character '\udc80'... but >>> b"\xff".dec
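To check which codec name a given interpreter reports, the two failing calls from the report can be wrapped as below. This is a sketch only: `codec_name_in_error` is a hypothetical helper, not part of the tracker discussion, and the names printed depend on the Python version in use.

```python
import codecs


def codec_name_in_error(action):
    """Run action() and return the codec name carried by the UnicodeError."""
    try:
        action()
    except UnicodeError as exc:
        # UnicodeEncodeError and UnicodeDecodeError both expose .encoding.
        return exc.encoding
    return None


encode_name = codec_name_in_error(lambda: "\udc80".encode("utf-8"))
decode_name = codec_name_in_error(lambda: b"\xff".decode("utf-8"))
print("encode error names the codec as:", encode_name)
print("decode error names the codec as:", decode_name)

# The registry maps the different spellings to one canonical name.
canonical = {s: codecs.lookup(s).name for s in ("utf-8", "utf8", "utf_8")}
print(canonical)
```

If the two printed names differ, the interpreter still shows the inconsistency described in the report.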

[issue7856] cannot decode from or encode to big5 \xf9\xd8

2012-01-31 Thread Kang-Hao (Kenny) Lu
Changes by Kang-Hao (Kenny) Lu : -- nosy: +kennyluck ___ Python tracker <http://bugs.python.org/issue7856> ___

[issue12892] UTF-16 and UTF-32 codecs should reject (lone) surrogates

2012-01-29 Thread Kang-Hao (Kenny) Lu
Kang-Hao (Kenny) Lu added the comment: Attached patch does the following beyond what the patch from haypo does: * call the error handler * reject 0xd800~0xdfff when decoding utf-32 The followings are on my TODO list, although this patch doesn't depend on any of these and can be rev
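Whether a particular interpreter already rejects lone surrogates can be probed with a few calls. The sketch below does not reproduce the attached patch; the helper names are made up for illustration, and the UTF-16 and UTF-32 results depend on whether the running Python includes such a change.

```python
def encoder_rejects_lone_surrogate(encoding):
    """Return True if encoding U+D800 with the strict handler fails."""
    try:
        "\ud800".encode(encoding)
    except UnicodeEncodeError:
        return True
    return False


def utf32_decoder_rejects_surrogate():
    """Return True if the UTF-32 decoder refuses the code unit 0xD800."""
    data = (0xD800).to_bytes(4, "little")
    try:
        data.decode("utf-32-le")
    except UnicodeDecodeError:
        return True
    return False


results = {
    name: encoder_rejects_lone_surrogate(name)
    for name in ("utf-8", "utf-16", "utf-32")
}
print(results)
print("utf-32 decoder rejects 0xD800:", utf32_decoder_rejects_surrogate())
```

UTF-8 is included as a baseline, since its encoder already refuses lone surrogates under the strict handler.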