[issue5902] Stricter codec names

Alexander Belopolsky Wed, 23 Feb 2011 20:01:02 -0800

Alexander Belopolsky <belopol...@users.sourceforge.net> added the comment:


Ezio and I discussed the implementation of alias lookup on IRC, and neither of 
us was able to point to the function that strips non-alphanumeric 
characters from encoding names.

It turns out that there are three "normalize" functions that are successively 
applied to the encoding name during evaluation of str.encode/str.decode.

1. normalize_encoding() in unicodeobject.c
2. normalizestring() in codecs.c
3. normalize_encoding() in encodings/__init__.py

Each performs a slightly different transformation and only the last one strips 
non-alphanumeric characters.
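
The combined effect of these steps can be observed from Python without
reading the C sources. The following is only a rough sketch: it does not
reproduce the two C functions, it just calls codecs.lookup() and the
pure-Python encodings.normalize_encoding() from the standard library. The
helper name canonical_name and the list of spellings are made up for
illustration.

```python
import codecs
import encodings


def canonical_name(spelling):
    """Return the canonical codec name that a given spelling resolves to.

    Hypothetical helper for illustration; it simply wraps codecs.lookup().
    """
    return codecs.lookup(spelling).name


# Different spellings of the same encoding (illustrative list).
spellings = ["utf-8", "UTF-8", "utf_8", "UTF8"]
resolved = {s: canonical_name(s) for s in spellings}

# The third function in the list above is plain Python and can be called
# directly; it turns the hyphen into an underscore.
normalized = encodings.normalize_encoding("utf-8")

for spelling, name in resolved.items():
    print(spelling, "->", name)
print("normalize_encoding('utf-8') ->", normalized)
```

All four spellings resolve to the same codec, which is the looseness this
issue is about.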

The complexity of codec lookup is comparable with that of the import mechanism!

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue5902>
_______________________________________