[issue40845] idna encoding fails for Cherokee symbols

2020-06-02 Thread Christian Heimes
Christian Heimes added the comment: There are two IDNA standards. Python's standard library only provides IDNA 2003 and does not support IDNA 2008.

# IDNA 2003
>>> '\u13e3\u13b3\u13a9'.encode('idna')
b'xn--tz9ata7l'
# idna package with IDNA 2008
>>> idna.encode('\u13e3\u13b3\u13a9')
b'xn--f9dt
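The comparison in this comment can be tried side by side with a small helper. This is only a sketch: the function name `encode_both` is made up for illustration, the third-party `idna` package may not be installed, and either encoder may reject a label depending on the Python or package version, so both cases are reported as None rather than assumed away.

```python
def encode_both(label):
    """Encode a label with the stdlib codec and, if available, the idna package.

    Hypothetical helper for illustration. A value of None means the encoder
    was unavailable or rejected the label.
    """
    results = {}

    # Standard library codec (IDNA 2003, per the comment above).
    try:
        results["stdlib"] = label.encode("idna")
    except UnicodeError:
        results["stdlib"] = None

    # Third-party idna package (IDNA 2008); optional dependency.
    try:
        import idna
    except ImportError:
        results["idna_package"] = None
    else:
        try:
            results["idna_package"] = idna.encode(label)
        except UnicodeError:
            # idna.IDNAError derives from UnicodeError.
            results["idna_package"] = None

    return results


if __name__ == "__main__":
    print(encode_both("\u13e3\u13b3\u13a9"))
```

Printing the result for the Cherokee string shows whatever the installed versions produce; the outputs quoted in the comment are the ones reported in the thread.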

[issue40845] idna encoding fails for Cherokee symbols

2020-06-02 Thread SilentGhost
SilentGhost added the comment: I took it from your msg370615:
> of even more reliable
> Execute '\u13e3\u13b3\u13a9'.encode('idna')

[issue40845] idna encoding fails for Cherokee symbols

2020-06-02 Thread Roman Akopov
Roman Akopov added the comment: This is how I extract data from Common Locale Data Repository v37. The script assumes common\main as the working directory.

from os import walk
from xml.etree import ElementTree
en_root = ElementTree.parse('en.xml')
for (dirpath, dirnames, filenames) in walk('.'):
    for f

[issue40845] idna encoding fails for Cherokee symbols

2020-06-02 Thread SilentGhost
SilentGhost added the comment: For the record:

>>> 'ꮳꮃꭹ'.encode('punycode')
b'tz9ata7l'
>>> '\u13e3\u13b3\u13a9'.encode('punycode')
b'f9dt7l'

Also, your unicode-escaped string is an upper-cased version of the first string.

nosy: +SilentGhost
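The case relationship pointed out here can be checked directly. The sketch below uses only the two strings and the punycode values quoted in this comment; the variable names and the `code_points` helper are illustrative, and the case mapping assumes a Python build whose Unicode database includes lowercase Cherokee (Unicode 8.0 or later).

```python
# The two strings from the thread, written as escapes so the code points are visible.
upper = "\u13e3\u13b3\u13a9"   # the unicode-escaped string from the report
lower = "\uabb3\uab83\uab79"   # 'ꮳꮃꭹ', the literal string from the report


def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The escaped string is the upper-cased form of the literal one.
same_letters = upper.lower() == lower and lower.upper() == upper

# Plain punycode does no case mapping, so the two strings encode differently.
puny_upper = upper.encode("punycode")
puny_lower = lower.encode("punycode")

if __name__ == "__main__":
    print(code_points(upper), puny_upper)
    print(code_points(lower), puny_lower)
    print("case variants of each other:", same_letters)
```

The lower-cased string's punycode, b'tz9ata7l', is the payload of the b'xn--tz9ata7l' result quoted elsewhere in the thread for the standard library codec.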

[issue40845] idna encoding fails for Cherokee symbols

2020-06-02 Thread Roman Akopov
New submission from Roman Akopov: For a specific Cherokee string of three symbols, b'\\u13e3\\u13b3\\u13a9', generating the punycode representation fails. What steps will reproduce the problem? Execute 'ꮳꮃꭹ'.encode('idna') or, even more reliably, execute '\u13e3\u13b3\u13a9'.encode('idna'). What is t