Re: Different names for Unicode codepoint

Eli Zaretskii Thu, 21 Apr 2016 13:17:05 -0700

> From: Lele Gaifax <[email protected]>
> Date: Thu, 21 Apr 2016 21:04:32 +0200
> Cc: [email protected]
> 
> is there a particular reason for the slightly different names that Emacs
> (version 25.0.92) and Python (version 3.6.0a0) give to a single Unicode 
> entity?


They don't.

> Just to mention one codepoint, ⋖ is called "LESS THAN WITH DOT" according to
> Emacs' C-x 8 RET TAB menu, while in Python:
> 
>     >>> import unicodedata
>     >>> unicodedata.name('⋖')
>     'LESS-THAN WITH DOT'
>     >>> print("\N{LESS THAN WITH DOT}")
>       File "<stdin>", line 1
>     SyntaxError: (unicode error) ...: unknown Unicode character name

Emacs shows both the "Name" and the "Old Name" properties of
characters as completion candidates, while Python evidently supports
only "Name".  If you type "C-x 8 RET LESS TAB", then you will see
among the completion candidates both "LESS THAN WITH DOT" and
"LESS-THAN WITH DOT".  The former is the "old name" (the Unicode 1.0
name) of this character, according to the Unicode Character Database
(which is where Emacs obtains the names and other properties of
characters).
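
A minimal Python sketch of the difference, for anyone who wants to check
it: it reads the current "Name" of the character and then tries to look
the character up by each spelling.  The helper name `resolve` is made up
for this illustration; only `unicodedata.name` and `unicodedata.lookup`
come from the standard library.  The failure for the old name is what
the session quoted above suggests, not something the library documents
in these terms.

```python
import unicodedata

ch = "\u22d6"  # U+22D6, the character discussed in this thread

# The current "Name" property, as shown in the quoted session.
current_name = unicodedata.name(ch)
print(current_name)

def resolve(name):
    """Hypothetical helper: return the character for NAME, or None if unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # unicodedata.lookup raises KeyError for names it does not know.
        return None

# The current name resolves back to the same character.
print(resolve("LESS-THAN WITH DOT") == ch)

# The old name is a completion candidate in Emacs; per the quoted session,
# Python does not accept it, so this is expected to print None.
print(resolve("LESS THAN WITH DOT"))
```

So code that takes character names from Emacs completion may need to
fall back to the current name when Python rejects the old one.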
