Bug#279000: Python curses bindings and UTF-8

Thomas Dickey Fri, 10 Feb 2006 03:49:01 -0800

On Fri, 10 Feb 2006, Peter Samuelson wrote:


[Peter Samuelson]

In the ISO-8859 family, bytes 0x80-0xbf are invalid - and the UTF-8
encoding of "Ä" is 0xc3 0x84.


Doh!  Of course I meant to say bytes 0x80-0x9f are invalid.  Anyway,
ncurses seems to reject that same range of bytes even when LC_CTYPE
indicates UTF-8.
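
To make the byte ranges above concrete, here is a minimal Python sketch (not part of the original exchange) that encodes U+00C4 and U+00C5 as UTF-8 and lists the bytes that land in 0x80-0x9f, the range ISO-8859 reserves for C1 control codes. The helper name `c1_bytes` is hypothetical and exists only for illustration.

```python
# 0x80-0x9f: reserved for C1 control codes in the ISO-8859 family,
# so a terminal library working in an 8-bit locale treats them as unprintable.
C1_RANGE = range(0x80, 0xA0)


def c1_bytes(text):
    """Return the UTF-8 bytes of `text` that fall in 0x80-0x9f (hypothetical helper)."""
    return [b for b in text.encode("utf-8") if b in C1_RANGE]


for ch in ("\u00c4", "\u00c5"):
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"0x{b:02x}" for b in encoded)
    flagged = ", ".join(f"0x{b:02x}" for b in c1_bytes(ch))
    print(f"U+{ord(ch):04X} -> {hex_bytes} (bytes in 0x80-0x9f: {flagged})")
```

In both cases the lead byte 0xc3 is outside the range, but the continuation byte (0x84 or 0x85) is inside it, which is why a byte-at-a-time interpretation rejects it.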


I pointed out in comp.lang.python two days ago that the problem
is in python:

   Testing this, and looking to see what's going on, I notice that python
   is doing a

           setlocale(LC_ALL, "C");

   before the addstr is actually called.  (ncurses never sets the locale;
   it calls setlocale in one place to ask what it is).

   That makes ncurses think it's not really doing UTF-8, of course.  What I
   see on the screen is the U+00C5 comes out with a box and a "~E" (the
   latter being ncurses' representation in POSIX for \0x85).
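
   A commonly suggested workaround on the Python side, sketched here as an
   assumption rather than something stated in this message, is to adopt the
   environment's locale explicitly before any curses call, so that ncurses
   sees a UTF-8 LC_CTYPE when it queries the locale. The function names
   `init_locale`, `draw` and `main` are hypothetical, and the sketch assumes
   Python 3 with a UTF-8 locale set in the environment.

```python
import curses
import locale


def init_locale():
    """Adopt the environment's locale instead of the default "C" locale."""
    try:
        # An empty string means "use LANG / LC_ALL / LC_CTYPE from the environment".
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale that is not installed; stay as we are.
        pass
    # Calling setlocale with no locale argument only queries the current setting.
    return locale.setlocale(locale.LC_CTYPE)


def draw(stdscr):
    # U+00C5 encodes to 0xc3 0x85 in UTF-8; it should display as one character
    # only if ncurses believes the locale is UTF-8.
    stdscr.addstr(0, 0, "\u00c5")
    stdscr.refresh()
    stdscr.getch()


def main():
    init_locale()         # must happen before curses is initialised
    curses.wrapper(draw)  # sets up and restores the terminal around draw()
```

   Calling `main()` from a real terminal runs the check; whether the character
   renders correctly also depends on ncurses being built with wide-character
   support.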

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
