Re: tmux vs UTF8

hans Mon, 02 May 2016 07:50:21 -0700

On May 01 19:10:03, [email protected] wrote:
> > In the last snapshot, it seems, tmux does not do UTF8 input correctly,
> > while xterm is fine. This used to work with the ~/.xsession below.
> > 
> > When typing non-ASCII in xterm or in a vim-in-an-xterm
> > or a mutt-in-an-xterm, they appear OK. When in a tmux window,
> > they look like garbage.
> > 
> > Interestingly, if I type some Czech text into /tmp/cz 
> > (using vim in an xterm, where it works), and then open
> > the file with vim in tmux, the text there appears fine
> > - only _new_ text typed within tmux looks broken.
> > 
> > Has anything changed in the way tmux handles UTF8?
> 
> Such generic questions are always hard to answer.
> Yes, some things changed recently, but who knows whether
> that is related?
> 
> > Is anyone else seeing this?
> 
> Trying to reproduce and then fix is a good idea.
> However, i can't reproduce so far.
> 
> Here is what i did:
> 
>   $ cd /usr/src/usr.bin/tmux/
>   $ make cleandir
>   $ make obj
>   $ make cleandir
>   $ cvs up -dP
>   $ make depend
>   $ make
>   $ doas make install
>   $ tmux


I have rebuilt my tmux from HEAD too now.

> And now, inside the tmux window, typing in accented characters
> works fine for me, both on the ksh(1) command line and inside vim(1).
> 
> Obviously, i don't have a CZ keyboard; but this shouldn't make
> the difference, or should it?

I don't think it should.

>   schwarze@isnote $ setxkbmap -query
>   rules:      base
>   model:      pc105
>   layout:     us
>   options:    compose:ralt,altwin:left_meta_win

>   schwarze@isnote $ locale   
>   LANG=
>   LC_COLLATE="C"
>   LC_CTYPE=en_US.UTF-8
>   LC_MONETARY="C"
>   LC_NUMERIC="C"
>   LC_TIME="C"
>   LC_MESSAGES="C"
>   LC_ALL=

hans@biblio:~$ setxkbmap -query 
rules:      base
model:      pc105
layout:     us,cz
options:    grp:shifts_toggle,grp_led:scroll

hans@biblio:~$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE=cs_CZ.UTF-8
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES=C
LC_ALL=


> I don't have an ~/.xmodmaprc, i don't know what is in yours,
> and i have no idea whether that's related.

Sorry for not including that:

$ cat .xmodmaprc                                                  
keycode  29 = y Y y Y leftarrow yen
keycode  52 = z Z z Z degree less

The default cz variant has y/z switched, I switch it back.
This is most probably not related.


> > good line: ?????????????????? (vim in xterm)
> > bad line : ???????????????????????????????????? (vim in tmux in xterm)
> 
> That's double encoding.  Here is information regarding the first
> character from uniname(1):
> 
>   char byte UTF-32 encod name
>      0    0 00011B C4 9B LATIN SMALL LETTER E WITH CARON

> If you misinterpret that as U+00C4 U+009B and encode it again,
> you get:
> 
>   char byte UTF-32 encod name
>      0    0 0000C4 C3 84 LATIN CAPITAL LETTER A WITH DIAERESIS
>      1    2 00009B C2 9B CONTROL SEQUENCE INTRODUCER

My bad again: I am not sure I typed the same keys in those two lines.

However, the first letter in the good line is the e with a caron;
and the first letter in the bad line is capital A with a diaeresis.
But capital A with a diaeresis is not what I wanted to type,
that's where the problem is. Could this misinterpretation
be what I am seeing?

Here I include a better test of the same:
the first line contains the input obtained by pressing the keys
2 to 0 using the cz keyboard in vim in an xterm. The second line
is the input obtained by pressing the same keys in vim in tmux in xterm:

ěščřžýáíé
ÄÅ¡ÄÅÅ¾Ã½Ã¡ÃÃ©
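
To check the double-encoding theory against these two lines, here is a
minimal Python sketch. It is only an illustration, not anything tmux or
xterm actually runs: the helper names double_encode and visible are made
up, and it assumes the bytes were misread as Latin-1 (ISO 8859-1), which
matches the U+00C4 U+009B reading quoted above. It also assumes that the
control and format characters produced along the way simply do not show
up on screen or in the mail.

```python
import unicodedata


def double_encode(text: str) -> str:
    """Encode as UTF-8, then misread every byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def visible(text: str) -> str:
    """Drop control (Cc) and format (Cf) characters, e.g. U+009B and U+00AD,
    on the assumption that they are not displayed."""
    return "".join(
        c for c in text if unicodedata.category(c) not in ("Cc", "Cf")
    )


good = "ěščřžýáíé"              # typed in vim in an xterm
mangled = double_encode(good)   # what double encoding would produce

# Bytes of the first letter after the second UTF-8 encoding.
print(mangled.encode("utf-8")[:4].hex(" "))

# The part of the mangled text that is actually displayed.
print(visible(mangled))
```

Under those assumptions, the second print gives exactly the second line
above, so the garbage is consistent with each UTF-8 byte being treated
as a separate character and encoded again.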


        Jan
