Hi Jan,

hans wrote on Sat, Apr 30, 2016 at 01:08:12PM +0200:

> In the last snapshot, it seems, tmux does not do UTF8 input correctly,
> while xterm is fine. This used to work with the ~/.xsession below.
> 
> When typing non-ASCII characters in xterm or in a vim-in-an-xterm
> or a mutt-in-an-xterm, they appear OK. When in a tmux window,
> they look like garbage.
> 
> Interestingly, if I type some Czech text into /tmp/cz
> (using vim in an xterm, where it works), and then open
> the file with vim in tmux, the text there appears fine
> - only _new_ text typed within tmux looks broken.
> 
> Has anything changed in the way tmux handles UTF8?

Such generic questions are always hard to answer.
Yes, some things changed recently, but who knows whether
that is related?

> Is anyone else seeing this?

Trying to reproduce and then fix is a good idea.
However, i can't reproduce so far.

Here is what i did:

  $ cd /usr/src/usr.bin/tmux/
  $ make cleandir
  $ make obj
  $ make cleandir
  $ cvs up -dP
  $ make depend
  $ make
  $ doas make install
  $ tmux

And now, inside the tmux window, typing in accented characters
works fine for me, both on the ksh(1) command line and inside vim(1).

Obviously, i don't have a CZ keyboard; but this shouldn't make
a difference, or should it?

  schwarze@isnote $ setxkbmap -query
  rules:      base
  model:      pc105
  layout:     us
  options:    compose:ralt,altwin:left_meta_win

  schwarze@isnote $ locale
  LANG=
  LC_COLLATE="C"
  LC_CTYPE=en_US.UTF-8
  LC_MONETARY="C"
  LC_NUMERIC="C"
  LC_TIME="C"
  LC_MESSAGES="C"
  LC_ALL=
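
To compare your environment with the output above, here is a small
sketch of how the effective LC_CTYPE value is usually resolved.
It assumes the common POSIX precedence (LC_ALL, then LC_CTYPE, then
LANG, with empty values counting as unset).  The function names
effective_ctype and looks_utf8 are made up for this illustration,
and the sketch says nothing about what tmux itself checks.

```python
def effective_ctype(env):
    """Return the locale name that applies to LC_CTYPE for this environment."""
    # Assumed precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name, "")
        if value:  # an empty string is treated like an unset variable
            return value
    return "C"  # nothing set at all: fall back to the C locale


def looks_utf8(locale_name):
    """Rough check whether a name like 'en_US.UTF-8' has a UTF-8 codeset."""
    # Take the part after the first dot and drop any @modifier.
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.upper().replace("-", "") == "UTF8"


# The environment shown in the locale(1) output above.
env = {"LANG": "", "LC_CTYPE": "en_US.UTF-8", "LC_ALL": ""}
name = effective_ctype(env)
print(name, looks_utf8(name))
```

Passing dict(os.environ) instead of the hand-written dictionary,
once in a plain xterm and once inside tmux, would show whether the
two see the same settings.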

I don't have an ~/.xmodmaprc, i don't know what is in yours,
and i have no idea whether that's related.

> good line: ?????????????????? (vim in xterm)
> bad line : ???????????????????????????????????? (vim in tmux in xterm)

That's double encoding.  Here is information regarding the first
character from uniname(1):

  char byte UTF-32 encod name
     0    0 00011B C4 9B LATIN SMALL LETTER E WITH CARON

If you misinterpret that as U+00C4 U+009B and encode it again,
you get:

  char byte UTF-32 encod name
     0    0 0000C4 C3 84 LATIN CAPITAL LETTER A WITH DIAERESIS
     1    2 00009B C2 9B CONTROL SEQUENCE INTRODUCER
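
The same arithmetic can be replayed in a few lines of Python.  This
is only a sketch of the double encoding described above, not a claim
about where in your setup it happens; the variable names are made up
for the illustration.

```python
# U+011B LATIN SMALL LETTER E WITH CARON, the first character above.
original = "\u011b"

# Correct UTF-8 encoding: two bytes, C4 9B.
utf8_once = original.encode("utf-8")

# The mistake: treat each byte as a code point of its own
# (that is what decoding as Latin-1 does), giving U+00C4 U+009B.
misread = utf8_once.decode("latin-1")

# Encoding those two code points again yields four bytes: C3 84 C2 9B.
utf8_twice = misread.encode("utf-8")

# Undoing the damage means reversing the wrong step once.
repaired = utf8_twice.decode("utf-8").encode("latin-1").decode("utf-8")

print(utf8_once.hex(" "), "->", utf8_twice.hex(" "))
print(repaired == original)
```

The first line printed is "c4 9b -> c3 84 c2 9b", matching the two
uniname(1) tables, and the second is True.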

But i can't seem to reproduce the double encoding you report...

Am i doing what you suggest?

Yours,
  Ingo
