On Sun, May 01, 2016 at 07:32:09PM +0100, Nicholas Marriott wrote:
> Underscores mean tmux does not know that your terminal supports UTF-8
> which probably means LANG or LC_ALL are not exported correctly.

LANG and LC_ALL are exported with value "en_US.UTF-8". Exporting the
other LC_* variables doesn't make any difference. Setting TERM to
screen, screen-256color or rxvt (or anything else) doesn't seem to make
any difference either.

> On 1 May 2016 7:24 p.m., "Andreas Kusalananda Kähäri" <
> [email protected]> wrote:
>
> > On Sun, May 01, 2016 at 07:10:03PM +0200, Ingo Schwarze wrote:
> > > Hi Jan,
> > >
> > > hans wrote on Sat, Apr 30, 2016 at 01:08:12PM +0200:
> > >
> > > > In the last snapshot, it seems, tmux does not do UTF8 input correctly,
> > > > while xterm is fine. This used to work with the ~/.xsession below.
> > > >
> > > > When typing non-ascii in xterm or in a vim-in-an-xterm
> > > > or a mutt-in-an-xterm, they appear OK. When in a tmux window,
> > > > they look like garbage.
> > > >
> > > > Interestingly, if I type some Czech text into /tmp/cz
> > > > (using vim in an xterm, where it works), and then open
> > > > the file with vim in tmux, the text there appears fine
> > > > - only _new_ text typed within tmux looks broken.
> > > >
> > > > Has anything changed in the way tmux handles UTF8?
> > >
> > > Such generic questions are always hard to answer.
> > > Yes, some things changed recently, but who knows whether
> > > that is related?
> >
> > I'm also having problems with accented characters in tmux, but in my
> > case, I get underscores when I try to enter Swedish characters (??????,
> > hope that comes out right) or any accented characters, so I'm not
> > sure it's the same issue as Jan had. It's been like this for quite
> > some time (months, possibly since all the non-UTF/POSIX locales were
> > removed). It's not just characters that I type, but mutt-in-tmux shows
> > all accented characters in email as underscores too, as do less
> > and cat. Typing accented characters will actually insert the right
> > character into the document (although in vim-in-tmux, again, it's all
> > underscores), it's just the displaying of them that is wonky.
> >
> > I'm using tmux without X, over an SSH connection.
> >
> > $ locale
> > LANG=en_US.UTF-8
> > LC_COLLATE="C"
> > LC_CTYPE="en_US.UTF-8"
> > LC_MONETARY="C"
> > LC_NUMERIC="C"
> > LC_TIME="C"
> > LC_MESSAGES="en_US.UTF-8"
> > LC_ALL=en_US.UTF-8
> >
> > (using sv_SE.UTF-8 doesn't make any difference)
> >
> > Outside of tmux, it seems to work ok. This is on amd64, recompiled from
> > a checkout yesterday evening, running in a VM on VirtualBox.
> >
> > Cheers,
> > Andreas
> >
> > >
> > > > Is anyone else seeing this?
> > >
> > > Trying to reproduce and then fix is a good idea.
> > > However, i can't reproduce so far.
> > >
> > > Here is what i did:
> > >
> > > $ cd /usr/src/usr.bin/tmux/
> > > $ make cleandir
> > > $ make obj
> > > $ make cleandir
> > > $ cvs up -dP
> > > $ make depend
> > > $ make
> > > $ doas make install
> > > $ tmux
> > >
> > > And now, inside the tmux window, typing in accented characters
> > > works fine for me, both on the ksh(1) command line and inside vim(1).
> > >
> > > Obviously, i don't have a CZ keyboard; but this shouldn't make
> > > the difference, or should it?
> > >
> > > schwarze@isnote $ setxkbmap -query
> > > rules: base
> > > model: pc105
> > > layout: us
> > > options: compose:ralt,altwin:left_meta_win
> > >
> > > schwarze@isnote $ locale
> > > LANG=
> > > LC_COLLATE="C"
> > > LC_CTYPE=en_US.UTF-8
> > > LC_MONETARY="C"
> > > LC_NUMERIC="C"
> > > LC_TIME="C"
> > > LC_MESSAGES="C"
> > > LC_ALL=
> > >
> > > I don't have an ~/.xmodmaprc, i don't know what is in yours,
> > > and i have no idea whether that's related.
> > >
> > > > good line: ?????????????????? (vim in xterm)
> > > > bad line : ???????????????????????????????????? (vim in tmux in xterm)
> > >
> > > That's double encoding.
> > > Here is information regarding the first
> > > character from uniname(1):
> > >
> > >   char  byte  UTF-32   encod     name
> > >      0     0  00011B   C4 9B     LATIN SMALL LETTER E WITH CARON
> > >
> > > If you misinterpret that as U+00C4 U+009B and encode it again,
> > > you get:
> > >
> > >   char  byte  UTF-32   encod     name
> > >      0     0  0000C4   C3 84     LATIN CAPITAL LETTER A WITH DIAERESIS
> > >      1     2  00009B   C2 9B     CONTROL SEQUENCE INTRODUCER
> > >
> > > But i can't seem to reproduce the double encoding you report...
> > >
> > > Am i doing what you suggest?
> > >
> > > Yours,
> > > Ingo

[demime 1.01d removed an attachment of type application/pgp-signature which had a name of signature.asc]
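
For reference, the double encoding Ingo describes above can be reproduced
in a few lines. This is only a sketch in Python, not something from the
thread and not what tmux does internally. It assumes the misinterpretation
amounts to reading each UTF-8 byte as its own code point, which is what a
Latin-1 decode does. The variable names are made up for illustration.

```python
# Illustrative sketch of double encoding; assumes a Latin-1 misread.
original = "\u011b"  # LATIN SMALL LETTER E WITH CARON

# Correct UTF-8 encoding: two bytes, C4 9B.
utf8_once = original.encode("utf-8")

# Misinterpret each byte as a separate code point (Latin-1 maps byte
# 0xNN to U+00NN), giving U+00C4 followed by U+009B.
misread = utf8_once.decode("latin-1")

# Encode the misread text as UTF-8 again: four bytes, C3 84 C2 9B.
utf8_twice = misread.encode("utf-8")


def hexdump(data: bytes) -> str:
    """Format bytes the way uniname(1) shows them, e.g. 'C4 9B'."""
    return " ".join(f"{b:02X}" for b in data)


print(hexdump(utf8_once))
print(" ".join(f"U+{ord(c):04X}" for c in misread))
print(hexdump(utf8_twice))
```

The two byte sequences match the "encod" columns in the uniname(1) output
quoted above. Reversing the steps (decode as UTF-8, encode as Latin-1,
decode as UTF-8) gets the original character back, which may help when
checking whether a given "bad line" is this kind of damage.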

