On 2025-12-17 04:32:37 -0500, Thomas Dickey wrote:
> On Wed, Dec 17, 2025 at 04:50:45AM +0100, Vincent Lefevre wrote:
> > With xterm 405, the use of the U+FE0F VARIATION SELECTOR-16 (VS16)
> > character after an emoji can completely corrupt the display with
> > Mutt. GNU Screen also gets broken with the command below (issues
> > with the last line of the terminal). I suspect that this is due
> > to an inconsistency between the xterm behavior and wcwidth(),
> > which may affect various applications that rely on wcwidth().
>
> Without the Emoji width feature (which as I mentioned, I see should be
> configurable), xterm's wcwidth is a close match for glibc's wcwidth.
> The few differences which I noticed in testing appear to be problems with
> glibc.

But if xterm's Emoji width feature is enabled, I don't see how this
can be fixed with wcwidth(), since the width is contextual. It is the
spec that would be broken. And there should be a clean way to query
the terminal about its behavior.

> Checking now, mutt has a wcwidth.c, which is not often used (since it's
> compile-time), which is just as well because it's tables are very old.
> It has a wrapper for wcwidth which makes some assumptions about iswprint
> that make its behavior problematic except with glibc.

There was the same issue with GNU Screen in the past, which has now
been fixed by using the wcwidth function from the C library, following
my bug reports:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1027733
  https://savannah.gnu.org/bugs/?63634

But I did not have any issue with Mutt itself over the years.

Well... Actually it seems that Mutt uses its own implementation
only if such functions are not present in the C library.
In configure.ac:

if test "$wc_funcs" != yes && test "$wc_funcs" != no; then
        AC_CACHE_CHECK([for wchar_t functions], mutt_cv_wc_funcs,
                mutt_cv_wc_funcs=no
                AC_LINK_IFELSE([AC_LANG_PROGRAM([[
#define _XOPEN_SOURCE 600
#include <stddef.h>
#include <stdlib.h>
#ifdef HAVE_WCHAR_H
#include <wchar.h>
#endif
#ifdef HAVE_WCTYPE_H
#include <wctype.h>
#endif]], [[mbrtowc(0, 0, 0, 0); wctomb(0, 0); wcwidth(0);
        iswprint(0); iswspace(0); towlower(0); towupper(0); iswalnum(0)]])],[mutt_cv_wc_funcs=yes],[]))
        wc_funcs=$mutt_cv_wc_funcs
fi

if test $wc_funcs = yes; then
        AC_DEFINE(HAVE_WC_FUNCS,1,[ Define if you are using the system's wchar_t functions. ])
else
        MUTT_LIB_OBJECTS="$MUTT_LIB_OBJECTS utf8.o wcwidth.o"
fi

So, in practice, under Linux (and most systems?), Mutt uses the
wcwidth function from the C library.

> > I have not checked wcswidth().
>
> nor I - actually I don't believe it is often used.
> mutt imitates it by repeatedly calling wcwidth, and doesn't account for VS16.
>
> Because mutt isn't accounting for VS16, that's an issue for which xterm
> "should" be configurable, so we can accommodate programs which pass through
> VS15 and VS16 without accounting for their behavior.

Note that if Mutt accounted for VS16, this would break its display
on most terminals (including GNOME Terminal, which seems commonly
used under Linux).

BTW, the behavior of GNOME Terminal has the advantage of avoiding
display breakage: the emoji takes 2 cells, but wcwidth is honored.
So it seems that it is up to the application to advance the cursor
to avoid partial overlap.

> (I haven't investigated "neomutt", which may provide improvements, though
> the "neo" cult appears to rely heavily upon hard-coding).

There would still be the question of behavior with terminals
like xterm 403- and GNOME Terminal.

> > But there are issues even with simple output. In a 80-column terminal:
> >
> >   perl -C -e 'print "\x{2642}\x{FE0F}"x60, "\n"'
>
> perl's yet another pitfall. In developing #404, I looked into the wcwidth
> data used in NetBSD/OpenBSD, which reportedly is tied to perl. That ignores
> the East Asian stuff entirely, and doesn't match glibc very well.

Note that I'm using perl here only for its "x" feature to repeat
a string. There is exactly the same issue with printf (with the
sequence repeated manually), which has the same output.

> For your example, perl's irrelevant though - this is just bits...
>
> > I get "♂♂" in the last two columns, which is inconsistent with what
> > is output before. And in case of scrolling, the spaces are missing
> > in the second line.
>
> xterm's handling fullwidth characters by putting a non-character in the
> second cell. In handling VS16, I may have overlooked some path for doing
> that (something to investigate). But the behavior in mutt was consistent
> with my expectation: an extra "blank" cell.

BTW, there was another issue I hadn't reported here: this extra
"blank" cell does not have the correct background when it is
different from the default background.

-- 
Vincent Lefèvre <[email protected]> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Pascaline project (LIP, ENS-Lyon)
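
P.S. To make the arithmetic behind the perl example concrete, here is a
rough sketch of the mismatch. It is only an illustration under stated
assumptions: cell_width() is a crude stand-in for wcwidth() built on
Python's unicodedata (not the glibc or xterm tables), and
width_vs16_aware() applies a hypothetical contextual rule (VS16 widens
a preceding 1-cell character to 2 cells), which may well differ from
what xterm actually implements. All function names are mine.

```python
import unicodedata

VS16 = "\ufe0f"  # U+FE0F VARIATION SELECTOR-16
COLUMNS = 80     # terminal width used in the example above


def cell_width(ch):
    """Crude stand-in for wcwidth(); NOT the glibc or xterm tables."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters (VS16 is Mn)
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters
    return 1


def width_per_char(s):
    """Sum per-character widths, as a loop over wcwidth() would."""
    return sum(cell_width(ch) for ch in s)


def width_vs16_aware(s):
    """Hypothetical contextual rule: VS16 widens a 1-cell character to 2."""
    total = 0
    prev = 0  # width assigned to the previous character
    for ch in s:
        if ch == VS16 and prev == 1:
            total += 1  # the preceding character now occupies 2 cells
            prev = 2
        else:
            prev = cell_width(ch)
            total += prev
    return total


# Same string as the perl one-liner, without the trailing newline.
line = "\u2642\ufe0f" * 60
print(width_per_char(line), width_vs16_aware(line))
```

With these assumptions, the per-character sum is 60, which fits in 80
columns, while the contextual sum is 120, which does not. An
application that trusts the first number and a terminal that uses the
second will disagree about where the line wraps.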

