[Bug preprocessor/49973] Column numbers count multibyte characters as multiple columns

lhyatt at gmail dot com Thu, 26 Sep 2019 13:20:27 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49973


--- Comment #18 from Lewis Hyatt <lhyatt at gmail dot com> ---
(In reply to jos...@codesourcery.com from comment #17)
> On Tue, 17 Sep 2019, lhyatt at gmail dot com wrote:
> 
> > In any case, the underlying source of wcwidth() could easily be changed as a
> > drop-in replacement so I guess it can also be decided later. The use of
> > mbrtowc() is the bigger problem, since this converts from the user's locale
> > and it needs to convert from what -finput-charset asked for (or else UTF-8)
> > instead.
> 
> If __STDC_ISO_10646__ is defined, wchar_t is Unicode and so local code 
> converting from UTF-8 to wchar_t can be used (together with wcwidth from 
> libc if available).
> 
> If __STDC_ISO_10646__ is not defined, the encoding of wchar_t is unknown.  
> Maybe in that case it's best to avoid libc's wcwidth (if any) and just use 
> a local implementation of wcwidth on the results of converting UTF-8 to 
> Unicode code points.

Always using an internal wcwidth() seems to remove a lot of complexity, so
that's what I ended up doing. The patch is posted at
https://gcc.gnu.org/ml/gcc-patches/2019-09/msg01558.html for your
consideration. It addresses only diagnostics, not the -finput-charset or
user-locale conversion issues; I will submit those separately once this one is
reviewed. Thanks!
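
For illustration only (this is not the code from the patch): a minimal sketch
of the approach discussed above, which decodes UTF-8 into Unicode code points
locally and passes each one to an internal width function instead of going
through mbrtowc() and libc's wcwidth(). The names decode_utf8, sketch_wcwidth
and display_columns are hypothetical, and the width table is a small,
incomplete subset chosen for the example; a real implementation would need
tables derived from the Unicode data files.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 sequence starting at s, with len bytes available.
   Stores the code point in *cp and returns the number of bytes consumed.
   On malformed input, stores U+FFFD and consumes a single byte. */
size_t decode_utf8(const unsigned char *s, size_t len, uint32_t *cp)
{
    assert(s != NULL && cp != NULL && len > 0);
    uint32_t c = s[0];
    uint32_t min;   /* smallest code point allowed for this length */
    size_t need;    /* number of continuation bytes expected */

    if (c < 0x80) { *cp = c; return 1; }
    else if ((c & 0xE0) == 0xC0) { need = 1; c &= 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { need = 2; c &= 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { need = 3; c &= 0x07; min = 0x10000; }
    else { *cp = 0xFFFD; return 1; }

    if (len < need + 1) { *cp = 0xFFFD; return 1; }
    for (size_t i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        *cp = 0xFFFD;
        return 1;
    }
    *cp = c;
    return need + 1;
}

/* Hypothetical internal width function: 0 for a few combining marks,
   2 for a few wide ranges, 1 otherwise.  Deliberately incomplete. */
int sketch_wcwidth(uint32_t cp)
{
    if (cp >= 0x0300 && cp <= 0x036F) return 0;     /* combining diacritics */
    if ((cp >= 0x1100 && cp <= 0x115F) ||           /* Hangul Jamo */
        (cp >= 0x2E80 && cp <= 0xA4CF) ||           /* CJK blocks (approx.) */
        (cp >= 0xAC00 && cp <= 0xD7A3) ||           /* Hangul syllables */
        (cp >= 0xF900 && cp <= 0xFAFF) ||           /* CJK compatibility */
        (cp >= 0xFF00 && cp <= 0xFF60) ||           /* fullwidth forms */
        (cp >= 0x1F300 && cp <= 0x1F64F) ||         /* some emoji */
        (cp >= 0x20000 && cp <= 0x3FFFD))           /* supplementary CJK */
        return 2;
    return 1;
}

/* Number of display columns occupied by the first len bytes of s,
   treating s as UTF-8 regardless of the user's locale. */
int display_columns(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;
    int cols = 0;
    while (len > 0) {
        uint32_t cp;
        size_t n = decode_utf8(p, len, &cp);
        cols += sketch_wcwidth(cp);
        p += n;
        len -= n;
    }
    return cols;
}
```

With this sketch, a two-byte character such as U+00E9 counts as one column
and a three-byte CJK character such as U+4E2D counts as two, so the column
number depends on display width instead of byte count.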
