sammccall added a comment.

In D81530#2085725 <https://reviews.llvm.org/D81530#2085725>, @jaafar wrote:

> I can confirm that this commit works equally well for the UTF-8 assertion 
> failure. Thank you!
>
> Is there some way to easily modify the source to produce a diagnostic 
> pointing to the issue's source location? I would like to file an appropriate 
> bug in Boost.


In 
https://www.boost.org/doc/libs/1_73_0/boost/spirit/home/support/char_encoding/iso8859_1.hpp
The line labeled `161  a1`, whose bytes are `"/*  \xa1  161  a1 */   
BOOST_CC_PUNCT,"` triggers it because of the `"\xa1"`.

It's not a cut and dried bug, but I do think removing the high bytes from these 
comments is a good idea, so it's probably worth filing the bug.

The C++ standard doesn't say anything about the encoding of characters on disk 
(the "input encoding" of bytes -> source character set) - it starts with the 
source character set, I think. So what do implementations do?

- clang: from reading the code, clang only supports UTF-8. It supports gcc's 
-finput-charset flag, but setting it to anything other than UTF-8 is an error!
- GCC: supports a variety of input encodings, configured with `-finput-charset` 
or locale. As such it's not really reasonable for a header designed to be 
included to require any particular value, as you can't pass a flag for just 
that header.

In practice this doesn't actually affect compilation because the bad UTF-8 
sequence in the comment is never parsed: clang just skips over it looking for 
`*/`. It probably mostly affects tools that use line/column coordinates (like 
clangd by virtue of LSP, and clang diagnostics), and tools that extract comment 
contents (doxygen et al).
But it still seems that it would be clearer to remove the literal characters 
from the source file, or write them in UTF-8.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81530/new/

https://reviews.llvm.org/D81530



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to