On Thu, Oct 7, 2021 at 9:01 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> And another thing, if HOST_CHARSET == HOST_CHARSET_EBCDIC, how does the
> libcpp/lex.c code
> static const cppchar_t utf8_signifier = 0xC0;
> ...
>       if (*buffer->cur >= utf8_signifier)
>         {
>           if (_cpp_valid_utf8 (pfile, &buffer->cur, buffer->rlimit, 1 + !first,
>                                state, &s))
>             return true;
>         }
> work?  Because in UTF-EBCDIC, >= 0xC0 isn't the right test for start of
> multi-byte character, it is more complicated and seems _cpp_valid_utf8
> assumes UTF-8 as the host charset.

FWIW, here I was following Joseph's guidance from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67224#c21 ("You can
ignore anything claiming to handle UTF-EBCDIC.")
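
For context, here is a minimal sketch of why the `>= 0xC0` comparison
works when the source really is UTF-8. It is not the libcpp code, and
`utf8_is_lead_byte` is a hypothetical name used only for illustration.
It relies purely on the UTF-8 byte layout, so it says nothing useful
about UTF-EBCDIC input.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libcpp.  Assumes the input is plain
   UTF-8, where byte values are partitioned as follows:
     0x00-0x7F  single-byte (ASCII) characters
     0x80-0xBF  continuation bytes of a multi-byte sequence
     0xC0-0xFF  candidate lead bytes of a multi-byte sequence
   Some candidates (0xC0, 0xC1, 0xF5-0xFF) never occur in valid UTF-8,
   so a full validator still has to reject them afterwards.  */
static bool
utf8_is_lead_byte (unsigned char c)
{
  return c >= 0xC0;
}
```

For example, in the UTF-8 encoding of U+00E9 (bytes 0xC3 0xA9) the
first byte passes this test and the second does not. This is only a
cheap pre-filter; the actual sequence still has to be validated.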

-Lewis
