cor3ntin added inline comments.
================
Comment at: clang/lib/Lex/Lexer.cpp:2399
+ getSourceLocation(CurPtr));
+ bool UnicodeDecodeFailed = false;
+
----------------
aaron.ballman wrote:
> It looks like this can move into the `while` loop and we can remove some `=
> false` from within the loop?
The idea here is to not diagnose a contiguous sequence of invalid code units
more than once
The sequence `0x80x0x80x80` for example will only warn one for example. The
goal is to avoid noise when, for example there is a long comment in
Windows-1251 encoded cyrillic. it would warn one per word (because the spaces
would decode fine) rather than one per contiguous non-sense character.
It's also somewhat necessary because we can't know where the end of the invalid
"character" is, better to be greedy.
(The test file check that behavior)
================
Comment at: clang/lib/Lex/Lexer.cpp:2407
+ // diagnostic only once per sequence that cannot be decoded.
+ while ((!WarnOnInvalidUtf8 || isASCII(C)) && C != 0 && // Potentially EOF.
+ C != '\n' && C != '\r') { // Newline or DOS-style newline.
----------------
aaron.ballman wrote:
> Do you have some idea of performance cost for enabling the warning with a
> large TU? Are we talking .5%? 1%? 5%? noise?
This adds 2 comparisons when the warning is enbled (is ASCII(c) is simply `C <
0x80`) - so it should be noise, but I have not run benchmark.
It might be more noticeable within multi line comments as there the
optimization that vectorizes the comment skipping on SSE2 platforms is simply
skipped when this optimization is enabled.
================
Comment at: clang/lib/Lex/Lexer.cpp:2417
+ if (Length == 0) {
+ if (!UnicodeDecodeFailed) {
+ Diag(CurPtr, diag::warn_invalid_utf8_in_comment);
----------------
aaron.ballman wrote:
> aaron.ballman wrote:
> >
> So if unicode decoding didn't fail... diagnose? :-) The naming here seems
> unfortunate.
`if(!UnicodeDecodeFailedInThePreviousLoopIteration)` ? I'm open to suggestion
here
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D128059/new/
https://reviews.llvm.org/D128059
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits