Author: Corentin Jabot
Date: 2022-07-06T22:20:04+02:00
New Revision: bf45e27a676d87944f1f13d5f0d0f39935fc4010

URL: https://github.com/llvm/llvm-project/commit/bf45e27a676d87944f1f13d5f0d0f39935fc4010
DIFF: https://github.com/llvm/llvm-project/commit/bf45e27a676d87944f1f13d5f0d0f39935fc4010.diff

LOG: [Clang] Fix invalid utf-8 detection

The length of valid codepoints was incorrectly calculated,
which was not caught before because of the absence of tests
for the valid codepoints scenario.

Differential Revision: https://reviews.llvm.org/D129223

Added:


Modified:
    clang/test/Lexer/comment-invalid-utf8.c
    llvm/lib/Support/ConvertUTF.cpp

Removed:


################################################################################
diff --git a/clang/test/Lexer/comment-invalid-utf8.c b/clang/test/Lexer/comment-invalid-utf8.c
index b8bf551dd8564..ed7405a3c079e 100644
--- a/clang/test/Lexer/comment-invalid-utf8.c
+++ b/clang/test/Lexer/comment-invalid-utf8.c
@@ -25,3 +25,14 @@
 // abcd
 // €abcd
 // expected-warning@-1 {{invalid UTF-8 in comment}}
+
+
+//§ § § 😀 你好 ©
+
+/*§ § § 😀 你好 ©*/
+
+/*
+§ § § 😀 你好 ©
+*/
+
+/* § § § 😀 你好 © */

diff --git a/llvm/lib/Support/ConvertUTF.cpp b/llvm/lib/Support/ConvertUTF.cpp
index c494110cdcee1..25875d4c3184b 100644
--- a/llvm/lib/Support/ConvertUTF.cpp
+++ b/llvm/lib/Support/ConvertUTF.cpp
@@ -423,7 +423,7 @@ Boolean isLegalUTF8Sequence(const UTF8 *source, const UTF8 *sourceEnd) {
  */
 unsigned getUTF8SequenceSize(const UTF8 *source, const UTF8 *sourceEnd) {
   int length = trailingBytesForUTF8[*source] + 1;
-  return (length > sourceEnd - source && isLegalUTF8(source, length)) ? length
+  return (length < sourceEnd - source && isLegalUTF8(source, length)) ? length
                                                                       : 0;
 }


_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
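
The effect of flipping the comparison in getUTF8SequenceSize can be illustrated with a minimal sketch. It assumes simplified stand-ins for the LLVM helpers: `trailingBytes` and `looksLegal` are hypothetical replacements for `trailingBytesForUTF8` and `isLegalUTF8` (they are not the real implementations and check less), while `sizeBefore` and `sizeAfter` mirror the removed and added return expressions from the diff.

```cpp
#include <cstddef>

// Hypothetical stand-in for the trailingBytesForUTF8 table lookup:
// number of continuation bytes implied by a lead byte.
static int trailingBytes(unsigned char lead) {
  if (lead < 0xC0) return 0; // ASCII or a stray continuation byte
  if (lead < 0xE0) return 1; // 2-byte sequence
  if (lead < 0xF0) return 2; // 3-byte sequence
  return 3;                  // 4-byte sequence (or an invalid lead byte)
}

// Hypothetical, simplified stand-in for isLegalUTF8. It only checks the
// lead-byte range and the 10xxxxxx shape of the continuation bytes; the
// real function performs stricter range checks.
static bool looksLegal(const unsigned char *s, int length) {
  if (length == 1) return s[0] < 0x80;
  if (s[0] < 0xC2 || s[0] > 0xF4) return false;
  for (int i = 1; i < length; ++i)
    if ((s[i] & 0xC0) != 0x80) return false;
  return true;
}

// Mirrors the removed line: the sequence is only accepted when it is
// LONGER than the remaining input, so complete sequences yield 0.
// Only call this with at least `length` readable bytes at `source`.
static unsigned sizeBefore(const unsigned char *source,
                           const unsigned char *sourceEnd) {
  int length = trailingBytes(*source) + 1;
  return (length > sourceEnd - source && looksLegal(source, length)) ? length
                                                                     : 0;
}

// Mirrors the added line: the sequence is accepted when it is shorter
// than the remaining input and its bytes look legal.
static unsigned sizeAfter(const unsigned char *source,
                          const unsigned char *sourceEnd) {
  int length = trailingBytes(*source) + 1;
  return (length < sourceEnd - source && looksLegal(source, length)) ? length
                                                                     : 0;
}
```

Under these assumptions, for the bytes C2 A7 20 00 ("§", a space, and a terminator) with sourceEnd four bytes past source, `sizeBefore` yields 0 and `sizeAfter` yields 2, which matches the commit message: valid code points were previously reported as having no valid length. Because the added comparison is strict, `sizeAfter` also yields 0 when a sequence ends exactly at sourceEnd; the sketch takes no position on whether callers can reach that case.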