[clang] clang: Relax LangOpts checks when lexing quoted numbers during preprocessing (PR #95798)

via cfe-commits Tue, 12 Nov 2024 11:59:12 -0800

================
@@ -2068,7 +2068,8 @@ bool Lexer::LexNumericConstant(Token &Result, const char 
*CurPtr) {
   }
 
   // If we have a digit separator, continue.
-  if (C == '\'' && (LangOpts.CPlusPlus14 || LangOpts.C23)) {
+  if (C == '\'' &&
+      (LangOpts.CPlusPlus14 || LangOpts.C23 || ParsingPreprocessorDirective)) {
----------------
Sirraide wrote:


I feel like another issue is that just accepting separators uncoditionally in 
preprocessor directives (i.e. even outside of dependency scanning) would 
technically be a language extension... doesn’t that mean that we’d technically 
have to add a compatibility warning? Because we’re usually pretty strict about 
doing that for things that are effectively a compiler extension.

Also, I’m not really that familiar w/ the dependency scanner, but I’m still 
confused as to why passing down the `LangOpts` isn’t an option: in #93753, it 
was pointed out that that would impair caching the results of the scan—but 
isn’t it already the case that e.g. what files something depends on is 
dependent on the language standard? E.g. what if I write:
```c++
#if __cplusplus >= WhateverTheValueForC++17IsAgain
#    include "foo.hh"
#else
#    include "bar.hh"
#endif
```

Does it just note that the file might depend on both `foo.hh` and `bar.hh`? 
Because if so, can’t we just set the language standard in the lang opts for the 
dependency scanner to the latest e.g. C++ standard? Because that would also fix 
this issue.

If we really can’t pass down the original lang opts for whatever reason, then 
I’m in favour of adding a language option either for `DependencyScanning` or 
for just literal separators.

> I don't think per-feature LangOptions is viable. There's digit separators, 
> hex float support, binary literal support, different literal suffixes, 
> various keywords, etc and that doesn't seem likely to scale well.

I mean, we don’t need a language option for *every* feature either, but adding 
them on an as-needed basis doesn’t seem completely untenable. We already have, 
like, over 300 of them, so I feel like we’re past the point where adding more 
of them would really make that big of a difference.

That said, I can’t seem to decide which of the following options would be the 
best (other than passing down the original lang opts):
1. Adding a language option for literal separators (or for an individual 
feature in general) solves the problem of a ‘dependency scanner’ flag 
potentially being confusing—and, more generally speaking, also has the nice 
side-effect that you would no longer have to remember the full list of language 
dialects which support any one feature—except that as Aaron pointed out, I 
don’t think a dependency scanner flag would be that confusing (‘language 
dialect X (which dependency scanning would be at that point) needs to do a 
weird thing’ isn’t really that unusual imo), and also, you still need to know 
what dialect(s) introduce certain features so you know when to emit 
compatibility diagnostics.
2. Adding a language option for dependency scanning means we only need a single 
one to account for this and however many other problems we might run into wrt 
language dialects causing a problem w/ dependency scanning, but it also means 
that now every place that checks whether a feature is supported (possibly only 
in the lexer/preprocessor; again, I don’t know the dependency scanner that 
well...) also needs to check if we’re performing dependency scanning.

Either way, I’d definitely prefer either option over just... allowing this in 
preprocessor directives irrespective of the language dialect.

https://github.com/llvm/llvm-project/pull/95798
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] clang: Relax LangOpts checks when lexing quoted numbers during preprocessing (PR #95798)

Reply via email to