sammccall added inline comments.
================ Comment at: clang/lib/Lex/PPLexerChange.cpp:436 + if (PragmaAssumeNonNullLoc.isValid() && !this->PPOpts->GeneratePreamble && + !(CurLexer && CurLexer->getFileID() == PredefinesFileID) && !isEndOfMacro && !(CurLexer && CurLexer->Is_PragmaLexer)) { ---------------- dgoldman wrote: > sammccall wrote: > > sammccall wrote: > > > dgoldman wrote: > > > > sammccall wrote: > > > > > dgoldman wrote: > > > > > > sammccall wrote: > > > > > > > what's the PredefinesFileID special case about? > > > > > > See ExitedFromPredefinesFile below, this is how I check if we're > > > > > > falling off the end of the preamble region into the main file (same > > > > > > as done for the conditional stack), LMK if there's a better way. > > > > > I don't understand how this checks if we're falling off the preamble > > > > > region, and the code around ExitedFromPredefinesFile doesn't clarify > > > > > this for me. Can you explain? > > > > > > > > > > The `if (ExitedFromPredefinesFile)` appears to be handling the > > > > > logical *insertion* of preamble PP events when *consuming* a > > > > > preamble, which is not what you're trying to do here. > > > > > > > > > > The condition here is of the form "if we have an open pragma, and > > > > > we're not generating a preamble, and some other stuff, then > > > > > diagnose". So if the baseline case is "diagnose" and the preamble > > > > > case is an exception, the "other stuff" clauses don't limit the > > > > > preamble exception, they add extra exceptions! > > > > `!(CurLexer && CurLexer->getFileID() == PredefinesFileID)` makes sure > > > > that if we're consuming the preamble, we don't emit the warning. AFAICT > > > > this is the way to tell that the preamble is terminated, since the > > > > current file ID being processed is the predefines file and the file is > > > > now terminated as of this method call (since !isEndOfMacro && > > > > !(CurLexer && CurLexer->Is_PragmaLexer)). > > > > `!this->PPOpts->GeneratePreamble` makes sure that if we're generating > > > > the preamble, we don't emit the warning. We need to special case both > > > > cases otherwise we get an error when generating the preamble or when we > > > > load the preamble before even processing the rest of the main file. > > > > > > > > Does that makes sense? > > > > !(CurLexer && CurLexer->getFileID() == PredefinesFileID) makes sure > > > > that if we're consuming the preamble, we don't emit the warning > > > > > > 1) if we're consuming the preamble, what exactly is the sequence of > > > events that would otherwise lead us to emit the warning? > > > 2) what if we're in the predefines file for some other reason instead? > > > > > > I'm hoping you'll explain to me what you think the predefines file is, > > > and what its relationship to the preamble is, and so why this condition > > > is correct :-) > > > > > > My understanding (which is pretty shaky!) is that the predefines file is > > > a kind of catchall used to inject text into the preprocessor that doesn't > > > appear in the source file - that it contains definitions of builtin > > > integer types, macros defined by `-D` on the command line, and so on. If > > > it has any relationship to the preamble, it's something subtle that's to > > > do with the relative order in which the preprocessor sees entities like > > > the predefines, the preprocesor, and the main file. > > > So according to my understanding, interpreting "reaching EOF in the > > > predefines file" as "consuming a preamble" is either wrong or something > > > very subtle requiring a significant comment. > > > > > > The code you refer to below is doing something very different: it's > > > saying that if we have a preamble, then reaching the end of the > > > predefines file is the *trigger* to inject state from it! > > > > > > > !this->PPOpts->GeneratePreamble makes sure that if we're generating the > > > > preamble, we don't emit the warning > > > > > > Sure, but that doesn't sound correct at all! A preamble mostly consists > > > of parsing a lot of headers, and if any of those headers have an unpaired > > > pragma, we should be warning on that at EOF. It's only if we hit the > > > "pretend" EOF from the truncated main file that we want to suppress the > > > warning. > > > the preprocessor sees entities like the predefines, the preprocesor, and > > > the main file. > > Oops... sees the predefines, any injected events from the preamble, and the > > main file. > > > I think that definition is correct, but given that there is only one > predefines file, when it ends, it must go into the main file, no? And if we > have a pragma assume nonnull loc, it shouldn't be from the clang builtins, it > should be from the preamble (I'd certainly hope!) > > re: headers, I think there might be a misunderstanding, as soon as an > #include is seen the pragma is ended here: > https://github.com/llvm/llvm-project/blob/7631c366c8589dda488cb7ff1df26cc134002208/clang/lib/Lex/PPDirectives.cpp#L2006, > so if PragmaAssumeNonNullLoc is valid here we haven't had any includes since > the pragma was seen. > I think that definition is correct, but given that there is only one > predefines file, when it ends, it must go into the main file, no? I'm sorry, but i really can't follow what you're asking, or how it relates to the original problem, or what the problem even is. If *you're* not sure, it's probably worth tracing execution of the preprocessor and describing the precise sequence of events that cause a problem. And then trying to come up with a *direct* way of solving them. Here's my *guess* as to what's going on (but please check!): - the assume_nonnull state is very sensitive to shifts between files - it's supposed to be off before reaching EOF and before #include. (This is why it can appear to be file-local syntax but be implemented as a global flag). - therefore the AssumeNonnullLoc is not just some property of the preprocessor, it's a transient state that should be set at precisely the right point in the program. However you're setting it at the very beginning, as a side effect of ReadASTBlock(). - when consuming a preamble, the preprocessor simulates a structure like: ``` // AssumeNonnullLoc is being set here #include <predefines> // implicit #inject pp conditional stack // AssumeNonnnullLoc *should* be set here // now continue with the post-preamble part of main file ``` - because AssumeNonnullLoc is set too early, the PP events in between these locations mess with it. Fortunately that is probably only the predefines, although that's a brittle assumption to make. And we don't run into problems when entering the predefines file, because `HandleHeaderIncludeOrImport` isn't called, we just enter the file directly. However you do run into issues when exiting that file again, and so a special case works around those. If this is right, it's not a good solution. There are other observable effects of setting AssumeNonnullLoc in the wrong place. All the predefines will be parsed in "assume nonnull mode". And if any files are #included from the predefines file, then the enter/exit events will cause problems. Predefines may contain `#include` with the `-include` and `-imacros` clang flags, and due to clangd's preamble patching. In this case the right fix is to set AssumeNonnullLoc at the correct point in the file: triggered by exiting the predefines, along with the PP conditional stack. The way to do this is probably to have a second property on preprocessor which stores the loc that will be restored when the trigger occurs. (Similar to how PP distinguishes between the *current* conditional stack and the replayable one set by ASTReader when it calls `PP.setReplayablePreambleConditionalStack(ConditionalStack, SkipInfo);` > re: headers, I think there might be a misunderstanding, as soon as an > #include is seen the pragma is ended here The scenario is: We are building a preamble from main.c main.c: ``` #include "foo.h" int x; ``` foo.h: ``` #pragma clang assume_nonnull begin void foo(int*); // missing end pragma ``` We should issue a diagnostic when we hit the end of foo.h, but with the code you have here it will be supposed instead. (There's no #include in scope of the pragma here, so I don't think that's relevant) Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122179/new/ https://reviews.llvm.org/D122179 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits