sammccall added inline comments.

================
Comment at: clang/lib/Lex/PPLexerChange.cpp:436
+  if (PragmaAssumeNonNullLoc.isValid() && !this->PPOpts->GeneratePreamble &&
+      !(CurLexer && CurLexer->getFileID() == PredefinesFileID) &&
       !isEndOfMacro && !(CurLexer && CurLexer->Is_PragmaLexer)) {
----------------
dgoldman wrote:
> sammccall wrote:
> > sammccall wrote:
> > > dgoldman wrote:
> > > > sammccall wrote:
> > > > > dgoldman wrote:
> > > > > > sammccall wrote:
> > > > > > > what's the PredefinesFileID special case about?
> > > > > > See ExitedFromPredefinesFile below, this is how I check if we're 
> > > > > > falling off the end of the preamble region into the main file (same 
> > > > > > as done for the conditional stack), LMK if there's a better way.
> > > > > I don't understand how this checks if we're falling off the preamble 
> > > > > region, and the code around ExitedFromPredefinesFile doesn't clarify 
> > > > > this for me. Can you explain?
> > > > > 
> > > > > The `if (ExitedFromPredefinesFile)` appears to be handling the 
> > > > > logical *insertion* of preamble PP events when *consuming* a 
> > > > > preamble, which is not what you're trying to do here.
> > > > > 
> > > > > The condition here is of the form "if we have an open pragma, and 
> > > > > we're not generating a preamble, and some other stuff, then 
> > > > > diagnose". So if the baseline case is "diagnose" and the preamble 
> > > > > case is an exception, the "other stuff" clauses don't limit the 
> > > > > preamble exception, they add extra exceptions!
> > > > `!(CurLexer && CurLexer->getFileID() == PredefinesFileID)` makes sure 
> > > > that if we're consuming the preamble, we don't emit the warning. AFAICT 
> > > > this is the way to tell that the preamble is terminated, since the 
> > > > current file ID being processed is the predefines file and the file is 
> > > > now terminated as of this method call (since !isEndOfMacro && 
> > > > !(CurLexer && CurLexer->Is_PragmaLexer)). 
> > > > `!this->PPOpts->GeneratePreamble` makes sure that if we're generating 
> > > > the preamble, we don't emit the warning. We need to special case both 
> > > > cases otherwise we get an error when generating the preamble or when we 
> > > > load the preamble before even processing the rest of the main file.
> > > > 
> > > > Does that makes sense? 
> > > > !(CurLexer && CurLexer->getFileID() == PredefinesFileID) makes sure 
> > > > that if we're consuming the preamble, we don't emit the warning
> > > 
> > > 1) if we're consuming the preamble, what exactly is the sequence of 
> > > events that would otherwise lead us to emit the warning?
> > > 2) what if we're in the predefines file for some other reason instead?
> > > 
> > > I'm hoping you'll explain to me what you think the predefines file is, 
> > > and what its relationship to the preamble is, and so why this condition 
> > > is correct :-)
> > > 
> > > My understanding (which is pretty shaky!) is that the predefines file is 
> > > a kind of catchall used to inject text into the preprocessor that doesn't 
> > > appear in the source file - that it contains definitions of builtin 
> > > integer types, macros defined by `-D` on the command line, and so on. If 
> > > it has any relationship to the preamble, it's something subtle that's to 
> > > do with the relative order in which the preprocessor sees entities like 
> > > the predefines, the preprocesor, and the main file.
> > > So according to my understanding, interpreting "reaching EOF in the 
> > > predefines file" as "consuming a preamble" is either wrong or something 
> > > very subtle requiring a significant comment.
> > > 
> > > The code you refer to below is doing something very different: it's 
> > > saying that if we have a preamble, then reaching the end of the 
> > > predefines file is the *trigger* to inject state from it!
> > > 
> > > > !this->PPOpts->GeneratePreamble makes sure that if we're generating the 
> > > > preamble, we don't emit the warning
> > > 
> > > Sure, but that doesn't sound correct at all! A preamble mostly consists 
> > > of parsing a lot of headers, and if any of those headers have an unpaired 
> > > pragma, we should be warning on that at EOF. It's only if we hit the 
> > > "pretend" EOF from the truncated main file that we want to suppress the 
> > > warning.
> > > the preprocessor sees entities like the predefines, the preprocesor, and 
> > > the main file.
> > Oops... sees the predefines, any injected events from the preamble, and the 
> > main file.
> > 
> I think that definition is correct, but given that there is only one 
> predefines file, when it ends, it must go into the main file, no? And if we 
> have a pragma assume nonnull loc, it shouldn't be from the clang builtins, it 
> should be from the preamble (I'd certainly hope!)
> 
> re: headers, I think there might be a misunderstanding, as soon as an 
> #include is seen the pragma is ended here: 
> https://github.com/llvm/llvm-project/blob/7631c366c8589dda488cb7ff1df26cc134002208/clang/lib/Lex/PPDirectives.cpp#L2006,
>  so if PragmaAssumeNonNullLoc is valid here we haven't had any includes since 
> the pragma was seen.
> I think that definition is correct, but given that there is only one 
> predefines file, when it ends, it must go into the main file, no?

I'm sorry, but i really can't follow what you're asking, or how it relates to 
the original problem, or what the problem even is. If *you're* not sure, it's 
probably worth tracing execution of the preprocessor and describing the precise 
sequence of events that cause a problem. And then trying to come up with a 
*direct* way of solving them.

Here's my *guess* as to what's going on (but please check!):
 - the assume_nonnull state is very sensitive to shifts between files - it's 
supposed to be off before reaching EOF and before #include. (This is why it can 
appear to be file-local syntax but be implemented as a global flag).
 - therefore the AssumeNonnullLoc is not just some property of the 
preprocessor, it's a transient state that should be set at precisely the right 
point in the program. However you're setting it at the very beginning, as a 
side effect of ReadASTBlock().
 - when consuming a preamble, the preprocessor simulates a structure like:
```
// AssumeNonnullLoc is being set here
#include <predefines> // implicit
#inject pp conditional stack
// AssumeNonnnullLoc *should* be set here
// now continue with the post-preamble part of main file
```
 - because AssumeNonnullLoc is set too early, the PP events in between these 
locations mess with it. Fortunately that is probably only the predefines, 
although that's a brittle assumption to make. And we don't run into problems 
when entering the predefines file, because `HandleHeaderIncludeOrImport` isn't 
called, we just enter the file directly. However you do run into issues when 
exiting that file again, and so a special case works around those.

If this is right, it's not a good solution. There are other observable effects 
of setting AssumeNonnullLoc in the wrong place. All the predefines will be 
parsed in "assume nonnull mode". And if any files are #included from the 
predefines file, then the enter/exit events will cause problems. Predefines may 
contain `#include` with the `-include` and `-imacros` clang flags, and due to 
clangd's preamble patching.

In this case the right fix is to set AssumeNonnullLoc at the correct point in 
the file: triggered by exiting the predefines, along with the PP conditional 
stack. The way to do this is probably to have a second property on preprocessor 
which stores the loc that will be restored when the trigger occurs. (Similar to 
how PP distinguishes between the *current* conditional stack and the replayable 
one set by ASTReader when it calls 
`PP.setReplayablePreambleConditionalStack(ConditionalStack, SkipInfo);`

> re: headers, I think there might be a misunderstanding, as soon as an 
> #include is seen the pragma is ended here

The scenario is:

We are building a preamble from main.c
main.c:
```
#include "foo.h"
int x;
```
foo.h:
```
 #pragma clang assume_nonnull begin
void foo(int*);
// missing end pragma
```
We should issue a diagnostic when we hit the end of foo.h, but with the code 
you have here it will be supposed instead. (There's no #include in scope of the 
pragma here, so I don't think that's relevant)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122179/new/

https://reviews.llvm.org/D122179

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to