[PATCH] D127284: [clang-repl] Support statements on global scope in incremental mode.

Richard Smith - zygoloid via Phabricator via cfe-commits Tue, 15 Nov 2022 23:40:11 -0800

rsmith added inline comments.


================
Comment at: clang/lib/AST/Decl.cpp:5264
+
+FunctionDecl *TopLevelStmtDecl::getOrConvertToFunction() {
+  if (FD)
----------------
I would hope that we can remove this. Instead, I think we can teach `CodeGen` 
to emit a sequence of `TopLevelStmtDecl`s directly as an LLVM IR function -- if 
it's not emitted anything else nor flushed its IR output since it last emitted 
a `TopLevelStmtDecl`, then reuse and extend the previous `Function`, otherwise 
create a new one. That would also allow us to make `TopLevelStmtDecl` model 
exactly one `Stmt`, which seems cleaner.


================
Comment at: clang/lib/Parse/ParseTentative.cpp:52-53
+  assert(getLangOpts().CPlusPlus && "Must be called for C++ only.");
+  if (DisambiguatingWithExpression) {
+    if (Tok.is(tok::identifier)) {
+      RevertingTentativeParsingAction TPA(*this);
----------------
Can we sink this into the `switch` on the token kind below?


================
Comment at: clang/lib/Parse/Parser.cpp:1033
+        !isDeclarationStatement(/*DisambiguatingWithExpression=*/true))
+      SingleDecl = ParseTopLevelStmtDecl();
+
----------------
v.g.vassilev wrote:
> There is a remaining challenge which probably could be addressed outside of 
> this patch.
> 
> Consider this statement block:
> ```
> int i =  12;
> ++i; 
> i--;
> 
> template<typename T> struct A { };
> ```
> 
> Ideally we should model `++i; i--;` as a single `TopLevelStmtDecl` as the 
> statement block is contiguous. That would require the creation of 2 AST nodes 
> per block (one for the `TopLevelStmtDecl` and one for its conversion to 
> `FunctionDecl`). This will give us also a nice property on the REPL side 
> where the user could decide to squash multiple statements into a statement 
> block to save on memory.
> 
> To do so, we will need to use `isDeclarationStatement` as a stop rule in 
> `ParseTopLevelDecl`. In turn, this would mean that we should duplicate all of 
> the switch cases described in the `ParseExternalDeclaration` function here. 
> [We need teach  `isDeclarationStatement` everything we know about 
> declarations, eg. it must tell us to stop when we see definition `struct A`].
> 
> The last version of this patch goes in the opposite direction, trying to 
> minimize the code duplication (bloat?) by wrapping each global statement into 
> a `TopLevelStmtDecl`, reusing the logic in `ParseExternalDeclaration`. 
> However, we pay the price for 2 AST node allocations per global statement. 
> That is a serious hit for people that want to control the parsing granularity 
> of an interpreter.
> 
> I wonder if we can do something better hitting both requirements in some 
> smart way I cannot see...
It seems to me that the big cost here is creating a `FunctionDecl` and all of 
its ancillary components; a `TopLevelStmtDecl` is pretty cheap. I don't think 
it should be necessary to create that `FunctionDecl` at all -- we should be 
able to go straight from `TopLevelStmtDecl` to an IR function like we go 
straight from a `VarDecl` for a global function to its initializer IR function 
without creating a `FunctionDecl`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D127284/new/

https://reviews.llvm.org/D127284

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D127284: [clang-repl] Support statements on global scope in incremental mode.

Reply via email to