[PATCH] D96033: [clang-repl] Land initial infrastructure for incremental parsing

John McCall via Phabricator via cfe-commits Fri, 16 Apr 2021 01:07:17 -0700

rjmccall added a comment.

Sorry for the delay.  That seems like a reasonable summary of our discussion.  
Let me try to lay out the right architecture for this as I see it.


Conceptually, we can split the translation unit into a sequence of partial 
translation units (PTUs).  Every declaration will be associated with a unique 
PTU that owns it.

The first key insight here is that the owning PTU isn't always the "active" 
(most recent) PTU, and it isn't always the PTU that the declaration "comes 
from".  A new declaration (that isn't a redeclaration or specialization of 
anything) does belong to the active PTU.  A template specialization, however, 
belongs to the most recent PTU of all the declarations in its signature; 
mostly that means that it can be pulled into a more recent PTU by its template 
arguments.
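
As a rough illustration of the ownership rule above, here is a minimal sketch. 
It assumes PTUs are simply numbered in processing order; `ToyDecl`, 
`makeNewDecl` and `makeSpecialization` are hypothetical names for this example 
and are not Clang APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assumption: PTUs are numbered in the order they are processed.
using PTUIndex = unsigned;

// Hypothetical stand-in for a declaration; only its owner is modelled.
struct ToyDecl {
  PTUIndex Owner; // the PTU that owns this declaration
};

// A brand-new declaration (not a redeclaration or specialization of
// anything) is owned by the active PTU.
inline ToyDecl makeNewDecl(PTUIndex ActivePTU) { return ToyDecl{ActivePTU}; }

// A specialization is owned by the most recent PTU among the declarations
// in its signature: the template itself plus any declarations named by its
// template arguments. Which PTU requested it does not matter.
inline ToyDecl
makeSpecialization(const ToyDecl &Template,
                   const std::vector<const ToyDecl *> &ArgDecls) {
  PTUIndex Owner = Template.Owner;
  for (const ToyDecl *Arg : ArgDecls)
    Owner = std::max(Owner, Arg->Owner);
  return ToyDecl{Owner};
}
```

In this model a template from PTU 2 instantiated with only builtin arguments 
stays in PTU 2, while the same template instantiated with a class declared in 
PTU 4 is pulled into PTU 4.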

The second key insight is that processing a PTU might extend an earlier PTU.  
Rolling back the later PTU shouldn't throw that extension away.  For example, 
if the second PTU defines a template, and the third PTU requires that template 
to be instantiated at `float`, that template specialization is still part of 
the second PTU.  Similarly, if the fifth PTU uses an inline function belonging 
to the fourth, that definition still belongs to the fourth.  When we go to emit 
code in a new PTU, we map each declaration we have to emit back to its owning 
PTU and emit it in a new module for just the extensions to that PTU.  We keep 
track of all the modules we've emitted for a PTU so that we can unload them all 
if we decide to roll it back.
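
The emission bookkeeping described above could look roughly like the 
following sketch. `ToyModule` and `ToyEmitter` are hypothetical names, symbols 
are plain strings, and "unloading" is modelled as just forgetting the modules; 
a real implementation would hand them to whatever JIT or linker layer is in 
use.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using PTUIndex = unsigned;

// Hypothetical stand-in for one emitted module.
struct ToyModule {
  PTUIndex ExtendsPTU;              // the PTU this module extends
  std::vector<std::string> Symbols; // what was emitted into it
};

class ToyEmitter {
  // Every module ever emitted for a PTU, so all can be unloaded together.
  std::map<PTUIndex, std::vector<ToyModule>> ModulesByPTU;

public:
  // Pending holds (owning PTU, symbol) pairs that must be emitted while
  // processing one new PTU. Each owner gets one fresh module containing
  // only the extensions to that owner.
  void emit(const std::vector<std::pair<PTUIndex, std::string>> &Pending) {
    std::map<PTUIndex, std::vector<std::string>> Grouped;
    for (const auto &Entry : Pending)
      Grouped[Entry.first].push_back(Entry.second);
    for (auto &Group : Grouped)
      ModulesByPTU[Group.first].push_back(
          ToyModule{Group.first, std::move(Group.second)});
  }

  std::size_t moduleCount(PTUIndex P) const {
    auto It = ModulesByPTU.find(P);
    return It == ModulesByPTU.end() ? 0 : It->second.size();
  }

  // Rolling back a PTU drops every module emitted for it, including the
  // ones created while later PTUs were processed. Returns how many.
  std::size_t rollBack(PTUIndex P) {
    std::size_t N = moduleCount(P);
    ModulesByPTU.erase(P);
    return N;
  }
};
```

Note that rolling back PTU 3 in this model leaves a `float` specialization 
that PTU 3 triggered in place, because its module is filed under PTU 2.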

Most declarations/definitions will only refer to entities from the same or 
earlier PTUs.  However, it is possible (primarily by defining a 
previously-declared entity, but also through templates or ADL) for an entity 
that belongs to one PTU to refer to something from a later PTU.  We will have 
to keep track of this and prevent unwinding the later PTU when we recognize it.  
Fortunately, this should be very rare; and crucially, we don't have to do the 
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.  
Otherwise, PTUs after the first just need to record enough metadata to be able 
to revert any changes they've made to declarations belonging to earlier PTUs, 
e.g. to redeclaration chains or template specialization lists.
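
One way to picture this metadata is a per-PTU undo log plus a set of PTUs 
that can no longer be unwound. This is only a sketch under simplifying 
assumptions: `ToyRollbackLog` is a hypothetical name, undo actions are opaque 
callbacks, and a PTU that has been referenced from an earlier one stays 
pinned forever.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <utility>
#include <vector>

using PTUIndex = unsigned;

class ToyRollbackLog {
  // Undo actions each PTU recorded for changes it made to declarations
  // owned by earlier PTUs (redeclaration chains, specialization lists...).
  std::map<PTUIndex, std::vector<std::function<void()>>> UndoByPTU;
  // Later PTUs that an entity owned by an earlier PTU now refers to.
  std::set<PTUIndex> Pinned;

public:
  void recordUndo(PTUIndex Active, std::function<void()> Undo) {
    UndoByPTU[Active].push_back(std::move(Undo));
  }

  // Only references that point forward in time need any bookkeeping.
  void noteReference(PTUIndex From, PTUIndex To) {
    if (To > From)
      Pinned.insert(To);
  }

  bool canRollBack(PTUIndex P) const { return Pinned.count(P) == 0; }

  // Replays the PTU's undo actions in reverse order. Returns false and
  // changes nothing if an earlier PTU depends on P.
  bool rollBack(PTUIndex P) {
    if (!canRollBack(P))
      return false;
    auto It = UndoByPTU.find(P);
    if (It != UndoByPTU.end()) {
      for (auto U = It->second.rbegin(); U != It->second.rend(); ++U)
        (*U)();
      UndoByPTU.erase(It);
    }
    return true;
  }
};
```

With a single PTU nothing is ever recorded, which matches the point that 
normal compilation should not pay for this bookkeeping.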

It should even eventually be possible for PTUs to provide their own slab 
allocators which can be thrown away as part of rolling back the PTU.  We can 
maintain a notion of the active allocator and allocate things like Stmt/Expr 
nodes in it, temporarily changing it to the appropriate PTU whenever we go to 
do something like instantiate a function template.  More care will be required 
when allocating declarations and types, though.
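
The "active allocator" idea can be sketched with a scoped switch. Everything 
here is hypothetical (`ToySlab`, `ToyContext`, `ActiveSlabScope`); the slab is 
a toy that simply owns its blocks, not a real bump allocator, and the sketch 
ignores the harder question of where declarations and types should live.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy slab: owns every block allocated through it and frees them all when
// it is destroyed, i.e. when its PTU is rolled back.
class ToySlab {
  std::vector<std::unique_ptr<unsigned char[]>> Blocks;

public:
  void *allocate(std::size_t Size) {
    Blocks.emplace_back(new unsigned char[Size]);
    return Blocks.back().get();
  }
  std::size_t blockCount() const { return Blocks.size(); }
};

// Hypothetical context that routes node allocations to the active slab.
struct ToyContext {
  ToySlab *Active = nullptr;
  void *allocateNode(std::size_t Size) {
    assert(Active && "no active slab");
    return Active->allocate(Size);
  }
};

// RAII helper: temporarily makes another PTU's slab active, for example
// while instantiating a function template owned by that PTU, and restores
// the previous slab on scope exit.
class ActiveSlabScope {
  ToyContext &Ctx;
  ToySlab *Saved;

public:
  ActiveSlabScope(ToyContext &C, ToySlab &S) : Ctx(C), Saved(C.Active) {
    C.Active = &S;
  }
  ~ActiveSlabScope() { Ctx.Active = Saved; }
  ActiveSlabScope(const ActiveSlabScope &) = delete;
  ActiveSlabScope &operator=(const ActiveSlabScope &) = delete;
};
```

Nodes allocated inside the scope land in the owning PTU's slab, so they 
survive a rollback of the PTU that merely triggered the instantiation.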

We would want the PTU to be efficiently recoverable from a `Decl`; I'm not sure 
how best to do that.  An easy option that would cover most declarations would 
be to make multiple `TranslationUnitDecl`s and parent the declarations 
appropriately, but I don't think that's good enough for things like member 
function templates, since an instantiation of that would still be parented by 
its original class.  Maybe we can work this into the DC chain somehow, like how 
lexical DCs are.
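
To make the limitation of the "multiple `TranslationUnitDecl`s" option 
concrete, here is a toy model of recovering the PTU by walking parent links. 
`ToyDC` and `ptuFromParentChain` are hypothetical and only mimic the shape of 
a declaration-context chain; they are not Clang's `DeclContext` API.

```cpp
#include <cassert>

using PTUIndex = unsigned;

// Toy declaration context. The root of each chain stands in for a per-PTU
// TranslationUnitDecl.
struct ToyDC {
  const ToyDC *Parent; // null for a per-PTU root
  PTUIndex RootPTU;    // meaningful only on roots
};

// Recover a PTU by walking up to the root of the parent chain.
inline PTUIndex ptuFromParentChain(const ToyDC &D) {
  const ToyDC *Cur = &D;
  while (Cur->Parent)
    Cur = Cur->Parent;
  return Cur->RootPTU;
}
```

For a class declared in PTU 1 whose member function template is instantiated 
with a type from PTU 3, the instantiation is parented by the class, so this 
walk reports PTU 1 even though the ownership rule would assign it to PTU 3. 
That mismatch is why the parent chain alone does not seem sufficient.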


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96033/new/

https://reviews.llvm.org/D96033

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
