hfinkel added inline comments.
================
Comment at: lib/CodeGen/CGDecl.cpp:1291
+// where the pointer to the local variable is the key in the map.
+void CodeGenFunction::EmitAutoVarNoAlias(const AutoVarEmission &emission) {
+ assert(emission.Variable && "emission was not valid!");
----------------
rjmccall wrote:
> This should really be a subroutine of EmitAutoVarAlloca; that will implicitly
> pick this logic up for a bunch of less standard places in IRGen that create
> local variables. Please make it a static function (if possible) called
> EmitNoAliasScope and call it immediately after EmitVarAnnotations.
Done (I didn't make it static, which is certainly possible, but didn't look
much cleaner to me - let me know if you want it static anyway).
================
Comment at: lib/CodeGen/CGStmt.cpp:537
+ llvm::LLVMContext::MD_noalias),
+ NewScopeList));
+
----------------
rjmccall wrote:
> rjmccall wrote:
> > hfinkel wrote:
> > > rjmccall wrote:
> > > > This is a very strange representation. Every memory operation in the
> > > > lexical block is annotated with a list of all of the scopes that were
> > > > entered within the block, even if they were entered after the
> > > > operation. But for some reason, not with nested scopes?
> > > >
> > > > What's the right patch for me to read about this representation?
> > > Perhaps unfortunately, this is an artifact of the way that restrict is
> > > defined in C. It applies to all accesses in the block in which the
> > > variable is declared, even those before the declaration of the
> > > restrict-qualified local itself.
> > >
> > > It should work with nested scopes, in the sense that we add these things
> > > as we complete each scope. So we add things to the inner scope, and then
> > > when we complete the outer scope, we go back over the instructions
> > > (including those in the inner scope because the scope recording recurses
> > > up the scope hierarchy), and adds the outer scopes - it concatenates them
> > > to any added by the inner (nested) scopes.
> > >
> > > The noalias intrinsic's LangRef updates are in D9375.
> > Ah, I see the recursion now. Please add a comment explaining that
> > expectation here.
> I would still like this comment. It should be enough to explain how
> MemoryInsts actually gets filled: there's a callback from inserting an
> instruction which adds all memory instructions to every active lexical scope
> that's recording them.
Added.
================
Comment at: lib/CodeGen/CodeGenFunction.cpp:1986
+ if (I->mayReadOrWriteMemory())
+ recordMemoryInstruction(I);
}
----------------
rjmccall wrote:
> This condition will include calls, which I assume is unnecessary (except
> maybe memory intrinsics?). More seriously, it will also include loads and
> stores into allocas, which will sweep in all sorts of bookkeeping temporaries
> that IRGen uses in more complex code-generation situations. You might want
> to consider ways to avoid recording this kind of instruction that are very
> likely to just get immediately optimized away by e.g. mem2reg.
>
> Please also modify LexicalScope so that it records whether there's any active
> recording scope in the CodeGenFunction. Lexical scopes are nested, so that
> should be as easy as saving the current state when you enter a scope and
> restoring it when you leave. That will allow you to optimize this code by
> completely bypassing the recursion and even the check for whether the
> instruction is a memory instruction in the extremely common case of a
> function with no restrict variables.
>
> In general, a lot of things about this approach seem to have worryingly poor
> asymptotic performance in the face of large functions or (especially) many
> restrict variables. (You could, for example, have chosen to have each memory
> instruction link to its enclosing noalias scope, which would contain a link
> to its enclosing scope and a list of the variables it directly declares.
> This would be a more complex representation to consume, but it would not
> require inherently quadratic frontend work building all these lists.) The
> only saving grace is that we expect very little code to using restrict, so it
> becomes vital to ensure that your code is immediately short-circuited when
> nothing is using restrict.
> This condition will include calls, which I assume is unnecessary (except
> maybe memory intrinsics?).
No, getting calls here was intentional. The AA implementation can specifically
deal with calls (especially those that are known only to access memory given by
their arguments).
> More seriously, it will also include loads and stores into allocas, which
> will sweep in all sorts of bookkeeping temporaries that IRGen uses in more
> complex code-generation situations. You might want to consider ways to avoid
> recording this kind of instruction that are very likely to just get
> immediately optimized away by e.g. mem2reg.
I agree, but this might be difficult to do in general. We should be able to
avoid annotating accesses to allocas that don't escape. Do we have a sound way
to determine at this stage whether a local variable has had its address taken?
> Please also modify LexicalScope so that it records whether there's any active
> recording scope in the CodeGenFunction. Lexical scopes are nested, so that
> should be as easy as saving the current state when you enter a scope and
> restoring it when you leave. That will allow you to optimize this code by
> completely bypassing the recursion and even the check for whether the
> instruction is a memory instruction in the extremely common case of a
> function with no restrict variables.
I added a variable to the scope to cache whether or not anything is being
recorded. This avoids the quadratic recording-check behavior (unless some scope
is actually recording something).
> In general, a lot of things about this approach seem to have worryingly poor
> asymptotic performance in the face of large functions or (especially) many
> restrict variables. (You could, for example, have chosen to have each memory
> instruction link to its enclosing noalias scope, which would contain a link
> to its enclosing scope and a list of the variables it directly declares. This
> would be a more complex representation to consume, but it would not require
> inherently quadratic frontend work building all these lists.) The only saving
> grace is that we expect very little code to using restrict, so it becomes
> vital to ensure that your code is immediately short-circuited when nothing is
> using restrict.
We're definitely on the same page here. The reasons I didn't worry about this
too much is that, as you note, we expect this to be rare, and moreover, the
underlying metadata representation being created is itself quadratic
(#restricts x #accesses). The good news is that this new representation is
actually less verbose than the existing noalias metadata. The bad news is that,
if this turns out to be a problem, and it could, then we'll need to either add
cutoffs or design some different representation (or both).
================
Comment at: lib/CodeGen/CodeGenFunction.h:597
+ // those blocks) so that we can later add the appropriate metadata. Record
+ // this instruction and so the same in any parent scopes.
+ void recordMemoryInstruction(llvm::Instruction *I) {
----------------
rjmccall wrote:
> I would suggest this wording:
>
> /// Record the given memory access instruction in this and all of its
> enclosing scopes.
> /// When we close this scope, we'll add the list of all the
> restrict-qualified local variables
> /// it declared to all the memory accesses which occurred within it.
> Recording is only
> /// enabled under optimization, and it's disabled in a scope unless it
> actually declares
> /// any local restrict variables.
Sounds good (updated).
https://reviews.llvm.org/D9403
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits