v.g.vassilev added inline comments.
================
Comment at: clang/lib/CodeGen/CodeGenAction.cpp:908
+CodeGenerator *CodeGenAction::getCodeGenerator() const {
+ return BEConsumer->getCodeGenerator();
----------------
lhames wrote:
> v.g.vassilev wrote:
> > sgraenitz wrote:
> > > v.g.vassilev wrote:
> > > > @rjmccall, we were wondering if there is a better way to ask CodeGen to
> > > > start a new module. The current approach seems to drill a hole through a
> > > > number of abstraction layers.
> > > >
> > > > In the past we have touched that area a little in
> > > > https://reviews.llvm.org/D34444 and the answer may already be there, but
> > > > I fail to connect the dots.
> > > >
> > > > Recently, we thought about having a new FrontendAction callback for
> > > > beginning a new phase when compiling incremental input. We need to keep
> > > > track of the created objects (needed for error recovery) in our
> > > > Transaction. We can have a map of `Transaction*` to `llvm::Module*` in
> > > > CodeGen. The issue is that new JITs take ownership of the
> > > > `llvm::Module*` which seems to make it impossible to support jitted
> > > > code removal with that model (cc: @lhames, @rsmith).
> > > When compiling incrementally, does a "new phase" mean that all
> > > subsequent code will go into a new module from then on? How will
> > > dependencies on previous symbols be handled? Are all symbols external?
> >
> > There is some discussion on this here https://reviews.llvm.org/D34444#812418
> >
> > I think the relevant bit is that 'we have just one ever growing TU [...]
> > which we send to RuntimeDyld, allowing only the JIT to resolve symbols from
> > it. We aid the JIT when resolving symbols with internal linkage by
> > changing all internal linkage to external (We haven't seen issues with that
> > approach)'.
> >
> > >
> > > > The issue is that new JITs take ownership of the llvm::Module*
> > >
> > > That's true, but you can still keep a raw pointer to it, which will be
> > > valid at least as long as the module wasn't linked.
> >
> > That was my first implementation when I upgraded cling to LLVM 9, where the
> > `shared_ptr`s became `unique_ptr`s. This was quite problematic for many of
> > the use cases we support, as the JIT's behavior is somewhat unpredictable
> > from the high-level API user's perspective.
> >
> >
> > >Afterwards it depends on the linker:
> > > * RuntimeDyld can return ownership of the object's memory range via
> > > `NotifyEmittedFunction`
> > > * JITLink provides the `ReturnObjectBufferFunction` for the same purpose
> > >
> >
> > That's exactly what we ended up doing (I would like to thank Lang here, who
> > gave similar advice).
> >
> > > > seems to make it impossible to support jitted code removal with that
> > > > model
> > >
> > > Can you figure out what symbols are affected and remove these? A la:
> > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/include/llvm/ExecutionEngine/Orc/Core.h#L1020
> > >
> > > I think @anarazel has ported a client with code removal to OrcV2
> > > successfully in the past. Maybe there's something we can learn from it.
> >
> > Indeed. That's not yet on my radar, as it seemed somewhat distant in time.
> >
> > Recently, we thought about having a new FrontendAction callback for
> > beginning a new phase when compiling incremental input. We need to keep
> > track of the created objects (needed for error recovery) in our
> > Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen.
> > The issue is that new JITs take ownership of the llvm::Module* which seems
> > to make it impossible to support jitted code removal with that model (cc:
> > @lhames, @rsmith).
>
> In the new APIs, in order to enable removable code, you can associate Modules
> with ResourceTrackers when they're added to the JIT. The ResourceTrackers
> then allow for removal. Idiomatic usage looks like:
>
> auto Mod = /* create module */;
> auto RT = JD.createResourceTracker();
> J.addModule(RT, std::move(Mod));
> //...
> if (auto Err = RT.remove())
>   /* handle Err */;
Nice, thanks!
>
> > we have just one ever growing TU [...] which we send to RuntimeDyld...
>
> So is a TU the same as an llvm::Module in this context? If so, how do you
> reconcile that with the JIT taking ownership of modules? Are you just copying
> the Module each time before adding it?
Each incremental chunk that grows the TU has a corresponding `llvm::Module`.
Once clang's CodeGen is done with a particular module, it transfers ownership
to the `Transaction`, which in turn hands it to the JIT; once the JIT is done,
the `Transaction` reclaims ownership.
>
> > We need to keep track of the created objects (needed for error recovery) in
> > our Transaction.
>
> Do you need the Module* for error recovery? Or just the Decls?
Yes, we need an `llvm::Module` that corresponds to the Decls, as CodeGen will
sometimes decide not to emit a Decl.
================
Comment at: clang/lib/Interpreter/IncrementalExecutor.cpp:29-51
+ using namespace llvm::orc;
+ llvm::ErrorAsOutParameter EAO(&Err);
+ auto JitOrErr = LLJITBuilder().create();
+ if (auto Err2 = JitOrErr.takeError()) {
+ Err = std::move(Err2);
+ return;
+ }
----------------
lhames wrote:
> I think this can be shortened to:
>
> using namespace llvm::orc;
> llvm::ErrorAsOutParameter EAO(&Err);
>
> if (auto JitOrErr = LLJITBuilder().create())
>   Jit = std::move(*JitOrErr);
> else {
>   Err = JitOrErr.takeError();
>   return;
> }
>
> const auto &DL = Jit->getDataLayout();
> if (auto PSGOrErr =
>       DynamicLibrarySearchGenerator::GetForCurrentProcess(DL.getGlobalPrefix()))
>   Jit->getMainJITDylib().addGenerator(std::move(*PSGOrErr));
> else {
>   Err = PSGOrErr.takeError();
>   return;
> }
>
> You don't need the call to
> `llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);` any more:
> DynamicLibrarySearchGenerator::GetForCurrentProcess does that for you.
Cool, thanks!
================
Comment at: clang/lib/Interpreter/IncrementalExecutor.cpp:56-59
+llvm::Error IncrementalExecutor::addModule(std::unique_ptr<llvm::Module> M) {
+ llvm::orc::ThreadSafeContext TSCtx(std::make_unique<llvm::LLVMContext>());
+ return Jit->addIRModule(llvm::orc::ThreadSafeModule(std::move(M), TSCtx));
+}
----------------
lhames wrote:
> This doesn't look right. The ThreadSafeContext has to contain the LLVMContext
> for the module, but here you're creating a new unrelated ThreadSafeContext.
Thanks. I think I fixed it now. Can you take a look?
================
Comment at: clang/lib/Interpreter/IncrementalExecutor.h:36
+ llvm::Error addModule(std::unique_ptr<llvm::Module> M);
+ llvm::Error runCtors() const;
+};
----------------
lhames wrote:
> v.g.vassilev wrote:
> > sgraenitz wrote:
> > > v.g.vassilev wrote:
> > > > teemperor wrote:
> > > > > Should we maybe merge `runCtors` and `addModule`? Not sure if there
> > > > > is a use case for adding a Module but not running Ctors. Also
> > > > > documentation.
> > > > The case we have is when there is no JIT -- currently we have such a
> > > > case in IncrementalProcessingTest I think. Another example, which will
> > > > show up in future patches, is the registration of atexit handlers. That
> > > > is, before we `runCtors` we make a pass over the LLVM IR and collect
> > > > some specific details and (possibly change the IR and then run).
> > > >
> > > > I'd rather keep it separate for now if that's okay.
> > >
> > > > Should we maybe merge runCtors and addModule?
> > >
> > > +1 even though there may be open questions regarding incremental
> > > initialization.
> > >
> > > > The case we have is when there is no JIT -- currently we have such a
> > > > case in IncrementalProcessingTest
> > >
> > > Can you run anything if there is no JIT? I think what you have in
> > > `IncrementalProcessing.EmitCXXGlobalInitFunc` is
> > > `getGlobalInit(llvm::Module*)`, which checks for symbol names with a
> > > specific prefix.
> >
> > Yes, I think such a mode is useful for testing, but also for other cases
> > where the user is handed a `Transaction*` and allowed to make some
> > modifications before the `llvm::Module` is processed.
> >
> > >
> > > > before we runCtors we make a pass over the LLVM IR and collect some
> > > > specific details and (possibly change the IR and then run).
> > >
> > > The idiomatic solution for such modifications would use an
> > > IRTransformLayer as in:
> > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/examples/OrcV2Examples/LLJITWithOptimizingIRTransform/LLJITWithOptimizingIRTransform.cpp#L108
> >
> > That looks very nice. It assumes the JIT is exposed to the users; here we
> > expose only the `llvm::Module` (I am not arguing whether that's a good idea
> > in general).
> >
> > >
> > > > Another example, which will show up in future patches, is the
> > > > registration of atexit handlers
> > >
> > > `atexit` handlers as well as global ctors/dtors should be covered by
> > > LLJIT PlatformSupport. The LLJITBuilder will inject a
> > > GenericLLVMIRPlatformSupport instance by default:
> > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L125
> >
> > Does that give me control over when the `atexit` handlers are called? Can
> > the interpreter call them at a time of its choosing?
> >
> > >
> > > It's not as comprehensive as e.g. the MachO implementation, but should be
> > > sufficient for your use-case as you have IR for all your JITed code. (It
> > > would NOT work if you cached object files, reloaded them in a subsequent
> > > session and wanted to run their ctors.) So, your below call to
> > > `initialize()` should do it already.
> > >
> >
> >
> > > > Should we maybe merge runCtors and addModule?
> > > +1 even though there may be open questions regarding incremental
> > > initialization.
>
> In the long term constructors should be run via the Orc runtime (currently
> planned for initial release in LLVM 13 later this year). I like the idea of
> keeping "add module" and "run initializers" as two separate steps, with
> initializers being run only when you execute a top level expression. It would
> allow for workflows like this:
>
> interpreter% :load a.cpp
> interpreter% :load b.cpp
>
> where an initializer in a.cpp depends on code in b.cpp. It would also allow
> for defining constructors with forward references in the REPL itself.
>
> The Orc runtime is currently focused on emulating the usual execution
> environment: The canonical way to execute initializers is by calling
> jit_dlopen on the target JITDylib. I think the plan should be to generalize
> this behavior (either in the jit_dlopen contract, or by introducing a
> jit_dlopen_repl function) to allow for repeated calls to dlopen, with each
> subsequent dlopen call executing any discovered-but-not-yet-run initializers.
>
>
>
> >> Does that give me control over when the atexit handlers are called? Can
> >> the interpreter call them at its choice?
> >
> > It's not as comprehensive as e.g. the MachO implementation, but should be
> > sufficient for your use-case as you have IR for all your JITed code. (It
> > would NOT work if you cached object files, reloaded them in a subsequent
> > session and wanted to run their ctors.) So, your below call to initialize()
> > should do it already.
>
> Yep -- initialize should run the constructors, which should call cxa_atexit.
> The cxa_atexit calls should be interposed by GenericLLVMIRPlatform, and the
> atexits run when you call LLJIT::deinitialize on the JITDylib. There are some
> basic regression tests for this, but it hasn't been stress tested yet.
>
> GenericLLVMIRPlatform should actually support initializers in cached object
> files that were compiled from Modules added to LLJIT: The platform replaces
> llvm.global_ctors with an init function with a known name, then looks for
> that name in objects loaded for the cache. At least that was the plan, I
> don't recall whether it has actually been tested. What definitely doesn't
> work is running initializers in objects produced outside LLJIT. That will be
> fixed by JITLink/ELF and the Orc Runtime though (and already works for MachO
> in the runtime prototype).
@sgraenitz, @lhames, thanks for the clarifications.
I am marking your comments as resolved (for easier tracking on my end). If the
intent was to change something in this patch, could you elaborate a little more
on what specifically I need to do here?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D96033/new/
https://reviews.llvm.org/D96033
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits