v.g.vassilev added inline comments.
================ Comment at: clang/lib/CodeGen/CodeGenAction.cpp:908 +CodeGenerator *CodeGenAction::getCodeGenerator() const { + return BEConsumer->getCodeGenerator(); ---------------- lhames wrote: > v.g.vassilev wrote: > > sgraenitz wrote: > > > v.g.vassilev wrote: > > > > @rjmccall, we were wondering if there is a better way to ask CodeGen to > > > > start a new module. The current approach seems to be drilling hole in a > > > > number of abstraction layers. > > > > > > > > In the past we have touched that area a little in > > > > https://reviews.llvm.org/D34444 and the answer may be already there but > > > > I fail to connect the dots. > > > > > > > > Recently, we thought about having a new FrontendAction callback for > > > > beginning a new phase when compiling incremental input. We need to keep > > > > track of the created objects (needed for error recovery) in our > > > > Transaction. We can have a map of `Transaction*` to `llvm::Module*` in > > > > CodeGen. The issue is that new JITs take ownership of the > > > > `llvm::Module*` which seems to make it impossible to support jitted > > > > code removal with that model (cc: @lhames, @rsmith). > > > When compiling incrementally, doeas a "new phase" mean that all > > > subsequent code will go into a new module from then on? How will > > > dependencies to previous symbols be handled? Are all symbols external? > > > > > > > The issue is that new JITs take ownership of the llvm::Module* > > > > > > That's true, but you can still keep a raw pointer to it, which will be > > > valid at least as long as the module wasn't linked. Afterwards it depends > > > on the linker: > > > * RuntimeDyld can return ownership of the object's memory range via > > > `NotifyEmittedFunction` > > > * JITLink provides the `ReturnObjectBufferFunction` for the same purpose > > > > > > > seems to make it impossible to support jitted code removal with that > > > > model > > > > > > Can you figure out what symbols are affected and remove these? A la: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/include/llvm/ExecutionEngine/Orc/Core.h#L1020 > > > > > > I think @anarazel has ported a client with code removal to OrcV2 > > > successfully in the past. Maybe there's something we can learn from it. > > > When compiling incrementally, doeas a "new phase" mean that all > > > subsequent code will go into a new module from then on? How will > > > dependencies to previous symbols be handled? Are all symbols external? > > > > There is some discussion on this here https://reviews.llvm.org/D34444#812418 > > > > I think the relevant bit is that 'we have just one ever growing TU [...] > > which we send to the RuntimeDyLD allowing only JIT to resolve symbols from > > it. We aid the JIT when resolving symbols with internal linkage by > > changing all internal linkage to external (We haven't seen issues with that > > approach)'. > > > > > > > > > The issue is that new JITs take ownership of the llvm::Module* > > > > > > That's true, but you can still keep a raw pointer to it, which will be > > > valid at least as long as the module wasn't linked. > > > > That was my first implementation when I upgraded cling to llvm9 where the > > `shared_ptr`s went to `unique_ptr`s. This was quite problematic for many of > > the use cases we support as the JIT is somewhat unpredictable to the > > high-level API user. > > > > > > >Afterwards it depends on the linker: > > > * RuntimeDyld can return ownership of the object's memory range via > > > `NotifyEmittedFunction` > > > * JITLink provides the `ReturnObjectBufferFunction` for the same purpose > > > > > > > That's exactly what we ended up doing (I would like to thank Lang here who > > gave a similar advice). > > > > > > seems to make it impossible to support jitted code removal with that > > > > model > > > > > > Can you figure out what symbols are affected and remove these? A la: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/include/llvm/ExecutionEngine/Orc/Core.h#L1020 > > > > > > I think @anarazel has ported a client with code removal to OrcV2 > > > successfully in the past. Maybe there's something we can learn from it. > > > > Indeed. That's not yet on my radar as seemed somewhat distant in time. > > > > Recently, we thought about having a new FrontendAction callback for > > beginning a new phase when compiling incremental input. We need to keep > > track of the created objects (needed for error recovery) in our > > Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen. > > The issue is that new JITs take ownership of the llvm::Module* which seems > > to make it impossible to support jitted code removal with that model (cc: > > @lhames, @rsmith). > > In the new APIs, in order to enable removable code, you can associate Modules > with ResourceTrackers when they're added to the JIT. The ResourceTrackers > then allow for removal. Idiomatic usage looks like: > > auto Mod = /* create module */; > auto RT = JD.createResourceTracker(); > J.addModule(RT, std::move(Mod)); > //... > if (auto Err = RT.remove()) > /* handle Err */; > > > we have just one ever growing TU [...] which we send to RuntimeDyld... > > So is a TU the same as an llvm::Module in this context? If so, how do you > reconcile that with the JIT taking ownership of modules? Are you just copying > the Module each time before adding it? > > > We need to keep track of the created objects (needed for error recovery) in > > our Transaction. > > Do you need the Module* for error recovery? Or just the Decls? > > Recently, we thought about having a new FrontendAction callback for > > beginning a new phase when compiling incremental input. We need to keep > > track of the created objects (needed for error recovery) in our > > Transaction. We can have a map of Transaction* to llvm::Module* in CodeGen. > > The issue is that new JITs take ownership of the llvm::Module* which seems > > to make it impossible to support jitted code removal with that model (cc: > > @lhames, @rsmith). > > In the new APIs, in order to enable removable code, you can associate Modules > with ResourceTrackers when they're added to the JIT. The ResourceTrackers > then allow for removal. Idiomatic usage looks like: > > auto Mod = /* create module */; > auto RT = JD.createResourceTracker(); > J.addModule(RT, std::move(Mod)); > //... > if (auto Err = RT.remove()) > /* handle Err */; Nice, thanks! > > > we have just one ever growing TU [...] which we send to RuntimeDyld... > > So is a TU the same as an llvm::Module in this context? If so, how do you > reconcile that with the JIT taking ownership of modules? Are you just copying > the Module each time before adding it? Each incremental chunk with which the TU grows has a corresponding `llvm::Module`. Once clang's CodeGen is done for the particular module it transfers the ownership to the `Transaction` which, in turn, hands it to the JIT and once the JIT is done it retains the ownership again. > > > We need to keep track of the created objects (needed for error recovery) in > > our Transaction. > > Do you need the Module* for error recovery? Or just the Decls? Yes, we need a `llvm::Module` that corresponds to the Decls as sometimes CodeGen will decide not to emit a Decl. ================ Comment at: clang/lib/Interpreter/IncrementalExecutor.cpp:29-51 + using namespace llvm::orc; + llvm::ErrorAsOutParameter EAO(&Err); + auto JitOrErr = LLJITBuilder().create(); + if (auto Err2 = JitOrErr.takeError()) { + Err = std::move(Err2); + return; + } ---------------- lhames wrote: > I think this can be shortened to: > > using namespace llvm::orc; > llvm::ErrorAsOutParameter EAO(&Err); > > if (auto JitOrErr = LLJITBuilder.create()) > Jit = std::move(*JitOrErr); > else { > Err = JitOrErr.takeError(); > return; > } > > const auto &DL = Jit->getDataLayout(); > if (auto PSGOrErr = > DynamicLibrarySearchGenerator::GetForCurrentProcess(DL.getGlobalPrefix())) > Jit->getMainJITDylib().addGenerator(std::move(*PSGOrErr)); > else { > Err = PSGOrErr.takeError(); > return; > } > > You don't need the call to > `llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);` any more: > DynamicLibrarySearchGenerator::GetForCurrentProcess does that for you. Cool, thanks! ================ Comment at: clang/lib/Interpreter/IncrementalExecutor.cpp:56-59 +llvm::Error IncrementalExecutor::addModule(std::unique_ptr<llvm::Module> M) { + llvm::orc::ThreadSafeContext TSCtx(std::make_unique<llvm::LLVMContext>()); + return Jit->addIRModule(llvm::orc::ThreadSafeModule(std::move(M), TSCtx)); +} ---------------- lhames wrote: > This doesn't look right. The ThreadSafeContext has to contain the LLVMContext > for the module, but here you're creating a new unrelated ThreadSafeContext. Thanks. I think I fixed it now. Can you take a look? ================ Comment at: clang/lib/Interpreter/IncrementalExecutor.h:36 + llvm::Error addModule(std::unique_ptr<llvm::Module> M); + llvm::Error runCtors() const; +}; ---------------- lhames wrote: > v.g.vassilev wrote: > > sgraenitz wrote: > > > v.g.vassilev wrote: > > > > teemperor wrote: > > > > > Should we maybe merge `runCtors` and `addModule`? Not sure if there > > > > > is a use case for adding a Module but not running Ctors. Also > > > > > documentation. > > > > The case we have is when there is no JIT -- currently we have such a > > > > case in IncrementalProcessingTest I think. Another example, which will > > > > show up in future patches, is the registration of atexit handlers. That > > > > is, before we `runCtors` we make a pass over the LLVM IR and collect > > > > some specific details and (possibly change the IR and then run). > > > > > > > > I'd rather keep it separate for now if that's okay. > > > > Should we maybe merge runCtors and addModule? > > > > > > +1 even though there may be open questions regarding incremental > > > initialization. > > > > > > > The case we have is when there is no JIT -- currently we have such a > > > > case in IncrementalProcessingTest > > > > > > Can you run anything if there is no JIT? I think what you have in > > > `IncrementalProcessing.EmitCXXGlobalInitFunc` is > > > `getGlobalInit(llvm::Module*)`, which checks for symbol names with a > > > specific prefix. > > > > > > > before we runCtors we make a pass over the LLVM IR and collect some > > > > specific details and (possibly change the IR and then run). > > > > > > The idiomatic solution for such modifications would use an > > > IRTransformLayer as in: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/examples/OrcV2Examples/LLJITWithOptimizingIRTransform/LLJITWithOptimizingIRTransform.cpp#L108 > > > > > > > Another example, which will show up in future patches, is the > > > > registration of atexit handlers > > > > > > `atexit` handlers as well as global ctors/dtors should be covered by > > > LLJIT PlatformSupport. The LLJITBuilder will inject a > > > GenericLLVMIRPlatformSupport instance by default: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L125 > > > > > > It's not as comprehensive as e.g. the MachO implementation, but should be > > > sufficient for your use-case as you have IR for all your JITed code. (It > > > would NOT work if you cached object files, reloaded them in a subsequent > > > session and wanted to run their ctors.) So, your below call to > > > `initialize()` should do it already. > > > > > > > Should we maybe merge runCtors and addModule? > > > > > > +1 even though there may be open questions regarding incremental > > > initialization. > > > > > > > The case we have is when there is no JIT -- currently we have such a > > > > case in IncrementalProcessingTest > > > > > > Can you run anything if there is no JIT? I think what you have in > > > `IncrementalProcessing.EmitCXXGlobalInitFunc` is > > > `getGlobalInit(llvm::Module*)`, which checks for symbol names with a > > > specific prefix. > > > > Yes, I'd think such mode is useful for testing but also for other cases > > where the user is handed a Transaction* and allowed to make some > > modification before processing the `llvm::Module` > > > > > > > > > before we runCtors we make a pass over the LLVM IR and collect some > > > > specific details and (possibly change the IR and then run). > > > > > > The idiomatic solution for such modifications would use an > > > IRTransformLayer as in: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/examples/OrcV2Examples/LLJITWithOptimizingIRTransform/LLJITWithOptimizingIRTransform.cpp#L108 > > > > That looks very nice. It assumes the JIT is open to the users, here we open > > only the `llvm::Module` (I am not arguing if that's a good idea in general). > > > > > > > > > Another example, which will show up in future patches, is the > > > > registration of atexit handlers > > > > > > `atexit` handlers as well as global ctors/dtors should be covered by > > > LLJIT PlatformSupport. The LLJITBuilder will inject a > > > GenericLLVMIRPlatformSupport instance by default: > > > https://github.com/llvm/llvm-project/blob/13f4448ae7db1a47/llvm/lib/ExecutionEngine/Orc/LLJIT.cpp#L125 > > > > Does that give me control over when the `atexit` handlers are called? Can > > the interpreter call them at its choice? > > > > > > > > It's not as comprehensive as e.g. the MachO implementation, but should be > > > sufficient for your use-case as you have IR for all your JITed code. (It > > > would NOT work if you cached object files, reloaded them in a subsequent > > > session and wanted to run their ctors.) So, your below call to > > > `initialize()` should do it already. > > > > > > > > >> Should we maybe merge runCtors and addModule? > > +1 even though there may be open questions regarding incremental > > initialization. > > In the long term constructors should be run via the Orc runtime (currently > planned for initial release in LLVM 13 later this year). I like the idea of > keeping "add module" and "run initializers" as two separate steps, with > initializers being run only when you execute a top level expression. It would > allow for workflows like this: > > interpreter% :load a.cpp > interpreter% :load b.cpp > > where an initializer in a.cpp depends on code in b.cpp. It would also allow > for defining constructors with forward references in the REPL itself. > > The Orc runtime is currently focused on emulating the usual execution > environment: The canonical way to execute initializers is by calling > jit_dlopen on the target JITDylib. I think the plan should be to generalize > this behavior (either in the jit_dlopen contract, or by introducing a > jit_dlopen_repl function) to allow for repeated calls to dlopen, with each > subsequent dlopen call executing any discovered-but-not-yet-run initializers. > > > > >> Does that give me control over when the atexit handlers are called? Can > >> the interpreter call them at its choice? > > > > It's not as comprehensive as e.g. the MachO implementation, but should be > > sufficient for your use-case as you have IR for all your JITed code. (It > > would NOT work if you cached object files, reloaded them in a subsequent > > session and wanted to run their ctors.) So, your below call to initialize() > > should do it already. > > Yep -- initialize should run the constructors, which should call cxa_atexit. > The cxa_atexit calls should be interposed by GenericLLVMIRPlatform, and the > atexits run when you call LLJIT::deinitialize on the JITDylib. There are some > basic regression tests for this, but it hasn't been stress tested yet. > > GenericLLVMIRPlatform should actually support initializers in cached object > files that were compiled from Modules added to LLJIT: The platform replaces > llvm.global_ctors with an init function with a known name, then looks for > that name in objects loaded for the cache. At least that was the plan, I > don't recall whether it has actually been tested. What definitely doesn't > work is running initializers in objects produced outside LLJIT. That will be > fixed by JITLink/ELF and the Orc Runtime though (and already works for MachO > in the runtime prototype). @sgraenitz, @lhames, thanks for the clarifications. I am marking your comments as resolved (for easier tracking on my end). If the intent was to change something in this patch could you elaborate a little more what specifically I need to do here? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96033/new/ https://reviews.llvm.org/D96033 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits