================
@@ -1692,6 +1692,19 @@ 
PassBuilder::buildFatLTODefaultPipeline(OptimizationLevel Level, bool ThinLTO,
   if (ThinLTO && PGOOpt && PGOOpt->Action == PGOOptions::SampleUse)
     MPM.addPass(buildThinLTODefaultPipeline(Level, /*ImportSummary=*/nullptr));
   else {
+    // ModuleSimplification does not run the coroutine passes for
+    // ThinLTOPreLink, so we need the coroutine passes to run for ThinLTO
+    // builds, otherwise they will miscompile.
+    if (ThinLTO) {
+      // TODO: replace w/ buildCoroWrapper() when it takes phase and level into
+      // consideration.
+      CGSCCPassManager CGPM;
+      CGPM.addPass(CoroSplitPass(Level != OptimizationLevel::O0));
+      CGPM.addPass(CoroAnnotationElidePass());
+      MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(std::move(CGPM)));
+      MPM.addPass(CoroCleanupPass());
----------------
nikic wrote:

I think the current PR is a good way to start, as it avoids the miscompilation 
in a minimal way that is suitable for backporting.

Using the module simplification pipeline here (+ the module optimization 
pipeline) would effectively be the same as using the ThinLTO post-link pipeline.

I ran some tests, and it looks like adding `-ffat-lto-objects` adds about 10% 
overhead to the clang build 
(https://llvm-compile-time-tracker.com/compare.php?from=44923d8631fb28b4de54d4210762f256c3894cef&to=2bdf721c2d37af6fcbd931d963f586478cb55f17&stat=instructions:u).
 On top of that, switching this from using the module optimization pipeline to 
the ThinLTO post-link pipeline (your original PR) adds an extra 4%: 
https://llvm-compile-time-tracker.com/compare.php?from=2bdf721c2d37af6fcbd931d963f586478cb55f17&to=63666418fbe19f30bf971796747a751b4e1c57f3&stat=instructions:u

Adding 4% overhead is not great, but also not terrible, so maybe it's worth it 
to avoid pipeline compatibility issues in a principled way. It does make the 
codegen for FatLTO ELF diverge more from a normal compilation though.

Ideally we'd be able to rerun the simplification pipeline, but skip the inliner 
pipeline for already optimized functions. (I think @aeubanks had a prototype 
that did that for the actual ThinLTO scenario, by looking at 
available_externally functions. The situation here is somewhat different.)

https://github.com/llvm/llvm-project/pull/134434
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to