nickdesaulniers added a comment.

In D141451#4064298 <https://reviews.llvm.org/D141451#4064298>, @dblaikie wrote:

> Right - I was thinking more, as above, about directly using the existing 
> metadata generation (if it's too expensive to enable by default, then 
> possibly under an off-by-default warning or other flag) that the inliner 
> already knows how to read and write, rather than creating new/different 
> metadata handling.
> Again, might be worth knowing what the cost of the debug info metadata loc 
> tracking mode is.

For a first order approximation, I have this diff applied:

  diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
  index 436226c6f178..6d5049803188 100644
  --- a/clang/include/clang/Basic/CodeGenOptions.def
  +++ b/clang/include/clang/Basic/CodeGenOptions.def
  @@ -386,7 +386,7 @@ VALUE_CODEGENOPT(SmallDataLimit, 32, 0)
   VALUE_CODEGENOPT(SSPBufferSize, 32, 0)
   
   /// The kind of generated debug info.
  -ENUM_CODEGENOPT(DebugInfo, codegenoptions::DebugInfoKind, 4, 
codegenoptions::NoDebugInfo)
  +ENUM_CODEGENOPT(DebugInfo, codegenoptions::DebugInfoKind, 4, 
codegenoptions::LocTrackingOnly)
   
   /// Whether to generate macro debug info.
   CODEGENOPT(MacroDebugInfo, 1, 0)

This changes the default value of `codegenopts.DebugInfo` from 
`codegenoptions::NoDebugInfo` to `codegenoptions::LocTrackingOnly`, which is 
what the optimization remark emitter infra uses.  This emits more debug info 
than we need (metadata for every statement in IR rather than JUST `call` 
instructions).  But it would give me analogous metadata I could use to solve 
this problem addressed by this patch, and it would guarantee that I had precise 
col/line info.

Comparing 30 Linux kernel x86_64 defconfig (i.e. no debug info) builds with vs 
without that change:

Without (baseline):

  $ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 
-j128 -s'
  Benchmark 1: make LLVM=1 -j128 -s
    Time (mean ± σ):     61.592 s ±  0.156 s    [User: 4378.360 s, System: 
312.040 s]
    Range (min … max):   61.283 s … 62.026 s    30 runs

With diff from above:

  $ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 
-j128 -s'  
  Benchmark 1: make LLVM=1 -j128 -s
    Time (mean ± σ):     62.228 s ±  0.523 s    [User: 4433.828 s, System: 
312.908 s]
    Range (min … max):   61.825 s … 64.912 s    30 runs

So that's a slowdown from the mean of ~1.02% ((1 - 61.592/62.228)*100). We 
probably could claw some of that back if we had another level of 
`codegenoptions` between `NoDebugInfo` and `LocTrackingOnly` that only emitted 
LocTracking info for call insts (stated another way, omit `DILocation` for all 
instructions other than `call`). I'm guessing that would take significant work 
to add to clang; I wasn't able how to figure out how to do so quickly.  I 
imagine updating the non-debug-info clang tests to also be a treat.  Is a 1% 
compile time performance hit worth more precise backend diagnostics? Probably 
(IMO). Perhaps worth an RFC?

---

FWIW, here are measurements of D141451 <https://reviews.llvm.org/D141451> @ 
Diff 488371 at the same base as the baseline:

  $ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 
-j128 -s'
  Benchmark 1: make LLVM=1 -j128 -s
    Time (mean ± σ):     61.702 s ±  0.456 s    [User: 4395.106 s, System: 
313.120 s]
    Range (min … max):   61.409 s … 64.000 s    30 runs

That's ~0.17% slower (and my machine is running hot from benchmarking for the 
past 1.5hrs). Note this patch/approach doesn't have precise line info.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141451/new/

https://reviews.llvm.org/D141451

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
  • [PATCH] D141451: [clang] ... Nick Desaulniers via Phabricator via cfe-commits

Reply via email to