[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/75092 Before all the call probe ids are after block ids, in this change, it mixed the call probe and block probe by reordering them in lexical(line-number) order. For example: ``` main(): BB1 if(...) BB2 foo(..); else BB3 bar(...); BB4 ``` Before the profile is ``` main 1: .. 2: .. 3: ... 4: ... 5: foo ... 6: bar ... ``` Now the new order is ``` main 1: .. 2: .. 3: foo ... 4: ... 5: bar ... 6: ... ``` This can potentially make it more tolerant of profile mismatch, either from stale profile or frontend change. e.g. before if we add one block, even the block is the last one, all the call probes are shifted and mismatched. Moreover, this makes better use of call-anchor based stale profile matching. Blocks are matched based on the closest anchor, there would be more anchors used for the matching, reduce the mismatch scope. >From 90c3c829e67492e92b86517e92a49c6f90e83b4f Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 16 +++--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 56 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02b..360f831e842945 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de2..69b87adf105fd1 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 8f0b12d0cfedfc..e2c5eaf261d298 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -209,7 +208,7 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -218,18 +217,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} - -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof ind
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/75092 >From fe05f64f7ae0decc17d98bb694bbade178df7a4b Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 59 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02..360f831e84294 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de..69b87adf105fd 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 8f0b12d0cfedf..6c6bb18bcb3c9 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { + LLVMContext &Ctx = F->getContext(); + Module *M = F->getParent(); + DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof index ba4c6117dc96a..d3847946b9403 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 - 4: 13 - 5: 7 _Z3barv:2 _Z3foov:5 - 6: 6 _Z3barv:4 _Z3foov:2 + 4: 6 + 6: 13 + 3: 7 _Z3barv:2 _Z3foov:5 + 5: 6 _Z3barv:4 _Z3foov:2 !CFGChecksum: 563022570642068 diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof index 62f9bd5992e73..213bf0b6f81cc 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 -
[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
wlei-llvm wrote: > We will need some protection against accidentally consuming old profile with > new toolchain and vice versa. The cost of investigating mysterious perf > regression can be high and we'd rather simply error out in those cases. Maybe > consider some flag for `SecProfSummaryFlags` > Agreed with this concern. To do this, we probably also need a flag in the binary, because otherwise if we use the new toolchain for prof-gen but the binary built on the old toolchain, we then would generate a profile with this flag on but the order is the old one. My understanding is the (probe) version of profile should line up with the version of the binary. Maybe make this more general, we can introduce a "pseudo_probe_version" thing, similar to instrPGO https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProf.h#L996 . It can be useful if we want to update other pseudo probe formats in future. > (also wondering if this needs to be per-function flag since technically some > function/module can be compiled with old order while others with new order). This seems very complicated, supposing you meant it's mixed compiled case(some objects files are built on old toolchain and some are on new toolchain), then in profile generation time, it's hard to know which order the function uses. That requires some function level version thing.. maybe add version to `pseudo_probe_desc` which is function-level and encode the version. > Can we gauge perf benefit of mixed order encoding? i.e. profile an old source > rev (several weeks ago?) with both probe order, and build with new source rev > using that old profile, then see how old vs new order compare in terms of > perf and efficacy with incremental PGO. Yes, I saw some perf wins using the new order, it's not as easily measurable as the call anchor, as you said, I had to use a old rev to run profile collection using the new order. I will show you the results offline. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/75092 >From ccfee2d0c5399b03b53ef79a4645ab0d10efeafd Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH 1/2] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 59 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02b..360f831e842945 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de2..69b87adf105fd1 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 8f0b12d0cfedfc..6c6bb18bcb3c9d 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { + LLVMContext &Ctx = F->getContext(); + Module *M = F->getParent(); + DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof index ba4c6117dc96ab..d3847946b94033 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 - 4: 13 - 5: 7 _Z3barv:2 _Z3foov:5 - 6: 6 _Z3barv:4 _Z3foov:2 + 4: 6 + 6: 13 + 3: 7 _Z3barv:2 _Z3foov:5 + 5: 6 _Z3barv:4 _Z3foov:2 !CFGChecksum: 563022570642068 diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof index 62f9bd5992e735..213bf0b6f81cc4 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
wlei-llvm wrote: > > Agreed with this concern. To do this, we probably also need a flag in the > > binary, because otherwise if we use the new toolchain for prof-gen but the > > binary built on the old toolchain, we then would generate a profile with > > this flag on but the order is the old one. My understanding is the (probe) > > version of profile should line up with the version of the binary. Maybe > > make this more general, we can introduce a "pseudo_probe_version" thing, > > similar to instrPGO > > https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProf.h#L996 > > . It can be useful if we want to update other pseudo probe formats in > > future. > > Versioning is one way to do this. Alternatively, we should be able to detect > the case during llvm-profgen time when all call site probe and block probe > ids are separated, which can then be used to set flags? Sounds good. Add a new function(`checkProbeIDIsInMixedOrder`) to check the old order in profile generation time. Always write a SecProfSummaryFlags flag, if the function return true. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)
wlei-llvm wrote: Hi: We hit a crash/assertion, and bisected to this. Here is the stack dump: ``` clang++: /home/wlei/local/upstream/llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From *) [To = clang::CXXRecordDecl, From = clang::DeclContext]: Assertion `isa(Val) && "cast() argument of incompatible type!"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /home/wlei/local/upstream/llvm-build/bin/clang++ @/tmp/fbcc.xxbvookk/compiler.argsfile 1. fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:633:55: current parser token ')' 2. fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:26:1: parsing namespace 'facebook' 3. fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:622:29: parsing function body 'facebook::multifeed::ranking::GFIHCountsMap::getTotalAggregationIfConfigured' 4. fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:622:29: in compound statement ('{}') #0 0x7f25c2bf9588 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13 #1 0x7f25c2bf7630 llvm::sys::RunSignalHandlers() /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 0x7f25c2bf8c41 llvm::sys::CleanupOnSignal(unsigned long) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3 #3 0x7f25c2b3f908 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5 #4 0x7f25c2b3f908 CrashRecoverySignalHandler(int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51 #5 0x7f25c7612d20 __restore_rt (/lib64/libpthread.so.0+0x12d20) #6 0x7f25c1a4e52f raise (/lib64/libc.so.6+0x4e52f) #7 0x7f25c1a21e65 abort (/lib64/libc.so.6+0x21e65) #8 0x7f25c1a21d39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39) #9 0x7f25c1a46e86 (/lib64/libc.so.6+0x46e86) #10 0x7f25bf3a934c FindDeclaringClass(clang::NamedDecl*) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:0:0 #11 0x7f25bf3a9265 (anonymous namespace)::AccessTarget::initialize() /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:267:24 #12 0x7f25bf3a686c clang::sema::AccessedEntity::isQuiet() const /home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:75:50 #13 0x7f25bf3a686c clang::sema::AccessedEntity::setDiag(unsigned int) /home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:104:5 #14 0x7f25bf3a686c clang::Sema::CheckUnresolvedLookupAccess(clang::UnresolvedLookupExpr*, clang::DeclAccessPair) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:1567:10 #15 0x7f25bfa98e83 clang::DeclarationNameInfo::getLoc() const /home/wlei/local/upstream/llvm-project/clang/include/clang/AST/DeclarationName.h:797:42 #16 0x7f25bfa98e83 clang::OverloadExpr::getNameLoc() const /home/wlei/local/upstream/llvm-project/clang/include/clang/AST/ExprCXX.h:3078:55 #17 0x7f25bfa98e83 FinishOverloadedCallExpr(clang::Sema&, clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, clang::OverloadCandidateSet*, clang::OverloadCandidate**, clang::OverloadingResult, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14062:47 #18 0x7f25bfa98c8f clang::Sema::BuildOverloadedCallExpr(clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14200:10 #19 0x7f25bf70511d clang::Sema::BuildCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7276:16 #20 0x7f25bf720e6b clang::Sema::ActOnCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7167:7 #21 0x7f25c2d59dca clang::Parser::ParsePostfixExpressionSuffix(clang::ActionResult) /home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:2181:23 #22 0x7f25c2d5af8a clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, bool&, clang::Parser::TypeCastState, bool, bool*) /home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:1890:7 #23 0x7f25c2d56e2b clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, clang::Parser::TypeCastState, bool, bool*) /home/wlei/l
[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)
wlei-llvm wrote: Here is the repro from our side. It's reduced by creduce, there are some unrelated error, the assertion is the real issue. ``` typedef a; template < typename b, b c > struct d { static constexpr b e = c; }; typedef d< bool, true > f; typedef d< bool, false > g; template < bool, typename, typename > struct aa; template < typename... > struct h; template < typename i, typename j, typename ab, typename... k > struct h< i, j, ab, k... > : aa< i::e, i, h<> >::l {}; template < typename > struct ad : g {}; template < typename b > struct ae : ad< b >{}; template < typename ai, typename aj, bool = h< ae< ai >, aj , aj >::e > class ak { template < typename > static f al(int); public: typedef decltype(al< aj >(0)) l; }; template < typename ai, typename aj > struct am : ak< ai, aj >::l {}; template < bool , typename az, typename > struct aa { typedef az l; }template < typename ai, typename aj > constexpr bool bj = am< ai, aj >::e; template < typename bx, typename by > concept bz = bj< bx , by >; template < typename... > class cy template < typename m > concept n = bz< m , m > template < typename b > concept o = n< b > template < a , typename > struct tuple_element; template < typename p, typename r > class cy< p, r > {}; template < a q, typename... t > __tuple_element_t< q, cy<> > get( cy< t... > ) ; class s ; template < class w > class u { w begin() ; }; template < typename x > class v { using y = x ; using dp = decltype(static_cast< y (*)() >( nullptr)()[{}]) ; dp operator*() } template < typename x > struct dr { using ds = x::dt; using dp = u< ds >; }template < o x > class du { using dr< x > ::dp; } enum class dy class dz template < typename eb > struct ec : eb ; template < typename... ed > struct ee :ec< cy< ed... > > {}; template < o... ej > class ek { public: using el = a; using dw = ee< ej ... >; using dt = v< ek > ; dw operator[](el ) ; }; class { using eo = s ; using ep = ek< eo, dz >; using eq = du< ep >::dp; eq get(dy cat) { auto es = get(cat) auto et = es.begin(); get< 0 >(*et) ``` https://github.com/llvm/llvm-project/pull/83842 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)
wlei-llvm wrote: ``` clang++ -std=gnu++20 test.cpp clang-18: /../llvm-project/llvm/include/llvm/Support/Casting.h:578: decltype(auto) llvm::cast(From *) [To = clang::CXXRecordDecl, From = clang::DeclContext]: Assertion `isa(Val) && "cast() argument of incompatible type!"' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. 1. test.bk.10.cpp:90:15: current parser token ')' 2. test.bk.10.cpp:81:1: parsing struct/union/class body '(unnamed class at test.bk.10.cpp:81:1)' 3. test.bk.10.cpp:87:18: parsing function body '(anonymous class)::get' 4. test.bk.10.cpp:87:18: in compound statement ('{}') #0 0x7fb54bdf9588 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13 #1 0x7fb54bdf7630 llvm::sys::RunSignalHandlers() /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 0x7fb54bdf9c5d SignalHandler(int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1 #3 0x7fb550812d20 __restore_rt (/lib64/libpthread.so.0+0x12d20) #4 0x7fb54ac4e52f raise (/lib64/libc.so.6+0x4e52f) #5 0x7fb54ac21e65 abort (/lib64/libc.so.6+0x21e65) #6 0x7fb54ac21d39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39) #7 0x7fb54ac46e86 (/lib64/libc.so.6+0x46e86) #8 0x7fb5485a934c FindDeclaringClass(clang::NamedDecl*) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:0:0 #9 0x7fb5485a9265 (anonymous namespace)::AccessTarget::initialize() /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:267:24 #10 0x7fb5485a686c clang::sema::AccessedEntity::isQuiet() const /home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:75:50 #11 0x7fb5485a686c clang::sema::AccessedEntity::setDiag(unsigned int) /home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:104:5 #12 0x7fb5485a686c clang::Sema::CheckUnresolvedLookupAccess(clang::UnresolvedLookupExpr*, clang::DeclAccessPair) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:1567:10 #13 0x7fb548c98e83 clang::DeclarationNameInfo::getLoc() const /home/wlei/local/upstream/llvm-project/clang/include/clang/AST/DeclarationName.h:797:42 #14 0x7fb548c98e83 clang::OverloadExpr::getNameLoc() const /home/wlei/local/upstream/llvm-project/clang/include/clang/AST/ExprCXX.h:3078:55 #15 0x7fb548c98e83 FinishOverloadedCallExpr(clang::Sema&, clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, clang::OverloadCandidateSet*, clang::OverloadCandidate**, clang::OverloadingResult, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14062:47 #16 0x7fb548c98c8f clang::Sema::BuildOverloadedCallExpr(clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14200:10 #17 0x7fb54890511d clang::Sema::BuildCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, bool) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7276:16 #18 0x7fb548920e6b clang::Sema::ActOnCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7167:7 #19 0x7fb54bf59dca clang::Parser::ParsePostfixExpressionSuffix(clang::ActionResult) /home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:2181:23 #20 0x7fb54bf5af8a clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, bool&, clang::Parser::TypeCastState, bool, bool*) /home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:1890:7 #21 0x7fb54bf56e2b clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, clang::Parser::TypeCastState, bool, bool*) /home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:693:20 #22 0x7fb54bf56e2b clang::Parser: ``` https://github.com/llvm/llvm-project/pull/83842 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/75092 >From ccfee2d0c5399b03b53ef79a4645ab0d10efeafd Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH 1/3] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 59 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02b..360f831e842945 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de2..69b87adf105fd1 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 8f0b12d0cfedfc..6c6bb18bcb3c9d 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { + LLVMContext &Ctx = F->getContext(); + Module *M = F->getParent(); + DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof index ba4c6117dc96ab..d3847946b94033 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 - 4: 13 - 5: 7 _Z3barv:2 _Z3foov:5 - 6: 6 _Z3barv:4 _Z3foov:2 + 4: 6 + 6: 13 + 3: 7 _Z3barv:2 _Z3foov:5 + 5: 6 _Z3barv:4 _Z3foov:2 !CFGChecksum: 563022570642068 diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof index 62f9bd5992e735..213bf0b6f81cc4 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
wlei-llvm wrote: As discussed offline, reverted the complicated order detection change. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/75092 >From e6f735a336f87659ee3236fc795ad4b7107dff4d Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH 1/4] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 59 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02b..360f831e842945 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de2..69b87adf105fd1 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 090e5560483edb..975c1bc2347aa4 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -207,7 +206,10 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { + LLVMContext &Ctx = F->getContext(); + Module *M = F->getParent(); + DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -216,18 +218,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof index ba4c6117dc96ab..d3847946b94033 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 - 4: 13 - 5: 7 _Z3barv:2 _Z3foov:5 - 6: 6 _Z3barv:4 _Z3foov:2 + 4: 6 + 6: 13 + 3: 7 _Z3barv:2 _Z3foov:5 + 5: 6 _Z3barv:4 _Z3foov:2 !CFGChecksum: 563022570642068 diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof index 62f9bd5992e735..213bf0b6f81cc4 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
wlei-llvm wrote: > LGTM. I'd also have a change that errors out on huge staleness go in together > with this one.. Was thinking to use a separate diff for that, feels non-trivial to do it, but I'm also good for doing in this. Just pushed one version for this, it use heuristics to check this, note that the current threshold flags is just based my "gut feeling", so I'm going to tune those flags on our internal services, hope that we can achieve the goal: 1) Use the current prod profile, so all build should pass without any false positives break. 2) Use the incompatible profile, so all build should error out. Any early feedback is appreciated. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
wlei-llvm wrote: > > > LGTM. I'd also have a change that errors out on huge staleness go in > > > together with this one.. > > > > > > Was thinking to use a separate diff for that, feels non-trivial to do it, > > but I'm also good for doing in this. > > Just pushed one version for this, it use heuristics to check this, note > > that the current threshold flags is just based my "gut feeling", so I'm > > going to tune those flags on our internal services, hope that we can > > achieve the goal: > > > > 1. Use the current prod profile, so all build should pass without any false > > positives break. > > 2. Use the incompatible profile, so all build should error out. > > > > Any early feedback is appreciated. > > Sorry I wasn't clear. The error out change should be a separate patch as > logically it's independent of the original mix order change. By go in > together, I was mostly thinking about two patches that go in together.. Oh, that's my initial thought too, Ok, I will create a new PR, and hold this until we can go in together. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/75092 >From e6f735a336f87659ee3236fc795ad4b7107dff4d Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 10 Dec 2023 18:30:42 -0800 Subject: [PATCH 1/5] [PseudoProbe] Mix and reorder block and call probe ID in lexical order --- clang/test/CodeGen/pseudo-probe-emit.c| 8 +++ .../llvm/Transforms/IPO/SampleProfileProbe.h | 3 +-- .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +--- .../Inputs/pseudo-probe-profile.prof | 8 +++ .../Inputs/pseudo-probe-update.prof | 8 +++ .../SampleProfile/pseudo-probe-dangle.ll | 12 +- .../pseudo-probe-discriminator.ll | 6 ++--- .../pseudo-probe-profile-metadata-2.ll| 15 ++--- .../SampleProfile/pseudo-probe-profile.ll | 22 +-- .../SampleProfile/pseudo-probe-update.ll | 11 +- .../SampleProfile/pseudo-probe-verify.ll | 16 +++--- 11 files changed, 59 insertions(+), 69 deletions(-) diff --git a/clang/test/CodeGen/pseudo-probe-emit.c b/clang/test/CodeGen/pseudo-probe-emit.c index c7a3f7e6d5b02b..360f831e842945 100644 --- a/clang/test/CodeGen/pseudo-probe-emit.c +++ b/clang/test/CodeGen/pseudo-probe-emit.c @@ -10,9 +10,9 @@ void foo(int x) { // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1) if (x == 0) // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1) -bar(); +bar(); // probe id : 3 else -// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1) -go(); - // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1) +go(); // probe id : 5 + // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1) } diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h index 0f2729a9462de2..69b87adf105fd1 100644 --- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h +++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h @@ -82,8 +82,7 @@ class SampleProfileProber { uint32_t getBlockId(const BasicBlock *BB) const; uint32_t getCallsiteId(const Instruction *Call) const; void computeCFGHash(); - void computeProbeIdForBlocks(); - void computeProbeIdForCallsites(); + void computeProbeId(); Function *F; diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp index 090e5560483edb..975c1bc2347aa4 100644 --- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp @@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func, BlockProbeIds.clear(); CallProbeIds.clear(); LastProbeId = (uint32_t)PseudoProbeReservedId::Last; - computeProbeIdForBlocks(); - computeProbeIdForCallsites(); + computeProbeId(); computeCFGHash(); } @@ -207,7 +206,10 @@ void SampleProfileProber::computeCFGHash() { << ", Hash = " << FunctionHash << "\n"); } -void SampleProfileProber::computeProbeIdForBlocks() { +void SampleProfileProber::computeProbeId() { + LLVMContext &Ctx = F->getContext(); + Module *M = F->getParent(); + DenseSet KnownColdBlocks; computeEHOnlyBlocks(*F, KnownColdBlocks); // Insert pseudo probe to non-cold blocks only. This will reduce IR size as @@ -216,18 +218,9 @@ void SampleProfileProber::computeProbeIdForBlocks() { ++LastProbeId; if (!KnownColdBlocks.contains(&BB)) BlockProbeIds[&BB] = LastProbeId; - } -} -void SampleProfileProber::computeProbeIdForCallsites() { - LLVMContext &Ctx = F->getContext(); - Module *M = F->getParent(); - - for (auto &BB : *F) { for (auto &I : BB) { - if (!isa(I)) -continue; - if (isa(&I)) + if (!isa(I) || isa(&I)) continue; // The current implementation uses the lower 16 bits of the discriminator diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof index ba4c6117dc96ab..d3847946b94033 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13 2: 7 - 3: 6 - 4: 13 - 5: 7 _Z3barv:2 _Z3foov:5 - 6: 6 _Z3barv:4 _Z3foov:2 + 4: 6 + 6: 13 + 3: 7 _Z3barv:2 _Z3foov:5 + 5: 6 _Z3barv:4 _Z3foov:2 !CFGChecksum: 563022570642068 diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof index 62f9bd5992e735..213bf0b6f81cc4 100644 --- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof +++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof @@ -1,8 +1,8 @@ foo:3200:13 1: 13
[flang] [libcxx] [libcxxabi] [libc] [llvm] [clang] [compiler-rt] [clang-tools-extra] [lld] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/79090 >From 54784e26f33178efd21b0289a1f673d66ea26cc3 Mon Sep 17 00:00:00 2001 From: wlei Date: Mon, 22 Jan 2024 19:16:26 -0800 Subject: [PATCH] [CSSPGO] Support post-match profile staleness metrics --- llvm/lib/Transforms/IPO/SampleProfile.cpp | 440 +++--- .../Inputs/profile-mismatch.prof | 7 +- .../SampleProfile/profile-mismatch.ll | 12 +- .../pseudo-probe-profile-mismatch-thinlto.ll | 6 +- .../pseudo-probe-profile-mismatch.ll | 76 +-- 5 files changed, 324 insertions(+), 217 deletions(-) diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp index 2fd8668d15e200f..a7170faa65dc07c 100644 --- a/llvm/lib/Transforms/IPO/SampleProfile.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp @@ -433,12 +433,19 @@ using CandidateQueue = PriorityQueue, CandidateComparer>; +using IRAnchorMap = std::map; +using ProfileAnchorMap = std::map>; + // Sample profile matching - fuzzy match. class SampleProfileMatcher { Module &M; SampleProfileReader &Reader; const PseudoProbeManager *ProbeManager; SampleProfileMap FlattenedProfiles; + + std::unordered_map FuncIRAnchors; + std::unordered_map FuncProfileAnchors; + // For each function, the matcher generates a map, of which each entry is a // mapping from the source location of current build to the source location in // the profile. @@ -448,6 +455,8 @@ class SampleProfileMatcher { uint64_t TotalProfiledCallsites = 0; uint64_t NumMismatchedCallsites = 0; uint64_t MismatchedCallsiteSamples = 0; + uint64_t PostMatchNumMismatchedCallsites = 0; + uint64_t PostMatchMismatchedCallsiteSamples = 0; uint64_t TotalCallsiteSamples = 0; uint64_t TotalProfiledFunc = 0; uint64_t NumMismatchedFuncHash = 0; @@ -474,24 +483,22 @@ class SampleProfileMatcher { return nullptr; } void runOnFunction(const Function &F); - void findIRAnchors(const Function &F, - std::map &IRAnchors); - void findProfileAnchors( + void findFuncAnchors(); + void UpdateIRAnchors(); + void findIRAnchors(const Function &F, IRAnchorMap &IRAnchors); + void findProfileAnchors(const FunctionSamples &FS, + ProfileAnchorMap &ProfileAnchors); + void countMismatchedHashSamples(const FunctionSamples &FS); + void countProfileMismatches(bool IsPreMatch); + void countMismatchedHashes(const Function &F, const FunctionSamples &FS); + void countMismatchedCallsites( + const Function &F, + StringMap> &FuncToMismatchCallsites, + uint64_t &FuncProfiledCallsites, uint64_t &FuncMismatchedCallsites) const; + void countMismatchedCallsiteSamples( const FunctionSamples &FS, - std::map> - &ProfileAnchors); - void countMismatchedSamples(const FunctionSamples &FS); - void countProfileMismatches( - const Function &F, const FunctionSamples &FS, - const std::map &IRAnchors, - const std::map> - &ProfileAnchors); - void countProfileCallsiteMismatches( - const FunctionSamples &FS, - const std::map &IRAnchors, - const std::map> - &ProfileAnchors, - uint64_t &FuncMismatchedCallsites, uint64_t &FuncProfiledCallsites); + StringMap> &FuncToMismatchCallsites, + uint64_t &FuncMismatchedCallsiteSamples) const; LocToLocMap &getIRToProfileLocationMap(const Function &F) { auto Ret = FuncMappings.try_emplace( FunctionSamples::getCanonicalFnName(F.getName()), LocToLocMap()); @@ -499,11 +506,10 @@ class SampleProfileMatcher { } void distributeIRToProfileLocationMap(); void distributeIRToProfileLocationMap(FunctionSamples &FS); - void runStaleProfileMatching( - const Function &F, const std::map &IRAnchors, - const std::map> - &ProfileAnchors, - LocToLocMap &IRToProfileLocationMap); + void runStaleProfileMatching(); + void runStaleProfileMatching(const Function &F, const IRAnchorMap &IRAnchors, + const ProfileAnchorMap &ProfileAnchors, + LocToLocMap &IRToProfileLocationMap); }; /// Sample profile pass. @@ -1129,7 +1135,7 @@ void SampleProfileLoader::findExternalInlineCandidate( CalleeSample->getContext().hasAttribute(ContextShouldBeInlined); if (!PreInline && CalleeSample->getHeadSamplesEstimate() < Threshold) continue; - + Function *Func = SymbolMap.lookup(CalleeSample->getFunction()); // Add to the import list only when it's defined out of module. if (!Func || Func->isDeclaration()) @@ -2123,8 +2129,8 @@ bool SampleProfileLoader::doInitialization(Module &M, return true; } -void SampleProfileMatcher::findIRAnchors( -const Function &F, std::map &IRAnchors) { +void SampleProfileMatcher::findIRAnchors(const Function &F, + IRAnchorMap &IRAnchors) { /
[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching( } } -void SampleProfileMatcher::runOnFunction(const Function &F) { - // We need to use flattened function samples for matching. - // Unlike IR, which includes all callsites from the source code, the callsites - // in profile only show up when they are hit by samples, i,e. the profile - // callsites in one context may differ from those in another context. To get - // the maximum number of callsites, we merge the function profiles from all - // contexts, aka, the flattened profile to find profile anchors. - const auto *FSFlattened = getFlattenedSamplesFor(F); - if (!FSFlattened) -return; +void SampleProfileMatcher::runStaleProfileMatching() { + for (const auto &F : M) { +if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) + continue; +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; +auto IR = FuncIRAnchors.find(&F); +auto P = FuncProfileAnchors.find(&F); +if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end()) + continue; - // Anchors for IR. It's a map from IR location to callee name, callee name is - // empty for non-call instruction and use a dummy name(UnknownIndirectCallee) - // for unknown indrect callee name. - std::map IRAnchors; - findIRAnchors(F, IRAnchors); - // Anchors for profile. It's a map from callsite location to a set of callee - // name. - std::map> ProfileAnchors; - findProfileAnchors(*FSFlattened, ProfileAnchors); - - // Detect profile mismatch for profile staleness metrics report. - // Skip reporting the metrics for imported functions. - if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) && - (ReportProfileStaleness || PersistProfileStaleness)) { -// Use top-level nested FS for counting profile mismatch metrics since -// currently once a callsite is mismatched, all its children profiles are -// dropped. -if (const auto *FS = Reader.getSamplesFor(F)) - countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors); +// Run profile matching for checksum mismatched profile, currently only +// support for pseudo-probe. +if (FunctionSamples::ProfileIsProbeBased && +!ProbeManager->profileIsValid(F, *FSFlattened)) { + runStaleProfileMatching(F, IR->second, P->second, + getIRToProfileLocationMap(F)); +} } - // Run profile matching for checksum mismatched profile, currently only - // support for pseudo-probe. - if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased && - !ProbeManager->profileIsValid(F, *FSFlattened)) { -// The matching result will be saved to IRToProfileLocationMap, create a new -// map for each function. -runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); - } + distributeIRToProfileLocationMap(); } -void SampleProfileMatcher::runOnModule() { +void SampleProfileMatcher::findFuncAnchors() { ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, FunctionSamples::ProfileIsCS); - for (auto &F : M) { + for (const auto &F : M) { if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) continue; -runOnFunction(F); +// We need to use flattened function samples for matching. +// Unlike IR, which includes all callsites from the source code, the +// callsites in profile only show up when they are hit by samples, i,e. the +// profile callsites in one context may differ from those in another +// context. To get the maximum number of callsites, we merge the function +// profiles from all contexts, aka, the flattened profile to find profile +// anchors. +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; + +// Anchors for IR. It's a map from IR location to callee name, callee name +// is empty for non-call instruction and use a dummy +// name(UnknownIndirectCallee) for unknown indrect callee name. +auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap()); +findIRAnchors(F, IR.first->second); + +// Anchors for profile. It's a map from callsite location to a set of callee +// name. +auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap()); +findProfileAnchors(*FSFlattened, P.first->second); + } +} + +void SampleProfileMatcher::countMismatchedCallsiteSamples( +const FunctionSamples &FS, +StringMap> &FuncToMismatchCallsites, +uint64_t &FuncMismatchedCallsiteSamples) const { + auto It = FuncToMismatchCallsites.find(FS.getFuncName()); + // Skip it if no mismatched callsite or this is an external function. + if (It == FuncToMismatchCallsites.end() || It->second.empty()) +return; + const auto &MismatchCallsites = It->second; + for (const auto &I : FS.getBodySamples()) { +if
[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching( } } -void SampleProfileMatcher::runOnFunction(const Function &F) { - // We need to use flattened function samples for matching. - // Unlike IR, which includes all callsites from the source code, the callsites - // in profile only show up when they are hit by samples, i,e. the profile - // callsites in one context may differ from those in another context. To get - // the maximum number of callsites, we merge the function profiles from all - // contexts, aka, the flattened profile to find profile anchors. - const auto *FSFlattened = getFlattenedSamplesFor(F); - if (!FSFlattened) -return; +void SampleProfileMatcher::runStaleProfileMatching() { + for (const auto &F : M) { +if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) + continue; +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; +auto IR = FuncIRAnchors.find(&F); +auto P = FuncProfileAnchors.find(&F); +if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end()) + continue; - // Anchors for IR. It's a map from IR location to callee name, callee name is - // empty for non-call instruction and use a dummy name(UnknownIndirectCallee) - // for unknown indrect callee name. - std::map IRAnchors; - findIRAnchors(F, IRAnchors); - // Anchors for profile. It's a map from callsite location to a set of callee - // name. - std::map> ProfileAnchors; - findProfileAnchors(*FSFlattened, ProfileAnchors); - - // Detect profile mismatch for profile staleness metrics report. - // Skip reporting the metrics for imported functions. - if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) && - (ReportProfileStaleness || PersistProfileStaleness)) { -// Use top-level nested FS for counting profile mismatch metrics since -// currently once a callsite is mismatched, all its children profiles are -// dropped. -if (const auto *FS = Reader.getSamplesFor(F)) - countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors); +// Run profile matching for checksum mismatched profile, currently only +// support for pseudo-probe. +if (FunctionSamples::ProfileIsProbeBased && +!ProbeManager->profileIsValid(F, *FSFlattened)) { + runStaleProfileMatching(F, IR->second, P->second, + getIRToProfileLocationMap(F)); +} } - // Run profile matching for checksum mismatched profile, currently only - // support for pseudo-probe. - if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased && - !ProbeManager->profileIsValid(F, *FSFlattened)) { -// The matching result will be saved to IRToProfileLocationMap, create a new -// map for each function. -runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); - } + distributeIRToProfileLocationMap(); } -void SampleProfileMatcher::runOnModule() { +void SampleProfileMatcher::findFuncAnchors() { ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, FunctionSamples::ProfileIsCS); - for (auto &F : M) { + for (const auto &F : M) { if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) continue; -runOnFunction(F); +// We need to use flattened function samples for matching. +// Unlike IR, which includes all callsites from the source code, the +// callsites in profile only show up when they are hit by samples, i,e. the +// profile callsites in one context may differ from those in another +// context. To get the maximum number of callsites, we merge the function +// profiles from all contexts, aka, the flattened profile to find profile +// anchors. +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; + +// Anchors for IR. It's a map from IR location to callee name, callee name +// is empty for non-call instruction and use a dummy +// name(UnknownIndirectCallee) for unknown indrect callee name. +auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap()); +findIRAnchors(F, IR.first->second); + +// Anchors for profile. It's a map from callsite location to a set of callee +// name. +auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap()); +findProfileAnchors(*FSFlattened, P.first->second); + } +} + +void SampleProfileMatcher::countMismatchedCallsiteSamples( +const FunctionSamples &FS, +StringMap> &FuncToMismatchCallsites, +uint64_t &FuncMismatchedCallsiteSamples) const { + auto It = FuncToMismatchCallsites.find(FS.getFuncName()); + // Skip it if no mismatched callsite or this is an external function. + if (It == FuncToMismatchCallsites.end() || It->second.empty()) +return; + const auto &MismatchCallsites = It->second; + for (const auto &I : FS.getBodySamples()) { +if
[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching( } } -void SampleProfileMatcher::runOnFunction(const Function &F) { - // We need to use flattened function samples for matching. - // Unlike IR, which includes all callsites from the source code, the callsites - // in profile only show up when they are hit by samples, i,e. the profile - // callsites in one context may differ from those in another context. To get - // the maximum number of callsites, we merge the function profiles from all - // contexts, aka, the flattened profile to find profile anchors. - const auto *FSFlattened = getFlattenedSamplesFor(F); - if (!FSFlattened) -return; +void SampleProfileMatcher::runStaleProfileMatching() { + for (const auto &F : M) { +if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) + continue; +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; +auto IR = FuncIRAnchors.find(&F); +auto P = FuncProfileAnchors.find(&F); +if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end()) + continue; - // Anchors for IR. It's a map from IR location to callee name, callee name is - // empty for non-call instruction and use a dummy name(UnknownIndirectCallee) - // for unknown indrect callee name. - std::map IRAnchors; - findIRAnchors(F, IRAnchors); - // Anchors for profile. It's a map from callsite location to a set of callee - // name. - std::map> ProfileAnchors; - findProfileAnchors(*FSFlattened, ProfileAnchors); - - // Detect profile mismatch for profile staleness metrics report. - // Skip reporting the metrics for imported functions. - if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) && - (ReportProfileStaleness || PersistProfileStaleness)) { -// Use top-level nested FS for counting profile mismatch metrics since -// currently once a callsite is mismatched, all its children profiles are -// dropped. -if (const auto *FS = Reader.getSamplesFor(F)) - countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors); +// Run profile matching for checksum mismatched profile, currently only +// support for pseudo-probe. +if (FunctionSamples::ProfileIsProbeBased && +!ProbeManager->profileIsValid(F, *FSFlattened)) { + runStaleProfileMatching(F, IR->second, P->second, + getIRToProfileLocationMap(F)); +} } - // Run profile matching for checksum mismatched profile, currently only - // support for pseudo-probe. - if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased && - !ProbeManager->profileIsValid(F, *FSFlattened)) { -// The matching result will be saved to IRToProfileLocationMap, create a new -// map for each function. -runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); - } + distributeIRToProfileLocationMap(); } -void SampleProfileMatcher::runOnModule() { +void SampleProfileMatcher::findFuncAnchors() { ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, FunctionSamples::ProfileIsCS); - for (auto &F : M) { + for (const auto &F : M) { if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) continue; -runOnFunction(F); +// We need to use flattened function samples for matching. +// Unlike IR, which includes all callsites from the source code, the +// callsites in profile only show up when they are hit by samples, i,e. the +// profile callsites in one context may differ from those in another +// context. To get the maximum number of callsites, we merge the function +// profiles from all contexts, aka, the flattened profile to find profile +// anchors. +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; + +// Anchors for IR. It's a map from IR location to callee name, callee name +// is empty for non-call instruction and use a dummy +// name(UnknownIndirectCallee) for unknown indrect callee name. +auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap()); +findIRAnchors(F, IR.first->second); + +// Anchors for profile. It's a map from callsite location to a set of callee +// name. +auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap()); +findProfileAnchors(*FSFlattened, P.first->second); + } +} + +void SampleProfileMatcher::countMismatchedCallsiteSamples( +const FunctionSamples &FS, +StringMap> &FuncToMismatchCallsites, +uint64_t &FuncMismatchedCallsiteSamples) const { + auto It = FuncToMismatchCallsites.find(FS.getFuncName()); + // Skip it if no mismatched callsite or this is an external function. + if (It == FuncToMismatchCallsites.end() || It->second.empty()) +return; + const auto &MismatchCallsites = It->second; + for (const auto &I : FS.getBodySamples()) { +if
[lld] [clang] [clang-tools-extra] [compiler-rt] [libcxxabi] [flang] [llvm] [libc] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
wlei-llvm wrote: Thanks so much for the feedback. Answering some high-level questions, will address others later. https://github.com/llvm/llvm-project/pull/79090 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libcxx] [compiler-rt] [lld] [libc] [llvm] [flang] [clang-tools-extra] [clang] [libcxxabi] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/79090 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
https://github.com/wlei-llvm commented: Thanks for the feedback, it's indeed complicated (even than I expected..) https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId FuncName) { return ProfileMap.Create(Context); } +// Use a heuristic to determine probe order by checking callsite insertion +// position relative to the BB probes. Sort all the BB probes is in a list, for +// each calliste, compute "ratio = insert position / length of the list". For +// the old order, the probe ids for BB should be all before(smaller than) the +// probe ids for callsite, this ratio should be equal to or close to 1. +bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) { + // Use flattned profile to maximize the callsites in the profile. + SampleProfileMap flattenProfile; + ProfileConverter::flattenProfile(Profiles, flattenProfile); + + uint32_t PossibleOldOrderNum = 0; + uint32_t PossibleNewOrderNum = 0; + + for (const auto &I : flattenProfile) { +const FunctionSamples &FS = I.second; +// Skip small functions whose profile order are likely random. +if (FS.getBodySamples().size() < 10) + continue; + +std::set PossibleBBProbeIDs; +std::set CallsiteIDs; +for (const auto &I : FS.getBodySamples()) { + if (I.second.getCallTargets().empty()) +PossibleBBProbeIDs.insert(I.first.LineOffset); + else +CallsiteIDs.insert(I.first.LineOffset); +} + +if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty()) + continue; + +uint32_t DistanceToBeginSum = 0; +for (const auto &ID : CallsiteIDs) + DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(), + PossibleBBProbeIDs.upper_bound(ID)); +uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size(); + +// Note that PossibleBBProbeIDs could contains some callsite probe id, the wlei-llvm wrote: > I'm curious what lead to empty call targets for a call location covered by > LBR.. yeah, I need to investigate where the empty call targets comes from.(I'm sure I saw a lot of empty callsite names, the count are all zero) https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId FuncName) { return ProfileMap.Create(Context); } +// Use a heuristic to determine probe order by checking callsite insertion +// position relative to the BB probes. Sort all the BB probes is in a list, for +// each calliste, compute "ratio = insert position / length of the list". For +// the old order, the probe ids for BB should be all before(smaller than) the +// probe ids for callsite, this ratio should be equal to or close to 1. +bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) { + // Use flattned profile to maximize the callsites in the profile. + SampleProfileMap flattenProfile; + ProfileConverter::flattenProfile(Profiles, flattenProfile); + + uint32_t PossibleOldOrderNum = 0; + uint32_t PossibleNewOrderNum = 0; + + for (const auto &I : flattenProfile) { +const FunctionSamples &FS = I.second; +// Skip small functions whose profile order are likely random. +if (FS.getBodySamples().size() < 10) + continue; + +std::set PossibleBBProbeIDs; +std::set CallsiteIDs; +for (const auto &I : FS.getBodySamples()) { + if (I.second.getCallTargets().empty()) +PossibleBBProbeIDs.insert(I.first.LineOffset); + else +CallsiteIDs.insert(I.first.LineOffset); +} + +if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty()) + continue; + +uint32_t DistanceToBeginSum = 0; +for (const auto &ID : CallsiteIDs) + DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(), + PossibleBBProbeIDs.upper_bound(ID)); +uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size(); + +// Note that PossibleBBProbeIDs could contains some callsite probe id, the +// ratio is not exactly 1 for the old order, hence use a smaller threshold +// to determine. +if ((float)DistanceToBeginSum / LengthSum > 0.8) + PossibleOldOrderNum++; +else + PossibleNewOrderNum++; + } + return PossibleNewOrderNum >= PossibleOldOrderNum; wlei-llvm wrote: Sorry, it was also far complicated than I expected, try to explain more about this, we can discuss offline. The fact is that, in theory there can be cases that all the calls are after BB probes for the new order ID(like all hot profiled calls are in the last BB), so even we can decide the probe is call or not, we still can't just use one function or one probe as a line to decide the order, that's why here I use this complex statistical way.(It appears it works well) > If we can't decide whether a probe is originally a call probe or a block > probe, that makes the checking less reliable So far sadly no from profile. We can query the type of probe from the original profiled binary, like parsing the binary asm, but that seems a lot of infra work.(Also as mentioned above, even the case all calls are after all BB also can't decide the order) > In reality, do we see cases where there are many PossibleNewOrderNum and > PossibleOldOrderNum for one profile? Yep, both are actually a lot in a large production-used profile, that's why I use a 0.8 not a 0.9 things, I spent some time tuning this threshold, right now it works very well on my large profile test. > but if that's case we can't even confidently issue error. To some degree, this heuristic is reliable(as long as there are enough big functions). From my statistics, using 0.8 as threshold showed it works really good to detect the order, the two num are distinguishable. For both side, the `PossibleOrderNum` is much larger than the other num, the ratio(PossibleNewOrderNum/PossibleOldOrderNum) is almost ~0.9 for new order, a large room to swing, and vice versa. (like PossibleNewOrderNum is 9X more than PossibleOldOrderNum for a new order. Or PossibleOldOrderNum is 9X more than PossibleNewOrderNum for an old order). But TBH, I'm sure how it works for a small profile, what i tested is all larger than 100M profile, I know that for a small function, it's quite random, it just has little calls, I have to filter out those small function as showed in the code. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[flang] [llvm] [compiler-rt] [libcxx] [libcxxabi] [libc] [clang-tools-extra] [lld] [clang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching( } } -void SampleProfileMatcher::runOnFunction(const Function &F) { - // We need to use flattened function samples for matching. - // Unlike IR, which includes all callsites from the source code, the callsites - // in profile only show up when they are hit by samples, i,e. the profile - // callsites in one context may differ from those in another context. To get - // the maximum number of callsites, we merge the function profiles from all - // contexts, aka, the flattened profile to find profile anchors. - const auto *FSFlattened = getFlattenedSamplesFor(F); - if (!FSFlattened) -return; +void SampleProfileMatcher::runStaleProfileMatching() { + for (const auto &F : M) { +if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) + continue; +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; +auto IR = FuncIRAnchors.find(&F); +auto P = FuncProfileAnchors.find(&F); +if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end()) + continue; - // Anchors for IR. It's a map from IR location to callee name, callee name is - // empty for non-call instruction and use a dummy name(UnknownIndirectCallee) - // for unknown indrect callee name. - std::map IRAnchors; - findIRAnchors(F, IRAnchors); - // Anchors for profile. It's a map from callsite location to a set of callee - // name. - std::map> ProfileAnchors; - findProfileAnchors(*FSFlattened, ProfileAnchors); - - // Detect profile mismatch for profile staleness metrics report. - // Skip reporting the metrics for imported functions. - if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) && - (ReportProfileStaleness || PersistProfileStaleness)) { -// Use top-level nested FS for counting profile mismatch metrics since -// currently once a callsite is mismatched, all its children profiles are -// dropped. -if (const auto *FS = Reader.getSamplesFor(F)) - countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors); +// Run profile matching for checksum mismatched profile, currently only +// support for pseudo-probe. +if (FunctionSamples::ProfileIsProbeBased && +!ProbeManager->profileIsValid(F, *FSFlattened)) { + runStaleProfileMatching(F, IR->second, P->second, + getIRToProfileLocationMap(F)); +} } - // Run profile matching for checksum mismatched profile, currently only - // support for pseudo-probe. - if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased && - !ProbeManager->profileIsValid(F, *FSFlattened)) { -// The matching result will be saved to IRToProfileLocationMap, create a new -// map for each function. -runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); - } + distributeIRToProfileLocationMap(); } -void SampleProfileMatcher::runOnModule() { +void SampleProfileMatcher::findFuncAnchors() { ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, FunctionSamples::ProfileIsCS); - for (auto &F : M) { + for (const auto &F : M) { if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) continue; -runOnFunction(F); +// We need to use flattened function samples for matching. +// Unlike IR, which includes all callsites from the source code, the +// callsites in profile only show up when they are hit by samples, i,e. the +// profile callsites in one context may differ from those in another +// context. To get the maximum number of callsites, we merge the function +// profiles from all contexts, aka, the flattened profile to find profile +// anchors. +const auto *FSFlattened = getFlattenedSamplesFor(F); +if (!FSFlattened) + continue; + +// Anchors for IR. It's a map from IR location to callee name, callee name +// is empty for non-call instruction and use a dummy +// name(UnknownIndirectCallee) for unknown indrect callee name. +auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap()); +findIRAnchors(F, IR.first->second); + +// Anchors for profile. It's a map from callsite location to a set of callee +// name. +auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap()); +findProfileAnchors(*FSFlattened, P.first->second); + } +} + +void SampleProfileMatcher::countMismatchedCallsiteSamples( +const FunctionSamples &FS, +StringMap> &FuncToMismatchCallsites, +uint64_t &FuncMismatchedCallsiteSamples) const { + auto It = FuncToMismatchCallsites.find(FS.getFuncName()); + // Skip it if no mismatched callsite or this is an external function. + if (It == FuncToMismatchCallsites.end() || It->second.empty()) +return; + const auto &MismatchCallsites = It->second; + for (const auto &I : FS.getBodySamples()) { +if
[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)
@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId FuncName) { return ProfileMap.Create(Context); } +// Use a heuristic to determine probe order by checking callsite insertion +// position relative to the BB probes. Sort all the BB probes is in a list, for +// each calliste, compute "ratio = insert position / length of the list". For +// the old order, the probe ids for BB should be all before(smaller than) the +// probe ids for callsite, this ratio should be equal to or close to 1. +bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) { + // Use flattned profile to maximize the callsites in the profile. + SampleProfileMap flattenProfile; + ProfileConverter::flattenProfile(Profiles, flattenProfile); + + uint32_t PossibleOldOrderNum = 0; + uint32_t PossibleNewOrderNum = 0; + + for (const auto &I : flattenProfile) { +const FunctionSamples &FS = I.second; +// Skip small functions whose profile order are likely random. +if (FS.getBodySamples().size() < 10) + continue; + +std::set PossibleBBProbeIDs; +std::set CallsiteIDs; +for (const auto &I : FS.getBodySamples()) { + if (I.second.getCallTargets().empty()) +PossibleBBProbeIDs.insert(I.first.LineOffset); + else +CallsiteIDs.insert(I.first.LineOffset); +} + +if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty()) + continue; + +uint32_t DistanceToBeginSum = 0; +for (const auto &ID : CallsiteIDs) + DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(), + PossibleBBProbeIDs.upper_bound(ID)); +uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size(); + +// Note that PossibleBBProbeIDs could contains some callsite probe id, the +// ratio is not exactly 1 for the old order, hence use a smaller threshold +// to determine. +if ((float)DistanceToBeginSum / LengthSum > 0.8) + PossibleOldOrderNum++; +else + PossibleNewOrderNum++; + } + return PossibleNewOrderNum >= PossibleOldOrderNum; wlei-llvm wrote: > If we only traverse the probe sections, would that be enough to verify? yes, that should be enough, we can check some large functions if there are calls before BBs to decide old order. > Another idea is like you said, just move the checks to compile time. But > maybe we don't need to detect order specifically, but just detect huge > staleness and issue error. Say if a profile is >=90% (or maybe 100%?) stale > at compile time for a large file or a file with hot functions, issue error. > That would be not specific to this order issue, but also generally useful? Yes, that should also work, it should be useful for catching other mismatch cases, like compiler upgrade. https://github.com/llvm/llvm-project/pull/75092 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libc] [lld] [libcxx] [libcxxabi] [compiler-rt] [llvm] [clang] [flang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/79090 >From 54784e26f33178efd21b0289a1f673d66ea26cc3 Mon Sep 17 00:00:00 2001 From: wlei Date: Mon, 22 Jan 2024 19:16:26 -0800 Subject: [PATCH 1/3] [CSSPGO] Support post-match profile staleness metrics --- llvm/lib/Transforms/IPO/SampleProfile.cpp | 440 +++--- .../Inputs/profile-mismatch.prof | 7 +- .../SampleProfile/profile-mismatch.ll | 12 +- .../pseudo-probe-profile-mismatch-thinlto.ll | 6 +- .../pseudo-probe-profile-mismatch.ll | 76 +-- 5 files changed, 324 insertions(+), 217 deletions(-) diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp index 2fd8668d15e200f..a7170faa65dc07c 100644 --- a/llvm/lib/Transforms/IPO/SampleProfile.cpp +++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp @@ -433,12 +433,19 @@ using CandidateQueue = PriorityQueue, CandidateComparer>; +using IRAnchorMap = std::map; +using ProfileAnchorMap = std::map>; + // Sample profile matching - fuzzy match. class SampleProfileMatcher { Module &M; SampleProfileReader &Reader; const PseudoProbeManager *ProbeManager; SampleProfileMap FlattenedProfiles; + + std::unordered_map FuncIRAnchors; + std::unordered_map FuncProfileAnchors; + // For each function, the matcher generates a map, of which each entry is a // mapping from the source location of current build to the source location in // the profile. @@ -448,6 +455,8 @@ class SampleProfileMatcher { uint64_t TotalProfiledCallsites = 0; uint64_t NumMismatchedCallsites = 0; uint64_t MismatchedCallsiteSamples = 0; + uint64_t PostMatchNumMismatchedCallsites = 0; + uint64_t PostMatchMismatchedCallsiteSamples = 0; uint64_t TotalCallsiteSamples = 0; uint64_t TotalProfiledFunc = 0; uint64_t NumMismatchedFuncHash = 0; @@ -474,24 +483,22 @@ class SampleProfileMatcher { return nullptr; } void runOnFunction(const Function &F); - void findIRAnchors(const Function &F, - std::map &IRAnchors); - void findProfileAnchors( + void findFuncAnchors(); + void UpdateIRAnchors(); + void findIRAnchors(const Function &F, IRAnchorMap &IRAnchors); + void findProfileAnchors(const FunctionSamples &FS, + ProfileAnchorMap &ProfileAnchors); + void countMismatchedHashSamples(const FunctionSamples &FS); + void countProfileMismatches(bool IsPreMatch); + void countMismatchedHashes(const Function &F, const FunctionSamples &FS); + void countMismatchedCallsites( + const Function &F, + StringMap> &FuncToMismatchCallsites, + uint64_t &FuncProfiledCallsites, uint64_t &FuncMismatchedCallsites) const; + void countMismatchedCallsiteSamples( const FunctionSamples &FS, - std::map> - &ProfileAnchors); - void countMismatchedSamples(const FunctionSamples &FS); - void countProfileMismatches( - const Function &F, const FunctionSamples &FS, - const std::map &IRAnchors, - const std::map> - &ProfileAnchors); - void countProfileCallsiteMismatches( - const FunctionSamples &FS, - const std::map &IRAnchors, - const std::map> - &ProfileAnchors, - uint64_t &FuncMismatchedCallsites, uint64_t &FuncProfiledCallsites); + StringMap> &FuncToMismatchCallsites, + uint64_t &FuncMismatchedCallsiteSamples) const; LocToLocMap &getIRToProfileLocationMap(const Function &F) { auto Ret = FuncMappings.try_emplace( FunctionSamples::getCanonicalFnName(F.getName()), LocToLocMap()); @@ -499,11 +506,10 @@ class SampleProfileMatcher { } void distributeIRToProfileLocationMap(); void distributeIRToProfileLocationMap(FunctionSamples &FS); - void runStaleProfileMatching( - const Function &F, const std::map &IRAnchors, - const std::map> - &ProfileAnchors, - LocToLocMap &IRToProfileLocationMap); + void runStaleProfileMatching(); + void runStaleProfileMatching(const Function &F, const IRAnchorMap &IRAnchors, + const ProfileAnchorMap &ProfileAnchors, + LocToLocMap &IRToProfileLocationMap); }; /// Sample profile pass. @@ -1129,7 +1135,7 @@ void SampleProfileLoader::findExternalInlineCandidate( CalleeSample->getContext().hasAttribute(ContextShouldBeInlined); if (!PreInline && CalleeSample->getHeadSamplesEstimate() < Threshold) continue; - + Function *Func = SymbolMap.lookup(CalleeSample->getFunction()); // Add to the import list only when it's defined out of module. if (!Func || Func->isDeclaration()) @@ -2123,8 +2129,8 @@ bool SampleProfileLoader::doInitialization(Module &M, return true; } -void SampleProfileMatcher::findIRAnchors( -const Function &F, std::map &IRAnchors) { +void SampleProfileMatcher::findIRAnchors(const Function &F, + IRAnchorMap &IRAnchors) {
[libcxx] [llvm] [lld] [libc] [flang] [compiler-rt] [libcxxabi] [clang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2205,93 +2230,141 @@ void SampleProfileMatcher::countMismatchedSamples(const FunctionSamples &FS) { countMismatchedSamples(CS.second); } -void SampleProfileMatcher::countProfileMismatches( -const Function &F, const FunctionSamples &FS, -const std::map &IRAnchors, +void ProfileMatchStats::countMismatchedCallsites( +const Function &F, const std::map &IRAnchors, +const std::map> +&ProfileAnchors, +const LocToLocMap &IRToProfileLocationMap) { + auto &MismatchedCallsites = + FuncMismatchedCallsites[FunctionSamples::getCanonicalFnName(F.getName())]; + + auto MapIRLocToProfileLoc = [&](const LineLocation &IRLoc) { wlei-llvm wrote: Yep, the benefit here is pre-match and post-match using the same function`countMismatchedCallsites`, no special process to parsing the matching results. For pre-match, we just pass a empty `IRToProfileLocationMap` and no changes to `RunStaleProfileMatching`. https://github.com/llvm/llvm-project/pull/79090 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libc] [libcxx] [libcxxabi] [clang-tools-extra] [llvm] [compiler-rt] [lld] [clang] [flang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -433,6 +433,44 @@ using CandidateQueue = PriorityQueue, CandidateComparer>; +// Profile matching statstics. +class ProfileMatchStats { wlei-llvm wrote: I was thinking that too until I worked with this `StringMap> FuncMismatchedCallsites;` which introduce a lot of actions making me think to put them into a new class. But as you suggested below, we can move `FuncMismatchedCallsites` out of the stats, or even we don't need a new class. https://github.com/llvm/llvm-project/pull/79090 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[flang] [lld] [llvm] [clang] [libcxxabi] [libc] [libcxx] [clang-tools-extra] [compiler-rt] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -433,6 +433,44 @@ using CandidateQueue = PriorityQueue, CandidateComparer>; +// Profile matching statstics. +class ProfileMatchStats { + const Module &M; + SampleProfileReader &Reader; + const PseudoProbeManager *ProbeManager; + +public: + ProfileMatchStats(const Module &M, SampleProfileReader &Reader, +const PseudoProbeManager *ProbeManager) + : M(M), Reader(Reader), ProbeManager(ProbeManager) {} + + uint64_t NumMismatchedCallsites = 0; + uint64_t TotalProfiledCallsites = 0; + uint64_t MismatchedCallsiteSamples = 0; + uint64_t NumMismatchedFuncHash = 0; + uint64_t TotalProfiledFunc = 0; + uint64_t MismatchedFuncHashSamples = 0; wlei-llvm wrote: Sounds good to remove the "Hash" https://github.com/llvm/llvm-project/pull/79090 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libc] [lld] [flang] [libcxxabi] [compiler-rt] [libcxx] [clang-tools-extra] [llvm] [clang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2460,63 +2528,108 @@ void SampleProfileMatcher::runOnFunction(const Function &F) { !ProbeManager->profileIsValid(F, *FSFlattened)) { // The matching result will be saved to IRToProfileLocationMap, create a new // map for each function. +auto &IRToProfileLocationMap = getIRToProfileLocationMap(F); runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); +IRToProfileLocationMap); +PostMatchStats.countMismatchedCallsites(F, IRAnchors, ProfileAnchors, +IRToProfileLocationMap); } } -void SampleProfileMatcher::runOnModule() { - ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, - FunctionSamples::ProfileIsCS); - for (auto &F : M) { -if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) - continue; -runOnFunction(F); - } - if (SalvageStaleProfile) -distributeIRToProfileLocationMap(); - +void SampleProfileMatcher::reportOrPersistProfileStats() { if (ReportProfileStaleness) { if (FunctionSamples::ProfileIsProbeBased) { - errs() << "(" << NumMismatchedFuncHash << "/" << TotalProfiledFunc << ")" + errs() << "(" << PreMatchStats.NumMismatchedFuncHash << "/" + << PreMatchStats.TotalProfiledFunc << ")" << " of functions' profile are invalid and " - << " (" << MismatchedFuncHashSamples << "/" << TotalFuncHashSamples - << ")" + << " (" << PreMatchStats.MismatchedFuncHashSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" << " of samples are discarded due to function hash mismatch.\n"; } -errs() << "(" << NumMismatchedCallsites << "/" << TotalProfiledCallsites - << ")" +errs() << "(" << PreMatchStats.NumMismatchedCallsites << "/" + << PreMatchStats.TotalProfiledCallsites << ")" << " of callsites' profile are invalid and " - << "(" << MismatchedCallsiteSamples << "/" << TotalCallsiteSamples - << ")" + << "(" << PreMatchStats.MismatchedCallsiteSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" << " of samples are discarded due to callsite location mismatch.\n"; +if (SalvageStaleProfile) { + uint64_t NumRecoveredCallsites = PostMatchStats.TotalProfiledCallsites - + PostMatchStats.NumMismatchedCallsites; + uint64_t NumMismatchedCallsites = + PreMatchStats.NumMismatchedCallsites - NumRecoveredCallsites; + errs() << "Out of " << PostMatchStats.TotalProfiledCallsites + << " callsites used for profile matching, " + << NumRecoveredCallsites + << " callsites have been recovered. After the matching, (" + << NumMismatchedCallsites << "/" + << PreMatchStats.TotalProfiledCallsites + << ") of callsites are still invalid (" + << PostMatchStats.MismatchedCallsiteSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" + << " of samples are still discarded.\n"; +} } if (PersistProfileStaleness) { LLVMContext &Ctx = M.getContext(); MDBuilder MDB(Ctx); SmallVector> ProfStatsVec; +ProfStatsVec.emplace_back("NumMismatchedCallsites", + PreMatchStats.NumMismatchedCallsites); +ProfStatsVec.emplace_back("TotalProfiledCallsites", + PreMatchStats.TotalProfiledCallsites); +ProfStatsVec.emplace_back("MismatchedCallsiteSamples", + PreMatchStats.MismatchedCallsiteSamples); +ProfStatsVec.emplace_back("TotalProfiledFunc", + PreMatchStats.TotalProfiledFunc); +ProfStatsVec.emplace_back("TotalFunctionSamples", + PreMatchStats.TotalFunctionSamples); if (FunctionSamples::ProfileIsProbeBased) { - ProfStatsVec.emplace_back("NumMismatchedFuncHash", NumMismatchedFuncHash); - ProfStatsVec.emplace_back("TotalProfiledFunc", TotalProfiledFunc); + ProfStatsVec.emplace_back("NumMismatchedFuncHash", +PreMatchStats.NumMismatchedFuncHash); ProfStatsVec.emplace_back("MismatchedFuncHashSamples", -MismatchedFuncHashSamples); - ProfStatsVec.emplace_back("TotalFuncHashSamples", TotalFuncHashSamples); +PreMatchStats.MismatchedFuncHashSamples); +} +if (SalvageStaleProfile) { + ProfStatsVec.emplace_back("PostMatchNumMismatchedCallsites", +PostMatchStats.NumMismatchedCallsites); + ProfStatsVec.emplace_back("NumCallsitesForMatching", +PostMatchStats.TotalProfiledCallsites); + ProfStatsVec.emplace_back("PostMatch
[llvm] [clang] [libcxx] [libc] [compiler-rt] [lld] [libcxxabi] [flang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)
@@ -2460,63 +2528,108 @@ void SampleProfileMatcher::runOnFunction(const Function &F) { !ProbeManager->profileIsValid(F, *FSFlattened)) { // The matching result will be saved to IRToProfileLocationMap, create a new // map for each function. +auto &IRToProfileLocationMap = getIRToProfileLocationMap(F); runStaleProfileMatching(F, IRAnchors, ProfileAnchors, -getIRToProfileLocationMap(F)); +IRToProfileLocationMap); +PostMatchStats.countMismatchedCallsites(F, IRAnchors, ProfileAnchors, +IRToProfileLocationMap); } } -void SampleProfileMatcher::runOnModule() { - ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles, - FunctionSamples::ProfileIsCS); - for (auto &F : M) { -if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile")) - continue; -runOnFunction(F); - } - if (SalvageStaleProfile) -distributeIRToProfileLocationMap(); - +void SampleProfileMatcher::reportOrPersistProfileStats() { if (ReportProfileStaleness) { if (FunctionSamples::ProfileIsProbeBased) { - errs() << "(" << NumMismatchedFuncHash << "/" << TotalProfiledFunc << ")" + errs() << "(" << PreMatchStats.NumMismatchedFuncHash << "/" + << PreMatchStats.TotalProfiledFunc << ")" << " of functions' profile are invalid and " - << " (" << MismatchedFuncHashSamples << "/" << TotalFuncHashSamples - << ")" + << " (" << PreMatchStats.MismatchedFuncHashSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" << " of samples are discarded due to function hash mismatch.\n"; } -errs() << "(" << NumMismatchedCallsites << "/" << TotalProfiledCallsites - << ")" +errs() << "(" << PreMatchStats.NumMismatchedCallsites << "/" + << PreMatchStats.TotalProfiledCallsites << ")" << " of callsites' profile are invalid and " - << "(" << MismatchedCallsiteSamples << "/" << TotalCallsiteSamples - << ")" + << "(" << PreMatchStats.MismatchedCallsiteSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" << " of samples are discarded due to callsite location mismatch.\n"; +if (SalvageStaleProfile) { + uint64_t NumRecoveredCallsites = PostMatchStats.TotalProfiledCallsites - + PostMatchStats.NumMismatchedCallsites; + uint64_t NumMismatchedCallsites = + PreMatchStats.NumMismatchedCallsites - NumRecoveredCallsites; + errs() << "Out of " << PostMatchStats.TotalProfiledCallsites + << " callsites used for profile matching, " + << NumRecoveredCallsites + << " callsites have been recovered. After the matching, (" + << NumMismatchedCallsites << "/" + << PreMatchStats.TotalProfiledCallsites + << ") of callsites are still invalid (" + << PostMatchStats.MismatchedCallsiteSamples << "/" + << PreMatchStats.TotalFunctionSamples << ")" + << " of samples are still discarded.\n"; +} } if (PersistProfileStaleness) { LLVMContext &Ctx = M.getContext(); MDBuilder MDB(Ctx); SmallVector> ProfStatsVec; +ProfStatsVec.emplace_back("NumMismatchedCallsites", + PreMatchStats.NumMismatchedCallsites); +ProfStatsVec.emplace_back("TotalProfiledCallsites", + PreMatchStats.TotalProfiledCallsites); +ProfStatsVec.emplace_back("MismatchedCallsiteSamples", + PreMatchStats.MismatchedCallsiteSamples); +ProfStatsVec.emplace_back("TotalProfiledFunc", + PreMatchStats.TotalProfiledFunc); +ProfStatsVec.emplace_back("TotalFunctionSamples", + PreMatchStats.TotalFunctionSamples); if (FunctionSamples::ProfileIsProbeBased) { - ProfStatsVec.emplace_back("NumMismatchedFuncHash", NumMismatchedFuncHash); - ProfStatsVec.emplace_back("TotalProfiledFunc", TotalProfiledFunc); + ProfStatsVec.emplace_back("NumMismatchedFuncHash", +PreMatchStats.NumMismatchedFuncHash); ProfStatsVec.emplace_back("MismatchedFuncHashSamples", -MismatchedFuncHashSamples); - ProfStatsVec.emplace_back("TotalFuncHashSamples", TotalFuncHashSamples); +PreMatchStats.MismatchedFuncHashSamples); +} +if (SalvageStaleProfile) { + ProfStatsVec.emplace_back("PostMatchNumMismatchedCallsites", +PostMatchStats.NumMismatchedCallsites); + ProfStatsVec.emplace_back("NumCallsitesForMatching", +PostMatchStats.TotalProfiledCallsites); + ProfStatsVec.emplace_back("PostMatch
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/5] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1119,6 +1125,18 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.empty() && + "File path is requeired for instrumentation generation"); + InstrumentColdFunctionCoverage = true; + addPreInlinerPasses(MPM, Level, Phase); + addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true, +/* IsCS */ false, /* AtomicCounterUpdate */ false, +InstrumentSampleColdFuncPath, "", +IntrusiveRefCntPtr()); +} wlei-llvm wrote: Good point, that's more flexible, thanks! https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/6] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1119,6 +1125,18 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.empty() && + "File path is requeired for instrumentation generation"); + InstrumentColdFunctionCoverage = true; + addPreInlinerPasses(MPM, Level, Phase); + addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true, +/* IsCS */ false, /* AtomicCounterUpdate */ false, +InstrumentSampleColdFuncPath, "", +IntrusiveRefCntPtr()); +} wlei-llvm wrote: > I don't see how there would be any functional difference between using > `IsSampleColdFuncCovGen` and `IsXXXColdFuncCovGen`, but I could be missing > something. If we can have a single flag for all cases, I think that would be > cleaner and I can also try to use it for my case. I see, sounds good. My previous thought is: there could be functional difference for using the `-instrument-cold-function-coverage` flag with sampling PGO flag(`-fprofile-sample-use`) vs without sampling PGO flags. With the sampling PGO flag, if a function symbol shows in the binary but not in the profile, we can treat it as cold function(set 0 entry count), we then would instrument those function. But without sampling PGO flag(also no other PGO options), the entry count is unknown(-1), then we don't instrument them. So user needs to be aware of the combination use of those flags, I was then trying to prevent the misuse, to disallow the use of `--instrument-cold-function-coverage` without (sampling) PGO use options(use the assertion in the version). But on a second thought, if we want to extend for other PGO options, say if the profile annotation and instrumentation are not in the same pipeline, it's probably hard to check this locally, we can leave it to the build system. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/4] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/2] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -319,6 +319,29 @@ static cl::opt PGOFunctionCriticalEdgeThreshold( cl::desc("Do not instrument functions with the number of critical edges " " greater than this threshold.")); +static cl::opt ColdFuncCoverageMaxEntryCount( +"cold-function-coverage-max-entry-count", cl::init(0), cl::Hidden, +cl::desc("When enabling cold function coverage instrumentation, skip " + "instrumenting the function whose entry count is above the given " + "value")); + +static cl::opt InstrumentColdFunctionCoverageMode( +"instrument-cold-function-coverage-mode", wlei-llvm wrote: Agreed, I can't see other modes so far, changed to `pgo-treat-unknown-as-cold` https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; wlei-llvm wrote: thanks for the suggestion! https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/7] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-cold-function-coverage-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); wlei-llvm wrote: I really like this idea, however, this seems indeed not easy(needs a lot of refactoring). As you said, the `InstrProfileOutput` filed is shared with other PGO flag(in our case, it's the `fprofile-sample-use=`) , see [`PGOOptions.ProfileFIle`](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L43). One thing we can do is to split this field into two: `ProfileUseFile` and `ProfileGenFole`, let all `profile--gererate` go to `ProfileGenFile` and let `sample-use/instr-use go` to `ProfileUseFile`. I think that's doable and not a big change(maybe a separate diff). Then in sample-use case, we can use the `ProfileGenFIle` path for cold coverage work. WDYT? Though it would still not be perfect, the logic here https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/BackendUtil.cpp#L811-L841 is complicated/ would still need more refactoring. I was stuck here when thinking if `-fprofile-sample-use` and `--fprofile-instrument-path=` are both used, which branch(`CodeGenOpts.hasProfileIRInstr()` or `!CodeGenOpts.SampleProfileFile.empty()`) should it belongs to. For this version, I just removed the duplicated flag(`--pgo-instrument-cold-function-only`) and added a FIXME comment to explain this. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, wlei-llvm wrote: > I meant, why not just use `clang ... -mllvm > -instrument-cold-function-coverage`? Is this a clang - only feature (i.e. > rust can't use it?) Is it just for symmetry with the current PGO flags? > > (This is not a pushback, mainly curious. Also the patch would be quite > smaller if you didn't need to pass through the flags from clang to llvm) I see, thanks for the suggestion! We also need to link runtime lib(`compiler_rt.profile.a`) but yeah, I agree it's possible to pass directly to llvm and linker without through clang. Then the full command line would be like ``` clang ... -mllvm -instrument-cold-function-coverage -mllvm -instrument-sample-cold-function-path= -mllvm --pgo-function-entry-coverage ld.lld ... --u__llvm_runtime_variable .../lib/x86_64-unknown-linux-gnu/libclang_rt.profile.a ``` So yes, adding the clang driver flag is kind of to mirror current[ IRPGO flags](https://fburl.com/na3cp3gn), for easy maintenance purpose(given that `-fprofile-generate` doesn't work with `-fprofile-sample-use`) and also to centralize the configuration for the convenience. IMO, the `compiler_rt` is probably the main blocker here, I didn't find an easy way to bundle it with a llvm flag. Appreciate any further suggestions! https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
wlei-llvm wrote: Thanks for the context! > Why not use the existing `-pgo-function-entry-coverage` > (https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113/14?u=ellishg) > LLVM flag? It takes advantage of the `llvm.instrprof.cover` intrinsic which > has less size and runtime overhead than `llvm.instrprof.increment`. We do use the `-pgo-function-entry-coverage` in this PR, see [here](https://github.com/llvm/llvm-project/pull/109837/files#diff-bac41c71569f27df21a843bcd74d2e604ed508afbdf14161dfb545c5d228R666-R667). but furthermore, we skip instrumenting the functions that are covered by sampling PGO profile. > It also supports IRPGO and CSIRPGO while it seems that this PR requires a > sample PGO input. Yeah, as I commented above, unfortunately currently the IRPGO main flag is incompatible with the Sampling PGO flag(also a bit diverged for the pipeline passes), that's why we have to add extra instr passes under sampling PGO pipeline and added a new driver flag. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, wlei-llvm wrote: Sorry if my PR description is not clear. Note that there is no use for `-fprofile-generate` and `-fprofile-instr-generate` here, so a driver flag here is to configure the instr file path and make linker to link the compiler.profile object files (just similar to `-fprofile-[instr]-generate=`). The reason for not using `-fprofile-[instr]-generate` is because those flags conflict with `-fprofile-sample-use`, see [PGOOptions](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L27-L43), `ProfileFile` is a shared file path which means the two flags are actually mutually exclusive. Another option is to make `-fprofile-generate` compatible with `-fprofile-samle-use`(like refactoring PGOOptions or adding another flag to configure the file path things), this seems to me they are more complicated than the current one. But I’m open to any suggestions on this. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
wlei-llvm wrote: > For your use case, can you use > [ProfileSymbolList](https://github.com/llvm/llvm-project/blob/32ffc9fdc2cd422c88c926b862adb3de726e3888/llvm/include/llvm/ProfileData/SampleProf.h#L1509-L1512) > to identify very cold functions (executed but unsampled) or are you indeed > looking for functions that are never executed? We are indeed focusing on functions that are never executed. The ProfileSymbolList is already used to identify functions with a “0” entry count, which is the target of the instrumentation in this work. We then aim to further distinguish the dead functions from them. > > Will this be used to guide developers with diagnostics or more aggressive > compiler driven optimizations? We haven’t considered to use them for any compiler optimization. The primary goal is to improve the developer experience, just to take the general benefits from removing dead code: improving readability, good codebase maintainability, reducing build time, etc. Another potential use might be to verify the sampling PGO profile coverage/quality, say to check if we missed any functions that are actually hot. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)
https://github.com/wlei-llvm closed https://github.com/llvm/llvm-project/pull/110330 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] specify clang --target to fix breakage on AIX (PR #114127)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/114127 `-fprofile-sample-use` is not supported on AIX, which caused a CI failure. >From 2c2ddcc9bda277c6da7b51147785c0f1e88b848d Mon Sep 17 00:00:00 2001 From: wlei Date: Tue, 29 Oct 2024 13:28:33 -0700 Subject: [PATCH] specify clang --target to fix breakage on AIX --- clang/test/CodeGen/pgo-cold-function-coverage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c index fd1e1e7e14cda5..3003cdc3e15e02 100644 --- a/clang/test/CodeGen/pgo-cold-function-coverage.c +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -1,7 +1,7 @@ // Test -fprofile-generate-cold-function-coverage // RUN: rm -rf %t && split-file %s %t -// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof -S -emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s +// RUN: %clang --target=x86_64 -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof -S -emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s // CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00" ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)
wlei-llvm wrote: > Hi, this PR is causing a regression on the AIX bot here > https://lab.llvm.org/buildbot/#/builders/64/builds/1321/steps/6/logs/FAIL__Clang__pgo-cold-function-coverage_c. > Would you be able to take a look? I think it can be resolved by using > clang_cc1 in the testcase of marking it unsupported for AIX (see > [951919e](https://github.com/llvm/llvm-project/commit/951919e5112cabbd63c7a3bf424736efca81d964)) Sorry for the breakage, looks like we can bypass it by adding `--target=` like https://github.com/llvm/llvm-project/commit/f92551ba6b5aaef28d886ea6b3c4f9774fa8fadd https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)
@@ -1182,8 +1187,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, const bool IsCtxProfUse = !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink; + // Enable cold function coverage instrumentation if + // InstrumentColdFuncOnlyPath is provided. + const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly = wlei-llvm wrote: Thanks for pointing out. It's intentional to set the global variable, I wasn't aware it will cause data race. I will try to change it to local one. https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] specify clang --target to fix breakage on AIX (PR #114127)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/114127 >From 2c2ddcc9bda277c6da7b51147785c0f1e88b848d Mon Sep 17 00:00:00 2001 From: wlei Date: Tue, 29 Oct 2024 13:28:33 -0700 Subject: [PATCH] specify clang --target to fix breakage on AIX --- clang/test/CodeGen/pgo-cold-function-coverage.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c index fd1e1e7e14cda5..3003cdc3e15e02 100644 --- a/clang/test/CodeGen/pgo-cold-function-coverage.c +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -1,7 +1,7 @@ // Test -fprofile-generate-cold-function-coverage // RUN: rm -rf %t && split-file %s %t -// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof -S -emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s +// RUN: %clang --target=x86_64 -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof -S -emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s // CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00" ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] specify clang --target to fix breakage on AIX (PR #114127)
https://github.com/wlei-llvm closed https://github.com/llvm/llvm-project/pull/114127 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/114364 >From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001 From: wlei Date: Thu, 31 Oct 2024 09:58:27 -0700 Subject: [PATCH 1/3] Reapply "[InstrPGO] Support cold function coverage instrumentation (#109837)" This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71. --- clang/include/clang/Driver/Options.td | 6 clang/lib/Driver/ToolChain.cpp| 4 ++- clang/lib/Driver/ToolChains/Clang.cpp | 20 +++ .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++ ...fprofile-generate-cold-function-coverage.c | 8 + llvm/lib/Passes/PassBuilderPipelines.cpp | 17 - .../Instrumentation/PGOInstrumentation.cpp| 19 ++ .../PGOProfile/instr-gen-cold-function.ll | 35 +++ 8 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index c8bc2fe377b8ec..2814d2b1bf3733 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to collect coverage info for cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to collect coverage info for cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 6d3ede40691093..bdf3da0c96adca 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 04b3832327a99c..4c6f508f1f24a6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +// FIXME: Idealy the file path should be passed through +// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is +// shared with other profile use path(see PGOOptions), we need to refactor +// PGOOptions to make it work. +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-cold-function-only-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..fd1e1e7e14cda5 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,19 @@ +// Test -fprofile-generate-cold-function-coverage + +// RUN: rm -rf %t && sp
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/114364 In https://github.com/llvm/llvm-project/pull/109837, it sets a global variable(`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp which introduced a data race detected by TSan. To fix this, I decouple the flag setting, the flags are now set separately(`instrument-cold-function-only-path` is required to be used with `--pgo-instrument-cold-function-only`). >From f912c30955541bd18f5fa56eff1c222672c0fa7f Mon Sep 17 00:00:00 2001 From: wlei Date: Wed, 30 Oct 2024 22:21:24 -0700 Subject: [PATCH] fix datarace --- clang/lib/Driver/ToolChains/Clang.cpp | 2 ++ .../fprofile-generate-cold-function-coverage.c| 1 + llvm/lib/Passes/PassBuilderPipelines.cpp | 15 ++- 3 files changed, 13 insertions(+), 5 deletions(-) diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 4c6f508f1f24a6..04ae99d417d4f7 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,8 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, CmdArgs.push_back(Args.MakeArgString( Twine("--instrument-cold-function-only-path=") + Path)); CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-instrument-cold-function-only"); +CmdArgs.push_back("-mllvm"); CmdArgs.push_back("--pgo-function-entry-coverage"); } diff --git a/clang/test/Driver/fprofile-generate-cold-function-coverage.c b/clang/test/Driver/fprofile-generate-cold-function-coverage.c index 9b2f46423f34b1..a8ca69cf224cd0 100644 --- a/clang/test/Driver/fprofile-generate-cold-function-coverage.c +++ b/clang/test/Driver/fprofile-generate-cold-function-coverage.c @@ -1,6 +1,7 @@ // RUN: %clang -### -c -fprofile-generate-cold-function-coverage %s 2>&1 | FileCheck %s // CHECK: "--instrument-cold-function-only-path=default_%m.profraw" // CHECK: "--pgo-function-entry-coverage" +// CHECK: "--pgo-instrument-cold-function-only" // CHECK-NOT: "-fprofile-instrument" // CHECK-NOT: "-fprofile-instrument-path= diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index c391853c8d0c2b..d5ef7c654c35cb 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -298,7 +298,9 @@ static cl::opt UseLoopVersioningLICM( static cl::opt InstrumentColdFuncOnlyPath( "instrument-cold-function-only-path", cl::init(""), -cl::desc("File path for cold function only instrumentation"), cl::Hidden); +cl::desc("File path for cold function only instrumentation(requires use " + "with --pgo-instrument-cold-function-only)"), +cl::Hidden); extern cl::opt UseCtxProfile; extern cl::opt PGOInstrumentColdFunctionOnly; @@ -1188,10 +1190,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, const bool IsCtxProfUse = !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink; - // Enable cold function coverage instrumentation if - // InstrumentColdFuncOnlyPath is provided. - const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly = - IsPGOPreLink && !InstrumentColdFuncOnlyPath.empty(); + assert( + (InstrumentColdFuncOnlyPath.empty() || PGOInstrumentColdFunctionOnly) && + "--instrument-cold-function-only-path is provided but " + "--pgo-instrument-cold-function-only is not enabled"); + const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly && + IsPGOPreLink && + !InstrumentColdFuncOnlyPath.empty(); if (IsPGOInstrGen || IsPGOInstrUse || IsMemprofUse || IsCtxProfGen || IsCtxProfUse || IsColdFuncOnlyInstrGen) ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)
https://github.com/wlei-llvm closed https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/8] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/8] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm ready_for_review https://github.com/llvm/llvm-project/pull/114364 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)
@@ -1182,8 +1187,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, const bool IsCtxProfUse = !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink; + // Enable cold function coverage instrumentation if + // InstrumentColdFuncOnlyPath is provided. + const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly = wlei-llvm wrote: Should be fixed by https://github.com/llvm/llvm-project/pull/114364 https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/114364 >From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001 From: wlei Date: Thu, 31 Oct 2024 09:58:27 -0700 Subject: [PATCH 1/3] Reapply "[InstrPGO] Support cold function coverage instrumentation (#109837)" This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71. --- clang/include/clang/Driver/Options.td | 6 clang/lib/Driver/ToolChain.cpp| 4 ++- clang/lib/Driver/ToolChains/Clang.cpp | 20 +++ .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++ ...fprofile-generate-cold-function-coverage.c | 8 + llvm/lib/Passes/PassBuilderPipelines.cpp | 17 - .../Instrumentation/PGOInstrumentation.cpp| 19 ++ .../PGOProfile/instr-gen-cold-function.ll | 35 +++ 8 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index c8bc2fe377b8ec..2814d2b1bf3733 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to collect coverage info for cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to collect coverage info for cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 6d3ede40691093..bdf3da0c96adca 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 04b3832327a99c..4c6f508f1f24a6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +// FIXME: Idealy the file path should be passed through +// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is +// shared with other profile use path(see PGOOptions), we need to refactor +// PGOOptions to make it work. +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-cold-function-only-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..fd1e1e7e14cda5 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,19 @@ +// Test -fprofile-generate-cold-function-coverage + +// RUN: rm -rf %t && sp
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm closed https://github.com/llvm/llvm-project/pull/114364 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/114364 >From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001 From: wlei Date: Thu, 31 Oct 2024 09:58:27 -0700 Subject: [PATCH 1/4] Reapply "[InstrPGO] Support cold function coverage instrumentation (#109837)" This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71. --- clang/include/clang/Driver/Options.td | 6 clang/lib/Driver/ToolChain.cpp| 4 ++- clang/lib/Driver/ToolChains/Clang.cpp | 20 +++ .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++ ...fprofile-generate-cold-function-coverage.c | 8 + llvm/lib/Passes/PassBuilderPipelines.cpp | 17 - .../Instrumentation/PGOInstrumentation.cpp| 19 ++ .../PGOProfile/instr-gen-cold-function.ll | 35 +++ 8 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index c8bc2fe377b8ec..2814d2b1bf3733 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to collect coverage info for cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to collect coverage info for cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 6d3ede40691093..bdf3da0c96adca 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 04b3832327a99c..4c6f508f1f24a6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +// FIXME: Idealy the file path should be passed through +// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is +// shared with other profile use path(see PGOOptions), we need to refactor +// PGOOptions to make it work. +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-cold-function-only-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..fd1e1e7e14cda5 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,19 @@ +// Test -fprofile-generate-cold-function-coverage + +// RUN: rm -rf %t && sp
[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/114364 >From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001 From: wlei Date: Thu, 31 Oct 2024 09:58:27 -0700 Subject: [PATCH 1/4] Reapply "[InstrPGO] Support cold function coverage instrumentation (#109837)" This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71. --- clang/include/clang/Driver/Options.td | 6 clang/lib/Driver/ToolChain.cpp| 4 ++- clang/lib/Driver/ToolChains/Clang.cpp | 20 +++ .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++ ...fprofile-generate-cold-function-coverage.c | 8 + llvm/lib/Passes/PassBuilderPipelines.cpp | 17 - .../Instrumentation/PGOInstrumentation.cpp| 19 ++ .../PGOProfile/instr-gen-cold-function.ll | 35 +++ 8 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index c8bc2fe377b8ec..2814d2b1bf3733 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to collect coverage info for cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to collect coverage info for cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 6d3ede40691093..bdf3da0c96adca 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 04b3832327a99c..4c6f508f1f24a6 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +// FIXME: Idealy the file path should be passed through +// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is +// shared with other profile use path(see PGOOptions), we need to refactor +// PGOOptions to make it work. +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-cold-function-only-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..fd1e1e7e14cda5 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,19 @@ +// Test -fprofile-generate-cold-function-coverage + +// RUN: rm -rf %t && sp
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/109837 None >From b693ff3c33dcf21c1343d9d3432b1d7410a8b579 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 14 +++ llvm/lib/Passes/PassBuilderPipelines.cpp | 16 + .../Instrumentation/PGOInstrumentation.cpp| 14 +++ .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 6 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..0ac289089839d2 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_sample_cold_function : Flag<["-"], "fprofile-sample-cold-function">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_sample_cold_function_EQ : Joined<["-"], "fprofile-sample-cold-function=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..5cd821cfe780c4 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_sample_cold_function) || + Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..85e9e02ca8a4ee 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,20 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *SampleColdArg = + Args.getLastArg(options::OPT_fprofile_sample_cold_function, + options::OPT_fprofile_sample_cold_function_EQ)) { +SmallString<128> Path; +if (SampleColdArg->getOption().matches(options::OPT_fprofile_sample_cold_function_EQ)) + Path.append(SampleColdArg->getValue()); +else + Path.append("default_%m.profraw"); + +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 8f151a99b11709..5d1d8480b1472d 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM( "enable-loop-versioning-licm", cl::init(false), cl::Hidden, cl::desc("Enable the experimental Loop Versioning LICM pass")); +static cl::opt InstrumentSampleColdFuncPath( +"instrument-sample-cold-function-path", cl::init(""), +cl::desc("File path for instrumenting sampling PGO guided cold functions"), +cl::Hidden); + extern cl::opt UseCtxProfile; +extern cl::opt InstrumentColdFunction; namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; @@ -1119,6 +1125,16 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.e
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #110330)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/110330 None >From d660d3b9a043a3530a735c1b95790116f6366062 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/2] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 13 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 17 + .../Instrumentation/PGOInstrumentation.cpp| 14 +++ .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 6 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index aedc4c16d4e9d5..230b9604b8a8aa 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1785,6 +1785,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_sample_cold_function : Flag<["-"], "fprofile-sample-cold-function">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_sample_cold_function_EQ : Joined<["-"], "fprofile-sample-cold-function=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..5cd821cfe780c4 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_sample_cold_function) || + Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index b9987288d82d10..fced94833aaf79 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,19 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *SampleColdArg = + Args.getLastArg(options::OPT_fprofile_sample_cold_function, + options::OPT_fprofile_sample_cold_function_EQ)) { +SmallString<128> Path(SampleColdArg->getOption().matches( + options::OPT_fprofile_sample_cold_function_EQ) + ? SampleColdArg->getValue() + : ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 8f151a99b11709..0514d17db20721 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM( "enable-loop-versioning-licm", cl::init(false), cl::Hidden, cl::desc("Enable the experimental Loop Versioning LICM pass")); +static cl::opt InstrumentSampleColdFuncPath( +"instrument-sample-cold-function-path", cl::init(""), +cl::desc("File path for instrumenting sampling PGO guided cold functions"), +cl::Hidden); + extern cl::opt UseCtxProfile; +extern cl::opt InstrumentColdFunction; namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; @@ -1119,6 +1125,17 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrF
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/110330 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 36a747b5ef2123dd10909b91ad963e30af63b3b8 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 17 + .../Instrumentation/PGOInstrumentation.cpp| 14 +++ .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 6 files changed, 76 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..0ac289089839d2 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_sample_cold_function : Flag<["-"], "fprofile-sample-cold-function">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_sample_cold_function_EQ : Joined<["-"], "fprofile-sample-cold-function=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..5cd821cfe780c4 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_sample_cold_function) || + Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..6e128dfc888ab4 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,18 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *SampleColdArg = + Args.getLastArg(options::OPT_fprofile_sample_cold_function, + options::OPT_fprofile_sample_cold_function_EQ)) { +SmallString<128> Path(SampleColdArg->getOption().matches( + options::OPT_fprofile_sample_cold_function_EQ) + ? SampleColdArg->getValue() + : "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 8f151a99b11709..0514d17db20721 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM( "enable-loop-versioning-licm", cl::init(false), cl::Hidden, cl::desc("Enable the experimental Loop Versioning LICM pass")); +static cl::opt InstrumentSampleColdFuncPath( +"instrument-sample-cold-function-path", cl::init(""), +cl::desc("File path for instrumenting sampling PGO guided cold functions"), +cl::Hidden); + extern cl::opt UseCtxProfile; +extern cl::opt InstrumentColdFunction; namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; @@ -1119,6 +1125,17 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!Inst
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/110330 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 6a9b51130269975b5832a5439897dcb34b9dc942 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 10 llvm/lib/Passes/PassBuilderPipelines.cpp | 17 + .../Instrumentation/PGOInstrumentation.cpp| 14 +++ .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 6 files changed, 74 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..0ac289089839d2 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_sample_cold_function : Flag<["-"], "fprofile-sample-cold-function">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_sample_cold_function_EQ : Joined<["-"], "fprofile-sample-cold-function=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..5cd821cfe780c4 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_sample_cold_function) || + Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..2b3e0047412d3e 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,16 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *SampleColdArg = + Args.getLastArg(options::OPT_fprofile_sample_cold_function, + options::OPT_fprofile_sample_cold_function_EQ)) { +SmallString<128> Path (SampleColdArg->getOption().matches( +options::OPT_fprofile_sample_cold_function_EQ) ? SampleColdArg->getValue(): "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 8f151a99b11709..0514d17db20721 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM( "enable-loop-versioning-licm", cl::init(false), cl::Hidden, cl::desc("Enable the experimental Loop Versioning LICM pass")); +static cl::opt InstrumentSampleColdFuncPath( +"instrument-sample-cold-function-path", cl::init(""), +cl::desc("File path for instrumenting sampling PGO guided cold functions"), +cl::Hidden); + extern cl::opt UseCtxProfile; +extern cl::opt InstrumentColdFunction; namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; @@ -1119,6 +1125,17 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.empty() && + "File path is requeired for ins
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 436ab3bfb3d7f1978ba1598ec7d3ca006706cbd3 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 13 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 17 + .../Instrumentation/PGOInstrumentation.cpp| 14 +++ .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 6 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..0ac289089839d2 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_sample_cold_function : Flag<["-"], "fprofile-sample-cold-function">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_sample_cold_function_EQ : Joined<["-"], "fprofile-sample-cold-function=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions guided by sampling-based profile into (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..5cd821cfe780c4 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_sample_cold_function) || + Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..f4a073d2036d12 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,19 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *SampleColdArg = + Args.getLastArg(options::OPT_fprofile_sample_cold_function, + options::OPT_fprofile_sample_cold_function_EQ)) { +SmallString<128> Path(SampleColdArg->getOption().matches( + options::OPT_fprofile_sample_cold_function_EQ) + ? SampleColdArg->getValue() + : ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp index 8f151a99b11709..0514d17db20721 100644 --- a/llvm/lib/Passes/PassBuilderPipelines.cpp +++ b/llvm/lib/Passes/PassBuilderPipelines.cpp @@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM( "enable-loop-versioning-licm", cl::init(false), cl::Hidden, cl::desc("Enable the experimental Loop Versioning LICM pass")); +static cl::opt InstrumentSampleColdFuncPath( +"instrument-sample-cold-function-path", cl::init(""), +cl::desc("File path for instrumenting sampling PGO guided cold functions"), +cl::Hidden); + extern cl::opt UseCtxProfile; +extern cl::opt InstrumentColdFunction; namespace llvm { extern cl::opt EnableMemProfContextDisambiguation; @@ -1119,6 +1125,17 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm ready_for_review https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/109837 >From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001 From: wlei Date: Sun, 22 Sep 2024 20:23:20 -0700 Subject: [PATCH 1/3] [InstrPGO] Instrument sampling profile based cold function --- clang/include/clang/Driver/Options.td | 6 + clang/lib/Driver/ToolChain.cpp| 4 +++- clang/lib/Driver/ToolChains/Clang.cpp | 18 ++ clang/test/CodeGen/Inputs/pgo-cold-func.prof | 2 ++ .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++ llvm/lib/Passes/PassBuilderPipelines.cpp | 18 ++ .../Instrumentation/PGOInstrumentation.cpp| 16 + .../PGOProfile/instr-gen-cold-function.ll | 24 +++ 8 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 002f60350543d9..6bb92427f2d53f 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, +Group, Visibility<[ClangOption, CLOption]>, +HelpText<"Generate instrumented code to cold functions into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; +def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], "fprofile-generate-cold-function-coverage=">, +Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">, +HelpText<"Generate instrumented code to cold functions into /default.profraw (overridden by LLVM_PROFILE_FILE env var)">; def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">, Group, Visibility<[ClangOption, CLOption]>, HelpText<"Generate instrumented code to collect execution counts into default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env var)">; diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp index 16f9b629fc538c..e56db150c6b47e 100644 --- a/clang/lib/Driver/ToolChain.cpp +++ b/clang/lib/Driver/ToolChain.cpp @@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) { Args.hasArg(options::OPT_fprofile_instr_generate) || Args.hasArg(options::OPT_fprofile_instr_generate_EQ) || Args.hasArg(options::OPT_fcreate_profile) || - Args.hasArg(options::OPT_forder_file_instrumentation); + Args.hasArg(options::OPT_forder_file_instrumentation) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) || + Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ); } bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) { diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 0bab48caf1a5e2..00602a08232ba2 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C, } } + if (auto *ColdFuncCoverageArg = Args.getLastArg( + options::OPT_fprofile_generate_cold_function_coverage, + options::OPT_fprofile_generate_cold_function_coverage_EQ)) { +SmallString<128> Path( +ColdFuncCoverageArg->getOption().matches( +options::OPT_fprofile_generate_cold_function_coverage_EQ) +? ColdFuncCoverageArg->getValue() +: ""); +llvm::sys::path::append(Path, "default_%m.profraw"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back(Args.MakeArgString( +Twine("--instrument-sample-cold-function-path=") + Path)); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--instrument-cold-function-coverage"); +CmdArgs.push_back("-mllvm"); +CmdArgs.push_back("--pgo-function-entry-coverage"); + } + Arg *PGOGenArg = nullptr; if (PGOGenerateArg) { assert(!CSPGOGenerateArg); diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof b/clang/test/CodeGen/Inputs/pgo-cold-func.prof new file mode 100644 index 00..e50be02e0a8545 --- /dev/null +++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof @@ -0,0 +1,2 @@ +foo:1:1 + 1: 1 diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c b/clang/test/CodeGen/pgo-cold-function-coverage.c new file mode 100644 index 00..0d2767c022b9f1 --- /dev/null +++ b/clang/test/CodeGen/pgo-cold-function-coverage.c @@ -0,0 +1,12 @@ +// Test -fprofile-generate-cold-function-coverage +// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : BoolFOption<"debug-info-for-profiling", PosFlag, NegFlag>; +def fprofile_generate_cold_function_coverage : Flag<["-"], "fprofile-generate-cold-function-coverage">, wlei-llvm wrote: >Would -Wl,-lclang_rt.profile work? Got it, I think it should work, except it requires another linker flag: the `--u__llvm_runtime_variable`, we can configure it to linker too, but basically [those instr PGO flags](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChain.cpp#L885-L893) control [addProfileRTLibs](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Linux.cpp#L841-L849) (which seems not equivalent to `-Wl,-lclang_rt.profile`), I just wanted to mirror those flags so that we don't need to maintain if anything changes to `addProfileRTLibs` in the future. (Though I feel this code should be very stable, should not be a big problem, so mainly for the convenience for compiler user to use one flag instead of using/maintaining multiple flags ) Overall, I agree that it's feasible to do it without clang flag. I'm not very familiar with the convention for adding driver flags, so if you think this doesn't justify it, I will drop it from this patch. Thanks for the discussion! https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1119,6 +1125,18 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.empty() && + "File path is requeired for instrumentation generation"); + InstrumentColdFunctionCoverage = true; + addPreInlinerPasses(MPM, Level, Phase); + addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true, +/* IsCS */ false, /* AtomicCounterUpdate */ false, +InstrumentSampleColdFuncPath, "", +IntrusiveRefCntPtr()); +} wlei-llvm wrote: > After looking at this a bit closer, I'm not sure why it needs to be tied to > closely to SamplePGO. Couldn't we move this out of the `LoadSampleProfile` > and move it after `IsPGOInstrUse/IsCtxProfUse`? That way we can use IRPGO, > CSIRPGO, and SamplePGO profile counts to block instrumentation hot functions. Ah, good point, moved them closer to other IRPGO passes. Thanks for the suggestion! https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -0,0 +1,24 @@ +; RUN: opt < %s --passes=pgo-instr-gen -instrument-cold-function-coverage -S | FileCheck --check-prefixes=COLD %s +; RUN: opt < %s --passes=pgo-instr-gen -instrument-cold-function-coverage -cold-function-coverage-max-entry-count=1 -S | FileCheck --check-prefixes=ENTRY-COUNT %s + +; COLD: call void @llvm.instrprof.increment(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) wlei-llvm wrote: Sounds good!(it should still work without `-pgo-function-entry-coverage`, but agree to test the most common case) https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)
@@ -1119,6 +1125,18 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, // removed. MPM.addPass( PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); + +if (InstrumentSampleColdFuncPath.getNumOccurrences() && +Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { + assert(!InstrumentSampleColdFuncPath.empty() && + "File path is requeired for instrumentation generation"); + InstrumentColdFunctionCoverage = true; + addPreInlinerPasses(MPM, Level, Phase); + addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true, +/* IsCS */ false, /* AtomicCounterUpdate */ false, +InstrumentSampleColdFuncPath, "", +IntrusiveRefCntPtr()); +} wlei-llvm wrote: do you mean the `sample` in `InstrumentSampleColdFuncPath`flag or all other variables(like `IsSampleColdFuncCovGen`)? I thought something like: ``` if (IsSampleColdFuncCovGen || IsXXXColdFuncCovGen) { addPGOInstrPasses(.., .., InstrumentColdFuncCoveragePath, .. ) } ``` `InstrumentColdFuncCoveragePath` would be a general flag that can be used for all cold function coverage case. but currently `IsSampleColdFuncCovGen` only represents for sampling PGO cold func. And If we want to extend it in future, then we can add more bool flag IsXXXColdFuncCovGen... I also added an assertion: ``` assert((InstrumentColdFuncCoveragePath.empty() || LoadSampleProfile) && "Sampling-based cold functon coverage is not supported without " "providing sampling PGO profile"); ``` Seems I have to remove that if we want it to be general. (just curious) do you plan to extend it for your case? https://github.com/llvm/llvm-project/pull/109837 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] Reapply "[Clang][Sema] Always use latest redeclaration of primary template" (PR #114569)
wlei-llvm wrote: Hi, here is the reduced repro (not perfect, but was as far as I can get), thanks! ``` namespace std { template _Tp __declval(long); template auto declval() noexcept -> decltype(__declval<_Tp>(0)); template class __is_convertible_helper { template()))> static int __test; typedef decltype(__test<_From, _To>0) type; }; template struct is_convertible : public __is_convertible_helper<_From, _To>::type ; template inline constexpr bool is_same_v = __is_same(_Tp, _Up); } enum _Lock_policy { _S_single, _S_mutex, _S_atomic }; static const _Lock_policy __default_lock_policy = _S_atomic; template class __shared_ptr; template struct __sp_compatible_with : is_convertible<_Yp*, _Tp*>::type ; template class __shared_ptr { using _SafeConv = typename enable_if::type; template using _Compatible = typename enable_if__sp_compatible_with<_Yp*, _Tp*>::value, _Res::type; public: __shared_ptr0 noexcept : _M_ptr0, _M_refcount0 __shared_ptr0 noexcept = default; __shared_ptr& operator=0 noexcept = default; ~__shared_ptr0 = default; template> __shared_ptr(const __shared_ptr<_Yp, _Lp>& __r) noexcept }; namespace storage { class S ; } namespace storage template concept StatState = std::is_same_v; namespace storage template class Tracker; template std::shared_ptr> createTracker() ; template class Tracker { public: template std::shared_ptr> friend createTracker(); }; namespace storage class Pool { public: Pool() { g = createTracker(); } private: std::shared_ptr g; }; void create0 () { auto tracker = createTracker(); } ``` please ignore the irrelevant syntax error, the real error is : ``` clang++ -ferror-limit= -std=gnu++20 -c test.cpp clang++: /home/wlei/local/upstream/llvm-project/clang/lib/AST/ExprConstant.cpp:16601: bool clang::Expr::EvaluateAsConstantExpr(EvalResult &, const ASTContext &, ConstantExprKind) const: Assertion `!isValueDependent() && "Expression evaluator can't be called on a dependent expression."' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /home/wlei/local/upstream/llvm-release/bin/clang++ -ferror-limit= -std=gnu++20 -c test.cpp 1. test.cpp:78:32: current parser token ')' 2. test.cpp:77:18: parsing function body 'create0' 3. test.cpp:77:18: in compound statement ('{}') #0 0x7fcf547fb128 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13 #1 0x7fcf547f9040 llvm::sys::RunSignalHandlers() /home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18 #2 0x7fcf5473c7b6 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5 #3 0x7fcf5473c7b6 CrashRecoverySignalHandler(int) /home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51 #4 0x7fcf53e3e730 __restore_rt (/lib64/libc.so.6+0x3e730) #5 0x7fcf53e8bacc __pthread_kill_implementation (/lib64/libc.so.6+0x8bacc) #6 0x7fcf53e3e686 gsignal (/lib64/libc.so.6+0x3e686) #7 0x7fcf53e28833 abort (/lib64/libc.so.6+0x28833) #8 0x7fcf53e2875b _nl_load_domain.cold (/lib64/libc.so.6+0x2875b) #9 0x7fcf53e373c6 (/lib64/libc.so.6+0x373c6) #10 0x7fcf523da5b1 clang::Expr::EvaluateAsConstantExpr(clang::Expr::EvalResult&, clang::ASTContext const&, clang::Expr::ConstantExprKind) const /home/wlei/local/upstream/llvm-project/clang/lib/AST/ExprConstant.cpp:0:0 #11 0x7fcf50f9b41d clang::ActionResult calculateConstraintSatisfaction(clang::Sema&, clang::Expr const*, clang::ConstraintSatisfaction&, calculateConstraintSatisfaction(clang::Sema&, clang::NamedDecl const*, clang::SourceLocation, clang::MultiLevelTemplateArgumentList const&, clang::Expr const*, clang::ConstraintSatisfaction&)::ConstraintEvaluator const&) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaConcept.cpp:389:71 #12 0x7fcf50f941eb clang::ActionResult::isInvalid() const /home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/Ownership.h:199:41 #13 0x7fcf50f941eb CheckConstraintSatisfaction(clang::Sema&, clang::NamedDecl const*, llvm::ArrayRef, llvm::SmallVectorImpl&, clang::MultiLevelTemplateArgumentList const&, clang::SourceRange, clang::ConstraintSatisfaction&) /home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaConcept.cpp:600:13 #14 0x7fcf50f93f6a clang::Sema::CheckConstraintSatisfaction(clang::NamedDecl const*, llvm::ArrayRef, llvm::SmallVectorImpl&, clang::MultiLevelTemplateArgumentList const&, clang::SourceRange, clang::ConstraintSatisfaction&) /home/wlei/local/upstream/llvm-project/clang/li
[clang] Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585, #111173)" (PR #111852)
wlei-llvm wrote: Hi @sdkrystian , FYI, we also hit an error/assertion that bisected to this https://github.com/llvm/llvm-project/pull/114569 ``` lvm-project/clang/lib/AST/ExprConstant.cpp:16601: bool clang::Expr::EvaluateAsConstantExpr(EvalResult &, const ASTContext &, ConstantExprKind) const: Assertion `!isValueDependent() && "Expression evaluator can't be called on a dependent expression."' failed. ``` I'm trying working on reducing the repro(it seems very slow). https://github.com/llvm/llvm-project/pull/111852 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)
wlei-llvm wrote: > > > > > > Ping @wlei-llvm > > > > > > > > > > > > > > > Sorry for the delay. The new version addressed my last comment (with > > > > > just minor nits). However, I didn't fully follow the new features > > > > > related to `ProbeMatchSpecs` stuffs. Could you add more descriptions > > > > > to the diff summary? Or if it’s not a lot of work, could we split it > > > > > into two patches? We could commit the first part, and I will review > > > > > the second part separately. > > > > > > > > > > > > NVM, I think now I get what `ProbeMatchSpecs` does, it's a vector > > > > because a function can have multiple sections(function split) > > > > > > > > > Thank you for reviewing and sorry for the delay from my end, was busy > > > with profile quality work. > > > ProbeMatchSpecs is a mechanism to match probes belonging to another > > > binary function. I'm going to utilize it in probe-based function matching > > > (#100446). For example: > > > source function: > > > ``` > > > void foo() { > > > bar(); > > > } > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > profiled binary: bar is not inlined => have top-level function bar new > > > binary (where the profile is applied to): bar is inlined into foo. > > > Right now, BOLT does 1:1 matching between profile functions and binary > > > functions based on the name. #100446 will extend this to N:M where > > > multiple profiles can be matched to one binary function (as in the > > > example above where binary function foo would use profiles for foo and > > > bar), and one profile can be matched to multiple binary functions (eg if > > > bar was inlined into multiple functions). > > > In this diff, ProbeMatchSpecs would only have one BinaryFunctionProfile > > > (existing name-based matching). > > > > > > Thanks for the explanation! I was confused of the use of `ProbeMatchSpecs`, > > it would be great to add more descriptions in the diff summary or somewhere > > in the comments, or in the another patch when it's used(IMO if we add a > > feature, but we doesn't use it in the patch, it should be in the future > > patch when it's used) > > Thank you. Added description of ProbeMatchSpecs to the summary. I decided to > introduce it in this diff because there's tight coupling between probe-based > block matching and function matching. This way, probe-based function matching > would not need to change how block matching works. Makes sense, thanks for clarifying! https://github.com/llvm/llvm-project/pull/99891 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)
wlei-llvm wrote: > > > > Ping @wlei-llvm > > > > > > > > > Sorry for the delay. The new version addressed my last comment (with just > > > minor nits). However, I didn't fully follow the new features related to > > > `ProbeMatchSpecs` stuffs. Could you add more descriptions to the diff > > > summary? Or if it’s not a lot of work, could we split it into two > > > patches? We could commit the first part, and I will review the second > > > part separately. > > > > > > NVM, I think now I get what `ProbeMatchSpecs` does, it's a vector because a > > function can have multiple sections(function split) > > Thank you for reviewing and sorry for the delay from my end, was busy with > profile quality work. > > ProbeMatchSpecs is a mechanism to match probes belonging to another binary > function. I'm going to utilize it in probe-based function matching (#100446). > For example: > > source function: > > ``` > void foo() { > bar(); > } > ``` > > profiled binary: bar is not inlined => have top-level function bar new binary > (where the profile is applied to): bar is inlined into foo. > > Right now, BOLT does 1:1 matching between profile functions and binary > functions based on the name. #100446 will extend this to N:M where multiple > profiles can be matched to one binary function (as in the example above where > binary function foo would use profiles for foo and bar), and one profile can > be matched to multiple binary functions (eg if bar was inlined into multiple > functions). > > In this diff, ProbeMatchSpecs would only have one BinaryFunctionProfile > (existing name-based matching). Thanks for the explanation! I was confused of the use of `ProbeMatchSpecs`, it would be great to add more descriptions in the diff summary or somewhere in the comments, or in the another patch when it's used(IMO if we add a feature, but we doesn't use it in the patch, it should be in the future patch when it's used) https://github.com/llvm/llvm-project/pull/99891 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)
https://github.com/wlei-llvm approved this pull request. LGTM, thanks for addressing the comments! https://github.com/llvm/llvm-project/pull/99891 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm created https://github.com/llvm/llvm-project/pull/136333 Add a new instrumentation section type `[coldcov]` to support`-fprofile-list` for cold function coverage. Note that the current cold function coverage is based on sampling PGO pipeline, which is incompatible with the existing [llvm] option(see [PGOOptions](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L27-L43)), so we can't reuse the IR-PGO(-fprofile-instrument=llvm) flag. >From 3d770595ce6568148ddfe2596f1a53dfc2ee751f Mon Sep 17 00:00:00 2001 From: wlei Date: Fri, 18 Apr 2025 10:37:30 -0700 Subject: [PATCH] [Inverge] Support -fprofile-list for cold function coveragege --- clang/docs/UsersManual.rst | 6 +++--- clang/include/clang/Basic/CodeGenOptions.def | 2 +- clang/include/clang/Basic/CodeGenOptions.h | 2 ++ clang/include/clang/Driver/Options.td | 4 ++-- clang/lib/Basic/ProfileList.cpp| 2 ++ clang/lib/Driver/ToolChains/Clang.cpp | 1 + clang/test/CodeGen/profile-filter.c| 7 +++ .../test/Driver/fprofile-generate-cold-function-coverage.c | 2 +- 8 files changed, 19 insertions(+), 7 deletions(-) diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index 69256527f40c9..47bd591d20a27 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -3394,9 +3394,9 @@ This can be done using the ``-fprofile-list`` option. $ clang++ -O2 -fprofile-instr-generate -fcoverage-mapping -fprofile-list=fun.list -fprofile-list=code.list code.cc -o code -Supported sections are ``[clang]``, ``[llvm]``, and ``[csllvm]`` representing -clang PGO, IRPGO, and CSIRPGO, respectively. Supported prefixes are ``function`` -and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``. +Supported sections are ``[clang]``, ``[llvm]``, ``[csllvm]``, and ``[coldcov]`` representing +clang PGO, IRPGO, CSIRPGO and cold function coverage, respectively. Supported prefixes are +``function`` and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``. ``skip`` adds the ``skipprofile`` attribute while ``forbid`` adds the ``noprofile`` attribute to the appropriate function. Use ``default:`` to specify the default category. diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index c5990fb248689..fced5d7dcf6b7 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,7 +223,7 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..60001cfc62218 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -86,6 +86,8 @@ class CodeGenOptions : public CodeGenOptionsBase { // to use with PGO. ProfileIRInstr,// IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +ProfileIRColdCov, // IR level cold function coverage instrumentation in + // LLVM. }; enum EmbedBitcodeKind { diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 830d3459a1320..a76f90efc4a51 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -7679,9 +7679,9 @@ def fpatchable_function_entry_section_EQ HelpText<"Use Section instead of __patchable_function_entries">, MarshallingInfoString>; def fprofile_instrument_EQ : Joined<["-"], "fprofile-instrument=">, -HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm">, +HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm,coldcov">, NormalizedValuesScope<"CodeGenOptions">, -NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", "ProfileCSIRInstr"]>, +NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", "ProfileCSIRInstr", "ProfileIRColdCov"]>, MarshallingInfoEnum, "ProfileNone">; def fprofile_instrument_path_EQ : Joined<["-"], "fprofile-instrument-path=">, HelpText<"Generate instrumente
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
wlei-llvm wrote: ping:) https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/136333 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm edited https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/136333 >From 3d770595ce6568148ddfe2596f1a53dfc2ee751f Mon Sep 17 00:00:00 2001 From: wlei Date: Fri, 18 Apr 2025 10:37:30 -0700 Subject: [PATCH 1/2] [Inverge] Support -fprofile-list for cold function coveragege --- clang/docs/UsersManual.rst | 6 +++--- clang/include/clang/Basic/CodeGenOptions.def | 2 +- clang/include/clang/Basic/CodeGenOptions.h | 2 ++ clang/include/clang/Driver/Options.td | 4 ++-- clang/lib/Basic/ProfileList.cpp| 2 ++ clang/lib/Driver/ToolChains/Clang.cpp | 1 + clang/test/CodeGen/profile-filter.c| 7 +++ .../test/Driver/fprofile-generate-cold-function-coverage.c | 2 +- 8 files changed, 19 insertions(+), 7 deletions(-) diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst index 69256527f40c9..47bd591d20a27 100644 --- a/clang/docs/UsersManual.rst +++ b/clang/docs/UsersManual.rst @@ -3394,9 +3394,9 @@ This can be done using the ``-fprofile-list`` option. $ clang++ -O2 -fprofile-instr-generate -fcoverage-mapping -fprofile-list=fun.list -fprofile-list=code.list code.cc -o code -Supported sections are ``[clang]``, ``[llvm]``, and ``[csllvm]`` representing -clang PGO, IRPGO, and CSIRPGO, respectively. Supported prefixes are ``function`` -and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``. +Supported sections are ``[clang]``, ``[llvm]``, ``[csllvm]``, and ``[coldcov]`` representing +clang PGO, IRPGO, CSIRPGO and cold function coverage, respectively. Supported prefixes are +``function`` and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``. ``skip`` adds the ``skipprofile`` attribute while ``forbid`` adds the ``noprofile`` attribute to the appropriate function. Use ``default:`` to specify the default category. diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def index c5990fb248689..fced5d7dcf6b7 100644 --- a/clang/include/clang/Basic/CodeGenOptions.def +++ b/clang/include/clang/Basic/CodeGenOptions.def @@ -223,7 +223,7 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os (==1) or -Oz (==2) is CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation profiling /// Choose profile instrumenation kind or no instrumentation. -ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone) +ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone) /// Choose profile kind for PGO use compilation. ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone) /// Partition functions into N groups and select only functions in group i to be diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h index e39a73bdb13ac..60001cfc62218 100644 --- a/clang/include/clang/Basic/CodeGenOptions.h +++ b/clang/include/clang/Basic/CodeGenOptions.h @@ -86,6 +86,8 @@ class CodeGenOptions : public CodeGenOptionsBase { // to use with PGO. ProfileIRInstr,// IR level PGO instrumentation in LLVM. ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM. +ProfileIRColdCov, // IR level cold function coverage instrumentation in + // LLVM. }; enum EmbedBitcodeKind { diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 830d3459a1320..a76f90efc4a51 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -7679,9 +7679,9 @@ def fpatchable_function_entry_section_EQ HelpText<"Use Section instead of __patchable_function_entries">, MarshallingInfoString>; def fprofile_instrument_EQ : Joined<["-"], "fprofile-instrument=">, -HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm">, +HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm,coldcov">, NormalizedValuesScope<"CodeGenOptions">, -NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", "ProfileCSIRInstr"]>, +NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", "ProfileCSIRInstr", "ProfileIRColdCov"]>, MarshallingInfoEnum, "ProfileNone">; def fprofile_instrument_path_EQ : Joined<["-"], "fprofile-instrument-path=">, HelpText<"Generate instrumented code to collect execution counts into " diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp index 8fa16e2eb069a..0991867e1ac3a 100644 --- a/clang/lib/Basic/ProfileList.cpp +++ b/clang/lib/Basic/ProfileList.cpp @@ -81,6 +81,8 @@ static StringRef getSectionName(CodeGenOptions::ProfileInstrKind Kind) { return "llvm"; case CodeGenOptions::ProfileCSIRInstr: return "csllvm"; +
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm updated https://github.com/llvm/llvm-project/pull/136333 Rate limit · GitHub body { background-color: #f6f8fa; color: #24292e; font-family: -apple-system,BlinkMacSystemFont,Segoe UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol; font-size: 14px; line-height: 1.5; margin: 0; } .container { margin: 50px auto; max-width: 600px; text-align: center; padding: 0 24px; } a { color: #0366d6; text-decoration: none; } a:hover { text-decoration: underline; } h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; text-shadow: 0 1px 0 #fff; } p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; } ul { list-style: none; margin: 25px 0; padding: 0; } li { display: table-cell; font-weight: bold; width: 1%; } .logo { display: inline-block; margin-top: 35px; } .logo-img-2x { display: none; } @media only screen and (-webkit-min-device-pixel-ratio: 2), only screen and ( min--moz-device-pixel-ratio: 2), only screen and ( -o-min-device-pixel-ratio: 2/1), only screen and (min-device-pixel-ratio: 2), only screen and (min-resolution: 192dpi), only screen and (min-resolution: 2dppx) { .logo-img-1x { display: none; } .logo-img-2x { display: inline-block; } } #suggestions { margin-top: 35px; color: #ccc; } #suggestions a { color: #66; font-weight: 200; font-size: 14px; margin: 0 10px; } Whoa there! You have exceeded a secondary rate limit. Please wait a few minutes before you try again; in some cases this may take up to an hour. https://support.github.com/contact";>Contact Support — https://githubstatus.com";>GitHub Status — https://twitter.com/githubstatus";>@githubstatus ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
wlei-llvm wrote: > LGTM, but I'd like to bikeshed the name a bit. Since this is only used for > the Sample PGO pipeline, should we include Sample PGO in the name? Sounds good, I will go with `[sample-coldcov]` if that works for you. https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
@@ -9,6 +9,9 @@ // RUN: echo -e "[clang]\nfun:test1\n[llvm]\nfun:test2" > %t-section.list // RUN: %clang_cc1 -fprofile-instrument=llvm -fprofile-list=%t-section.list -emit-llvm %s -o - | FileCheck %s --check-prefix=SECTION +// RUN: echo -e "[coldcov]\nfun:test*\n!fun:test2" > %t-cold-func.list +// RUN: %clang_cc1 -fprofile-instrument=coldcov -fprofile-list=%t-cold-func.list -emit-llvm %s -o - | FileCheck %s --check-prefix=COLDCOV wlei-llvm wrote: There is also a explicit check noprofile ("// COLDCOV: noprofile"), so I guess `implicit-check-not` doesn't work here. https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
@@ -9,6 +9,9 @@ // RUN: echo -e "[clang]\nfun:test1\n[llvm]\nfun:test2" > %t-section.list // RUN: %clang_cc1 -fprofile-instrument=llvm -fprofile-list=%t-section.list -emit-llvm %s -o - | FileCheck %s --check-prefix=SECTION +// RUN: echo -e "[coldcov]\nfun:test*\n!fun:test2" > %t-cold-func.list +// RUN: %clang_cc1 -fprofile-instrument=coldcov -fprofile-list=%t-cold-func.list -emit-llvm %s -o - | FileCheck %s --check-prefix=COLDCOV + wlei-llvm wrote: improved the test with `split-file`, thanks! https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)
https://github.com/wlei-llvm closed https://github.com/llvm/llvm-project/pull/136333 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits