[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2023-12-11 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/75092

Before all the call probe ids are after block ids, in this change, it mixed the 
call probe and block probe by reordering them in lexical(line-number) order.  
For example:
```
main():
BB1
if(...) 
  BB2 foo(..);   
else 
  BB3 bar(...);
BB4
```
Before the profile is
```
main
 1: ..
 2: ..
 3: ...
 4: ...
 5: foo ...
 6: bar ...
 ```
 Now the new order is
```
 main
 1: ..
 2: ..
 3: foo ...
 4: ...
 5: bar ...
 6: ...
```
This can potentially make it more tolerant of profile mismatch, either from 
stale profile or frontend change. e.g. before if we add one block, even the 
block is the last one, all the call probes are shifted and mismatched. 
Moreover, this makes better use of call-anchor based stale profile matching. 
Blocks are matched based on the closest anchor, there would be more anchors 
used for the matching, reduce the mismatch scope.



>From 90c3c829e67492e92b86517e92a49c6f90e83b4f Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 16 +++---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 56 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02b..360f831e842945 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..69b87adf105fd1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 8f0b12d0cfedfc..e2c5eaf261d298 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -209,7 +208,7 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -218,18 +217,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
-
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
 
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
ind

[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2023-12-11 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/75092

>From fe05f64f7ae0decc17d98bb694bbade178df7a4b Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 59 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02..360f831e84294 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de..69b87adf105fd 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 8f0b12d0cfedf..6c6bb18bcb3c9 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
+  LLVMContext &Ctx = F->getContext();
+  Module *M = F->getParent();
+
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
 
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
-
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
index ba4c6117dc96a..d3847946b9403 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
- 4: 13
- 5: 7 _Z3barv:2 _Z3foov:5
- 6: 6 _Z3barv:4 _Z3foov:2
+ 4: 6
+ 6: 13
+ 3: 7 _Z3barv:2 _Z3foov:5
+ 5: 6 _Z3barv:4 _Z3foov:2
  !CFGChecksum: 563022570642068
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
index 62f9bd5992e73..213bf0b6f81cc 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
-

[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-04 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> We will need some protection against accidentally consuming old profile with 
> new toolchain and vice versa. The cost of investigating mysterious perf 
> regression can be high and we'd rather simply error out in those cases. Maybe 
> consider some flag for `SecProfSummaryFlags` 
> 
Agreed with this concern. To do this, we probably also need a flag in the 
binary, because otherwise if we use the new toolchain for prof-gen but the 
binary built on the old toolchain, we then would generate a profile with this 
flag on but the order is the old one.  My understanding is the (probe) version 
of profile should line up with the version of the binary. Maybe make this more 
general, we can introduce a "pseudo_probe_version" thing, similar to instrPGO 
https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProf.h#L996
 . It can be useful if we want to update other pseudo probe formats in future.  

> (also wondering if this needs to be per-function flag since technically some 
> function/module can be compiled with old order while others with new order).

This seems very complicated, supposing you meant it's mixed compiled case(some 
objects files are built on old toolchain and some are on new toolchain), then 
in profile generation time, it's hard to know which order the function uses. 
That requires some function level version thing.. maybe add version to 
`pseudo_probe_desc` which is function-level and encode the version.   

> Can we gauge perf benefit of mixed order encoding? i.e. profile an old source 
> rev (several weeks ago?) with both probe order, and build with new source rev 
> using that old profile, then see how old vs new order compare in terms of 
> perf and efficacy with incremental PGO.

Yes, I saw some perf wins using the new order, it's not as easily measurable as 
the call anchor, as you said, I had to use a old rev to run profile collection 
using the new order. I will show you the results offline.


https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-10 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/75092

>From ccfee2d0c5399b03b53ef79a4645ab0d10efeafd Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH 1/2] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 59 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02b..360f831e842945 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..69b87adf105fd1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 8f0b12d0cfedfc..6c6bb18bcb3c9d 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
+  LLVMContext &Ctx = F->getContext();
+  Module *M = F->getParent();
+
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
 
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
-
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
index ba4c6117dc96ab..d3847946b94033 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
- 4: 13
- 5: 7 _Z3barv:2 _Z3foov:5
- 6: 6 _Z3barv:4 _Z3foov:2
+ 4: 6
+ 6: 13
+ 3: 7 _Z3barv:2 _Z3foov:5
+ 5: 6 _Z3barv:4 _Z3foov:2
  !CFGChecksum: 563022570642068
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
index 62f9bd5992e735..213bf0b6f81cc4 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
 

[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-10 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> > Agreed with this concern. To do this, we probably also need a flag in the 
> > binary, because otherwise if we use the new toolchain for prof-gen but the 
> > binary built on the old toolchain, we then would generate a profile with 
> > this flag on but the order is the old one. My understanding is the (probe) 
> > version of profile should line up with the version of the binary. Maybe 
> > make this more general, we can introduce a "pseudo_probe_version" thing, 
> > similar to instrPGO 
> > https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ProfileData/InstrProf.h#L996
> >  . It can be useful if we want to update other pseudo probe formats in 
> > future.
> 
> Versioning is one way to do this. Alternatively, we should be able to detect 
> the case during llvm-profgen time when all call site probe and block probe 
> ids are separated, which can then be used to set flags?

Sounds good. 

Add a new function(`checkProbeIDIsInMixedOrder`) to check the old order in 
profile generation time.
Always write a SecProfSummaryFlags flag, if the function return true. 

https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)

2024-03-07 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Hi:
We hit a crash/assertion, and bisected to this. Here is the stack dump:

```
clang++: 
/home/wlei/local/upstream/llvm-project/llvm/include/llvm/Support/Casting.h:578: 
decltype(auto) llvm::cast(From *) [To = clang::CXXRecordDecl, From = 
clang::DeclContext]: Assertion `isa(Val) && "cast() argument of 
incompatible type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and 
include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.  Program arguments: /home/wlei/local/upstream/llvm-build/bin/clang++ 
@/tmp/fbcc.xxbvookk/compiler.argsfile
1.  fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:633:55: current 
parser token ')'
2.  fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:26:1: parsing 
namespace 'facebook'
3.  fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:622:29: parsing 
function body 
'facebook::multifeed::ranking::GFIHCountsMap::getTotalAggregationIfConfigured'
4.  fbcode/multifeed/ranking/user_profile/GFIHCountsMap.cpp:622:29: in 
compound statement ('{}')
 #0 0x7f25c2bf9588 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x7f25c2bf7630 llvm::sys::RunSignalHandlers() 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x7f25c2bf8c41 llvm::sys::CleanupOnSignal(unsigned long) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
 #3 0x7f25c2b3f908 (anonymous 
namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #4 0x7f25c2b3f908 CrashRecoverySignalHandler(int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51
 #5 0x7f25c7612d20 __restore_rt (/lib64/libpthread.so.0+0x12d20)
 #6 0x7f25c1a4e52f raise (/lib64/libc.so.6+0x4e52f)
 #7 0x7f25c1a21e65 abort (/lib64/libc.so.6+0x21e65)
 #8 0x7f25c1a21d39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39)
 #9 0x7f25c1a46e86 (/lib64/libc.so.6+0x46e86)
#10 0x7f25bf3a934c FindDeclaringClass(clang::NamedDecl*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:0:0
#11 0x7f25bf3a9265 (anonymous namespace)::AccessTarget::initialize() 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:267:24
#12 0x7f25bf3a686c clang::sema::AccessedEntity::isQuiet() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:75:50
#13 0x7f25bf3a686c clang::sema::AccessedEntity::setDiag(unsigned int) 
/home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:104:5
#14 0x7f25bf3a686c 
clang::Sema::CheckUnresolvedLookupAccess(clang::UnresolvedLookupExpr*, 
clang::DeclAccessPair) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:1567:10
#15 0x7f25bfa98e83 clang::DeclarationNameInfo::getLoc() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/AST/DeclarationName.h:797:42
#16 0x7f25bfa98e83 clang::OverloadExpr::getNameLoc() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/AST/ExprCXX.h:3078:55
#17 0x7f25bfa98e83 FinishOverloadedCallExpr(clang::Sema&, clang::Scope*, 
clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, 
llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, 
clang::OverloadCandidateSet*, clang::OverloadCandidate**, 
clang::OverloadingResult, bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14062:47
#18 0x7f25bfa98c8f clang::Sema::BuildOverloadedCallExpr(clang::Scope*, 
clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, 
llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, 
bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14200:10
#19 0x7f25bf70511d clang::Sema::BuildCallExpr(clang::Scope*, clang::Expr*, 
clang::SourceLocation, llvm::MutableArrayRef, 
clang::SourceLocation, clang::Expr*, bool, bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7276:16
#20 0x7f25bf720e6b clang::Sema::ActOnCallExpr(clang::Scope*, clang::Expr*, 
clang::SourceLocation, llvm::MutableArrayRef, 
clang::SourceLocation, clang::Expr*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7167:7
#21 0x7f25c2d59dca 
clang::Parser::ParsePostfixExpressionSuffix(clang::ActionResult) 
/home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:2181:23
#22 0x7f25c2d5af8a 
clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, bool&, 
clang::Parser::TypeCastState, bool, bool*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:1890:7
#23 0x7f25c2d56e2b 
clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, 
clang::Parser::TypeCastState, bool, bool*) 
/home/wlei/l

[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)

2024-03-09 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Here is the repro from our side. It's reduced by creduce, there are some 
unrelated error, the assertion is the real issue. 

```
typedef a;
template < typename b, b c > struct d {
  static constexpr b e = c;
};
typedef d< bool, true > f;
typedef d< bool, false > g;
template < bool, typename, typename > struct aa;
template < typename... > struct h;
template < typename i, typename j, typename ab, typename... k >
struct h< i, j, ab, k... >
: aa< i::e, i, h<> >::l {};
template < typename > struct ad : g {};
template < typename b >
struct ae : ad<  b  >{};
template < typename ai, typename aj,
   bool = h< ae< ai >,  aj ,
  aj  >::e >
class ak {
  template < typename >
  static f al(int);
public:
  typedef decltype(al< aj >(0)) l;
};
template < typename ai, typename aj >
struct am : ak< ai, aj >::l {};
template < bool , typename az, typename >
struct aa {
  typedef az l;
}template < typename ai, typename aj >
constexpr bool bj = am< ai, aj >::e;
template < typename bx, typename by >
concept bz =
bj< bx , by >;
  template < typename... > class cy
template < typename m >
concept n =
bz<  m , m > 
 template < typename b >
concept o =
n<  b  >
template < a , typename > struct tuple_element;
template < typename p, typename r >
class cy< p, r > {};
template < a q, typename... t >
__tuple_element_t< q, cy<> > get( cy< t... > ) ;
class s ;
template < class w > class u {
  w begin() ;
};
template < typename x > class v {
  using y = x
   ;
  using dp = decltype(static_cast< y (*)() >(
  nullptr)()[{}])
  ;
  dp operator*()
}
template < typename x >
struct dr {
  using ds = x::dt;
  using dp = u< ds >;
}template < o x > class du {
  using   dr< x >
  ::dp;
}
enum class dy
class dz 
template < typename eb > struct ec : eb ;
template < typename... ed >
struct ee
 :ec< cy< ed... > > {};
template < o... ej > class ek {
public:
  using el = a;
  using dw =
  ee<  ej ... >;
  using dt = v< ek >
   ;
  dw operator[](el ) ;
};
class  {
  using eo =
   s ;
  using ep =
  ek< eo,  dz  >;
  using eq = du< ep >::dp;
  eq get(dy cat) {
  auto es = get(cat)
  auto et = es.begin();
  get< 0 >(*et)
```

https://github.com/llvm/llvm-project/pull/83842
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang][Sema] Fix crash when using name of UnresolvedUsingValueDecl with template arguments (PR #83842)

2024-03-09 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

```
clang++ -std=gnu++20  test.cpp

clang-18: /../llvm-project/llvm/include/llvm/Support/Casting.h:578: 
decltype(auto) llvm::cast(From *) [To = clang::CXXRecordDecl, From = 
clang::DeclContext]: Assertion `isa(Val) && "cast() argument of 
incompatible type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and 
include the crash backtrace, preprocessed source, and associated run script.

1.  test.bk.10.cpp:90:15: current parser token ')'
2.  test.bk.10.cpp:81:1: parsing struct/union/class body '(unnamed class at 
test.bk.10.cpp:81:1)'
3.  test.bk.10.cpp:87:18: parsing function body '(anonymous class)::get'
4.  test.bk.10.cpp:87:18: in compound statement ('{}')
 #0 0x7fb54bdf9588 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x7fb54bdf7630 llvm::sys::RunSignalHandlers() 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x7fb54bdf9c5d SignalHandler(int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x7fb550812d20 __restore_rt (/lib64/libpthread.so.0+0x12d20)
 #4 0x7fb54ac4e52f raise (/lib64/libc.so.6+0x4e52f)
 #5 0x7fb54ac21e65 abort (/lib64/libc.so.6+0x21e65)
 #6 0x7fb54ac21d39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39)
 #7 0x7fb54ac46e86 (/lib64/libc.so.6+0x46e86)
 #8 0x7fb5485a934c FindDeclaringClass(clang::NamedDecl*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:0:0
 #9 0x7fb5485a9265 (anonymous namespace)::AccessTarget::initialize() 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:267:24
#10 0x7fb5485a686c clang::sema::AccessedEntity::isQuiet() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:75:50
#11 0x7fb5485a686c clang::sema::AccessedEntity::setDiag(unsigned int) 
/home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/DelayedDiagnostic.h:104:5
#12 0x7fb5485a686c 
clang::Sema::CheckUnresolvedLookupAccess(clang::UnresolvedLookupExpr*, 
clang::DeclAccessPair) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaAccess.cpp:1567:10
#13 0x7fb548c98e83 clang::DeclarationNameInfo::getLoc() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/AST/DeclarationName.h:797:42
#14 0x7fb548c98e83 clang::OverloadExpr::getNameLoc() const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/AST/ExprCXX.h:3078:55
#15 0x7fb548c98e83 FinishOverloadedCallExpr(clang::Sema&, clang::Scope*, 
clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, 
llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, 
clang::OverloadCandidateSet*, clang::OverloadCandidate**, 
clang::OverloadingResult, bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14062:47
#16 0x7fb548c98c8f clang::Sema::BuildOverloadedCallExpr(clang::Scope*, 
clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, 
llvm::MutableArrayRef, clang::SourceLocation, clang::Expr*, bool, 
bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaOverload.cpp:14200:10
#17 0x7fb54890511d clang::Sema::BuildCallExpr(clang::Scope*, clang::Expr*, 
clang::SourceLocation, llvm::MutableArrayRef, 
clang::SourceLocation, clang::Expr*, bool, bool) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7276:16
#18 0x7fb548920e6b clang::Sema::ActOnCallExpr(clang::Scope*, clang::Expr*, 
clang::SourceLocation, llvm::MutableArrayRef, 
clang::SourceLocation, clang::Expr*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaExpr.cpp:7167:7
#19 0x7fb54bf59dca 
clang::Parser::ParsePostfixExpressionSuffix(clang::ActionResult) 
/home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:2181:23
#20 0x7fb54bf5af8a 
clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, bool&, 
clang::Parser::TypeCastState, bool, bool*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:1890:7
#21 0x7fb54bf56e2b 
clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, 
clang::Parser::TypeCastState, bool, bool*) 
/home/wlei/local/upstream/llvm-project/clang/lib/Parse/ParseExpr.cpp:693:20
#22 0x7fb54bf56e2b clang::Parser:
```


https://github.com/llvm/llvm-project/pull/83842
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-28 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/75092

>From ccfee2d0c5399b03b53ef79a4645ab0d10efeafd Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH 1/3] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 59 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02b..360f831e842945 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..69b87adf105fd1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 8f0b12d0cfedfc..6c6bb18bcb3c9d 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -209,7 +208,10 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
+  LLVMContext &Ctx = F->getContext();
+  Module *M = F->getParent();
+
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -218,18 +220,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
 
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
-
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
index ba4c6117dc96ab..d3847946b94033 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
- 4: 13
- 5: 7 _Z3barv:2 _Z3foov:5
- 6: 6 _Z3barv:4 _Z3foov:2
+ 4: 6
+ 6: 13
+ 3: 7 _Z3barv:2 _Z3foov:5
+ 5: 6 _Z3barv:4 _Z3foov:2
  !CFGChecksum: 563022570642068
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
index 62f9bd5992e735..213bf0b6f81cc4 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
 

[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-28 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

As discussed offline, reverted the complicated order detection change.

https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/75092

>From e6f735a336f87659ee3236fc795ad4b7107dff4d Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH 1/4] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 59 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02b..360f831e842945 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..69b87adf105fd1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 090e5560483edb..975c1bc2347aa4 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -207,7 +206,10 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
+  LLVMContext &Ctx = F->getContext();
+  Module *M = F->getParent();
+
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -216,18 +218,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
 
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
-
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
index ba4c6117dc96ab..d3847946b94033 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
- 4: 13
- 5: 7 _Z3barv:2 _Z3foov:5
- 6: 6 _Z3barv:4 _Z3foov:2
+ 4: 6
+ 6: 13
+ 3: 7 _Z3barv:2 _Z3foov:5
+ 5: 6 _Z3barv:4 _Z3foov:2
  !CFGChecksum: 563022570642068
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
index 62f9bd5992e735..213bf0b6f81cc4 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
 

[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-29 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> LGTM. I'd also have a change that errors out on huge staleness go in together 
> with this one..

Was thinking to use a separate diff for that, feels non-trivial to do it,  but 
I'm also good for doing in this. 

Just pushed one version for this, it use heuristics to check this, note that 
the current threshold flags is just based my "gut feeling", so I'm going to 
tune those flags on our internal services, hope that we can achieve the goal: 

1) Use the current prod profile, so all build should pass without any false 
positives break. 

2) Use the incompatible profile, so all build should error out. 

Any early feedback is appreciated.

https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-29 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> > > LGTM. I'd also have a change that errors out on huge staleness go in 
> > > together with this one..
> > 
> > 
> > Was thinking to use a separate diff for that, feels non-trivial to do it, 
> > but I'm also good for doing in this.
> > Just pushed one version for this, it use heuristics to check this, note 
> > that the current threshold flags is just based my "gut feeling", so I'm 
> > going to tune those flags on our internal services, hope that we can 
> > achieve the goal:
> > 
> > 1. Use the current prod profile, so all build should pass without any false 
> > positives break.
> > 2. Use the incompatible profile, so all build should error out.
> > 
> > Any early feedback is appreciated.
> 
> Sorry I wasn't clear. The error out change should be a separate patch as 
> logically it's independent of the original mix order change. By go in 
> together, I was mostly thinking about two patches that go in together..

Oh, that's my initial thought too, Ok, I will create a new PR, and hold this 
until we can go in together. 

https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-02-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/75092

>From e6f735a336f87659ee3236fc795ad4b7107dff4d Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 10 Dec 2023 18:30:42 -0800
Subject: [PATCH 1/5] [PseudoProbe] Mix and reorder block and call probe ID in
 lexical order

---
 clang/test/CodeGen/pseudo-probe-emit.c|  8 +++
 .../llvm/Transforms/IPO/SampleProfileProbe.h  |  3 +--
 .../lib/Transforms/IPO/SampleProfileProbe.cpp | 19 +---
 .../Inputs/pseudo-probe-profile.prof  |  8 +++
 .../Inputs/pseudo-probe-update.prof   |  8 +++
 .../SampleProfile/pseudo-probe-dangle.ll  | 12 +-
 .../pseudo-probe-discriminator.ll |  6 ++---
 .../pseudo-probe-profile-metadata-2.ll| 15 ++---
 .../SampleProfile/pseudo-probe-profile.ll | 22 +--
 .../SampleProfile/pseudo-probe-update.ll  | 11 +-
 .../SampleProfile/pseudo-probe-verify.ll  | 16 +++---
 11 files changed, 59 insertions(+), 69 deletions(-)

diff --git a/clang/test/CodeGen/pseudo-probe-emit.c 
b/clang/test/CodeGen/pseudo-probe-emit.c
index c7a3f7e6d5b02b..360f831e842945 100644
--- a/clang/test/CodeGen/pseudo-probe-emit.c
+++ b/clang/test/CodeGen/pseudo-probe-emit.c
@@ -10,9 +10,9 @@ void foo(int x) {
   // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID:]], i64 1, i32 0, i64 -1)
   if (x == 0)
 // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 2, i32 0, i64 -1)
-bar();
+bar();  // probe id : 3
   else
-// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 3, i32 0, i64 -1)
-go();
-  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+// CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 4, i32 0, i64 -1)
+go(); // probe id : 5
+  // CHECK: call void @llvm.pseudoprobe(i64 [[#GUID]], i64 6, i32 0, i64 -1)
 }
diff --git a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h 
b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
index 0f2729a9462de2..69b87adf105fd1 100644
--- a/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
+++ b/llvm/include/llvm/Transforms/IPO/SampleProfileProbe.h
@@ -82,8 +82,7 @@ class SampleProfileProber {
   uint32_t getBlockId(const BasicBlock *BB) const;
   uint32_t getCallsiteId(const Instruction *Call) const;
   void computeCFGHash();
-  void computeProbeIdForBlocks();
-  void computeProbeIdForCallsites();
+  void computeProbeId();
 
   Function *F;
 
diff --git a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp 
b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
index 090e5560483edb..975c1bc2347aa4 100644
--- a/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
@@ -173,8 +173,7 @@ SampleProfileProber::SampleProfileProber(Function &Func,
   BlockProbeIds.clear();
   CallProbeIds.clear();
   LastProbeId = (uint32_t)PseudoProbeReservedId::Last;
-  computeProbeIdForBlocks();
-  computeProbeIdForCallsites();
+  computeProbeId();
   computeCFGHash();
 }
 
@@ -207,7 +206,10 @@ void SampleProfileProber::computeCFGHash() {
 << ", Hash = " << FunctionHash << "\n");
 }
 
-void SampleProfileProber::computeProbeIdForBlocks() {
+void SampleProfileProber::computeProbeId() {
+  LLVMContext &Ctx = F->getContext();
+  Module *M = F->getParent();
+
   DenseSet KnownColdBlocks;
   computeEHOnlyBlocks(*F, KnownColdBlocks);
   // Insert pseudo probe to non-cold blocks only. This will reduce IR size as
@@ -216,18 +218,9 @@ void SampleProfileProber::computeProbeIdForBlocks() {
 ++LastProbeId;
 if (!KnownColdBlocks.contains(&BB))
   BlockProbeIds[&BB] = LastProbeId;
-  }
-}
 
-void SampleProfileProber::computeProbeIdForCallsites() {
-  LLVMContext &Ctx = F->getContext();
-  Module *M = F->getParent();
-
-  for (auto &BB : *F) {
 for (auto &I : BB) {
-  if (!isa(I))
-continue;
-  if (isa(&I))
+  if (!isa(I) || isa(&I))
 continue;
 
   // The current implementation uses the lower 16 bits of the discriminator
diff --git 
a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
index ba4c6117dc96ab..d3847946b94033 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-profile.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
  2: 7
- 3: 6
- 4: 13
- 5: 7 _Z3barv:2 _Z3foov:5
- 6: 6 _Z3barv:4 _Z3foov:2
+ 4: 6
+ 6: 13
+ 3: 7 _Z3barv:2 _Z3foov:5
+ 5: 6 _Z3barv:4 _Z3foov:2
  !CFGChecksum: 563022570642068
diff --git a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof 
b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
index 62f9bd5992e735..213bf0b6f81cc4 100644
--- a/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
+++ b/llvm/test/Transforms/SampleProfile/Inputs/pseudo-probe-update.prof
@@ -1,8 +1,8 @@
 foo:3200:13
  1: 13
 

[flang] [libcxx] [libcxxabi] [libc] [llvm] [clang] [compiler-rt] [clang-tools-extra] [lld] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-23 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/79090

>From 54784e26f33178efd21b0289a1f673d66ea26cc3 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Mon, 22 Jan 2024 19:16:26 -0800
Subject: [PATCH] [CSSPGO] Support post-match profile staleness metrics

---
 llvm/lib/Transforms/IPO/SampleProfile.cpp | 440 +++---
 .../Inputs/profile-mismatch.prof  |   7 +-
 .../SampleProfile/profile-mismatch.ll |  12 +-
 .../pseudo-probe-profile-mismatch-thinlto.ll  |   6 +-
 .../pseudo-probe-profile-mismatch.ll  |  76 +--
 5 files changed, 324 insertions(+), 217 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp 
b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 2fd8668d15e200f..a7170faa65dc07c 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -433,12 +433,19 @@ using CandidateQueue =
 PriorityQueue,
   CandidateComparer>;
 
+using IRAnchorMap = std::map;
+using ProfileAnchorMap = std::map>;
+
 // Sample profile matching - fuzzy match.
 class SampleProfileMatcher {
   Module &M;
   SampleProfileReader &Reader;
   const PseudoProbeManager *ProbeManager;
   SampleProfileMap FlattenedProfiles;
+
+  std::unordered_map FuncIRAnchors;
+  std::unordered_map FuncProfileAnchors;
+
   // For each function, the matcher generates a map, of which each entry is a
   // mapping from the source location of current build to the source location 
in
   // the profile.
@@ -448,6 +455,8 @@ class SampleProfileMatcher {
   uint64_t TotalProfiledCallsites = 0;
   uint64_t NumMismatchedCallsites = 0;
   uint64_t MismatchedCallsiteSamples = 0;
+  uint64_t PostMatchNumMismatchedCallsites = 0;
+  uint64_t PostMatchMismatchedCallsiteSamples = 0;
   uint64_t TotalCallsiteSamples = 0;
   uint64_t TotalProfiledFunc = 0;
   uint64_t NumMismatchedFuncHash = 0;
@@ -474,24 +483,22 @@ class SampleProfileMatcher {
 return nullptr;
   }
   void runOnFunction(const Function &F);
-  void findIRAnchors(const Function &F,
- std::map &IRAnchors);
-  void findProfileAnchors(
+  void findFuncAnchors();
+  void UpdateIRAnchors();
+  void findIRAnchors(const Function &F, IRAnchorMap &IRAnchors);
+  void findProfileAnchors(const FunctionSamples &FS,
+  ProfileAnchorMap &ProfileAnchors);
+  void countMismatchedHashSamples(const FunctionSamples &FS);
+  void countProfileMismatches(bool IsPreMatch);
+  void countMismatchedHashes(const Function &F, const FunctionSamples &FS);
+  void countMismatchedCallsites(
+  const Function &F,
+  StringMap> &FuncToMismatchCallsites,
+  uint64_t &FuncProfiledCallsites, uint64_t &FuncMismatchedCallsites) 
const;
+  void countMismatchedCallsiteSamples(
   const FunctionSamples &FS,
-  std::map>
-  &ProfileAnchors);
-  void countMismatchedSamples(const FunctionSamples &FS);
-  void countProfileMismatches(
-  const Function &F, const FunctionSamples &FS,
-  const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors);
-  void countProfileCallsiteMismatches(
-  const FunctionSamples &FS,
-  const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors,
-  uint64_t &FuncMismatchedCallsites, uint64_t &FuncProfiledCallsites);
+  StringMap> &FuncToMismatchCallsites,
+  uint64_t &FuncMismatchedCallsiteSamples) const;
   LocToLocMap &getIRToProfileLocationMap(const Function &F) {
 auto Ret = FuncMappings.try_emplace(
 FunctionSamples::getCanonicalFnName(F.getName()), LocToLocMap());
@@ -499,11 +506,10 @@ class SampleProfileMatcher {
   }
   void distributeIRToProfileLocationMap();
   void distributeIRToProfileLocationMap(FunctionSamples &FS);
-  void runStaleProfileMatching(
-  const Function &F, const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors,
-  LocToLocMap &IRToProfileLocationMap);
+  void runStaleProfileMatching();
+  void runStaleProfileMatching(const Function &F, const IRAnchorMap &IRAnchors,
+   const ProfileAnchorMap &ProfileAnchors,
+   LocToLocMap &IRToProfileLocationMap);
 };
 
 /// Sample profile pass.
@@ -1129,7 +1135,7 @@ void SampleProfileLoader::findExternalInlineCandidate(
 CalleeSample->getContext().hasAttribute(ContextShouldBeInlined);
 if (!PreInline && CalleeSample->getHeadSamplesEstimate() < Threshold)
   continue;
-
+
 Function *Func = SymbolMap.lookup(CalleeSample->getFunction());
 // Add to the import list only when it's defined out of module.
 if (!Func || Func->isDeclaration())
@@ -2123,8 +2129,8 @@ bool SampleProfileLoader::doInitialization(Module &M,
   return true;
 }
 
-void SampleProfileMatcher::findIRAnchors(
-const Function &F, std::map &IRAnchors) {
+void SampleProfileMatcher::findIRAnchors(const Function &F,
+ IRAnchorMap &IRAnchors) {
   /

[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-24 Thread Lei Wang via cfe-commits


@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching(
   }
 }
 
-void SampleProfileMatcher::runOnFunction(const Function &F) {
-  // We need to use flattened function samples for matching.
-  // Unlike IR, which includes all callsites from the source code, the 
callsites
-  // in profile only show up when they are hit by samples, i,e. the profile
-  // callsites in one context may differ from those in another context. To get
-  // the maximum number of callsites, we merge the function profiles from all
-  // contexts, aka, the flattened profile to find profile anchors.
-  const auto *FSFlattened = getFlattenedSamplesFor(F);
-  if (!FSFlattened)
-return;
+void SampleProfileMatcher::runStaleProfileMatching() {
+  for (const auto &F : M) {
+if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
+  continue;
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+auto IR = FuncIRAnchors.find(&F);
+auto P = FuncProfileAnchors.find(&F);
+if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end())
+  continue;
 
-  // Anchors for IR. It's a map from IR location to callee name, callee name is
-  // empty for non-call instruction and use a dummy name(UnknownIndirectCallee)
-  // for unknown indrect callee name.
-  std::map IRAnchors;
-  findIRAnchors(F, IRAnchors);
-  // Anchors for profile. It's a map from callsite location to a set of callee
-  // name.
-  std::map> ProfileAnchors;
-  findProfileAnchors(*FSFlattened, ProfileAnchors);
-
-  // Detect profile mismatch for profile staleness metrics report.
-  // Skip reporting the metrics for imported functions.
-  if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) &&
-  (ReportProfileStaleness || PersistProfileStaleness)) {
-// Use top-level nested FS for counting profile mismatch metrics since
-// currently once a callsite is mismatched, all its children profiles are
-// dropped.
-if (const auto *FS = Reader.getSamplesFor(F))
-  countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors);
+// Run profile matching for checksum mismatched profile, currently only
+// support for pseudo-probe.
+if (FunctionSamples::ProfileIsProbeBased &&
+!ProbeManager->profileIsValid(F, *FSFlattened)) {
+  runStaleProfileMatching(F, IR->second, P->second,
+  getIRToProfileLocationMap(F));
+}
   }
 
-  // Run profile matching for checksum mismatched profile, currently only
-  // support for pseudo-probe.
-  if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased &&
-  !ProbeManager->profileIsValid(F, *FSFlattened)) {
-// The matching result will be saved to IRToProfileLocationMap, create a 
new
-// map for each function.
-runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
-  }
+  distributeIRToProfileLocationMap();
 }
 
-void SampleProfileMatcher::runOnModule() {
+void SampleProfileMatcher::findFuncAnchors() {
   ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
+  for (const auto &F : M) {
 if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
   continue;
-runOnFunction(F);
+// We need to use flattened function samples for matching.
+// Unlike IR, which includes all callsites from the source code, the
+// callsites in profile only show up when they are hit by samples, i,e. the
+// profile callsites in one context may differ from those in another
+// context. To get the maximum number of callsites, we merge the function
+// profiles from all contexts, aka, the flattened profile to find profile
+// anchors.
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+
+// Anchors for IR. It's a map from IR location to callee name, callee name
+// is empty for non-call instruction and use a dummy
+// name(UnknownIndirectCallee) for unknown indrect callee name.
+auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap());
+findIRAnchors(F, IR.first->second);
+
+// Anchors for profile. It's a map from callsite location to a set of 
callee
+// name.
+auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap());
+findProfileAnchors(*FSFlattened, P.first->second);
+  }
+}
+
+void SampleProfileMatcher::countMismatchedCallsiteSamples(
+const FunctionSamples &FS,
+StringMap> &FuncToMismatchCallsites,
+uint64_t &FuncMismatchedCallsiteSamples) const {
+  auto It = FuncToMismatchCallsites.find(FS.getFuncName());
+  // Skip it if no mismatched callsite or this is an external function.
+  if (It == FuncToMismatchCallsites.end() || It->second.empty())
+return;
+  const auto &MismatchCallsites = It->second;
+  for (const auto &I : FS.getBodySamples()) {
+if 

[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-24 Thread Lei Wang via cfe-commits


@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching(
   }
 }
 
-void SampleProfileMatcher::runOnFunction(const Function &F) {
-  // We need to use flattened function samples for matching.
-  // Unlike IR, which includes all callsites from the source code, the 
callsites
-  // in profile only show up when they are hit by samples, i,e. the profile
-  // callsites in one context may differ from those in another context. To get
-  // the maximum number of callsites, we merge the function profiles from all
-  // contexts, aka, the flattened profile to find profile anchors.
-  const auto *FSFlattened = getFlattenedSamplesFor(F);
-  if (!FSFlattened)
-return;
+void SampleProfileMatcher::runStaleProfileMatching() {
+  for (const auto &F : M) {
+if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
+  continue;
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+auto IR = FuncIRAnchors.find(&F);
+auto P = FuncProfileAnchors.find(&F);
+if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end())
+  continue;
 
-  // Anchors for IR. It's a map from IR location to callee name, callee name is
-  // empty for non-call instruction and use a dummy name(UnknownIndirectCallee)
-  // for unknown indrect callee name.
-  std::map IRAnchors;
-  findIRAnchors(F, IRAnchors);
-  // Anchors for profile. It's a map from callsite location to a set of callee
-  // name.
-  std::map> ProfileAnchors;
-  findProfileAnchors(*FSFlattened, ProfileAnchors);
-
-  // Detect profile mismatch for profile staleness metrics report.
-  // Skip reporting the metrics for imported functions.
-  if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) &&
-  (ReportProfileStaleness || PersistProfileStaleness)) {
-// Use top-level nested FS for counting profile mismatch metrics since
-// currently once a callsite is mismatched, all its children profiles are
-// dropped.
-if (const auto *FS = Reader.getSamplesFor(F))
-  countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors);
+// Run profile matching for checksum mismatched profile, currently only
+// support for pseudo-probe.
+if (FunctionSamples::ProfileIsProbeBased &&
+!ProbeManager->profileIsValid(F, *FSFlattened)) {
+  runStaleProfileMatching(F, IR->second, P->second,
+  getIRToProfileLocationMap(F));
+}
   }
 
-  // Run profile matching for checksum mismatched profile, currently only
-  // support for pseudo-probe.
-  if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased &&
-  !ProbeManager->profileIsValid(F, *FSFlattened)) {
-// The matching result will be saved to IRToProfileLocationMap, create a 
new
-// map for each function.
-runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
-  }
+  distributeIRToProfileLocationMap();
 }
 
-void SampleProfileMatcher::runOnModule() {
+void SampleProfileMatcher::findFuncAnchors() {
   ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
+  for (const auto &F : M) {
 if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
   continue;
-runOnFunction(F);
+// We need to use flattened function samples for matching.
+// Unlike IR, which includes all callsites from the source code, the
+// callsites in profile only show up when they are hit by samples, i,e. the
+// profile callsites in one context may differ from those in another
+// context. To get the maximum number of callsites, we merge the function
+// profiles from all contexts, aka, the flattened profile to find profile
+// anchors.
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+
+// Anchors for IR. It's a map from IR location to callee name, callee name
+// is empty for non-call instruction and use a dummy
+// name(UnknownIndirectCallee) for unknown indrect callee name.
+auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap());
+findIRAnchors(F, IR.first->second);
+
+// Anchors for profile. It's a map from callsite location to a set of 
callee
+// name.
+auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap());
+findProfileAnchors(*FSFlattened, P.first->second);
+  }
+}
+
+void SampleProfileMatcher::countMismatchedCallsiteSamples(
+const FunctionSamples &FS,
+StringMap> &FuncToMismatchCallsites,
+uint64_t &FuncMismatchedCallsiteSamples) const {
+  auto It = FuncToMismatchCallsites.find(FS.getFuncName());
+  // Skip it if no mismatched callsite or this is an external function.
+  if (It == FuncToMismatchCallsites.end() || It->second.empty())
+return;
+  const auto &MismatchCallsites = It->second;
+  for (const auto &I : FS.getBodySamples()) {
+if 

[libc] [llvm] [clang-tools-extra] [libcxxabi] [flang] [lld] [compiler-rt] [clang] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-24 Thread Lei Wang via cfe-commits


@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching(
   }
 }
 
-void SampleProfileMatcher::runOnFunction(const Function &F) {
-  // We need to use flattened function samples for matching.
-  // Unlike IR, which includes all callsites from the source code, the 
callsites
-  // in profile only show up when they are hit by samples, i,e. the profile
-  // callsites in one context may differ from those in another context. To get
-  // the maximum number of callsites, we merge the function profiles from all
-  // contexts, aka, the flattened profile to find profile anchors.
-  const auto *FSFlattened = getFlattenedSamplesFor(F);
-  if (!FSFlattened)
-return;
+void SampleProfileMatcher::runStaleProfileMatching() {
+  for (const auto &F : M) {
+if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
+  continue;
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+auto IR = FuncIRAnchors.find(&F);
+auto P = FuncProfileAnchors.find(&F);
+if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end())
+  continue;
 
-  // Anchors for IR. It's a map from IR location to callee name, callee name is
-  // empty for non-call instruction and use a dummy name(UnknownIndirectCallee)
-  // for unknown indrect callee name.
-  std::map IRAnchors;
-  findIRAnchors(F, IRAnchors);
-  // Anchors for profile. It's a map from callsite location to a set of callee
-  // name.
-  std::map> ProfileAnchors;
-  findProfileAnchors(*FSFlattened, ProfileAnchors);
-
-  // Detect profile mismatch for profile staleness metrics report.
-  // Skip reporting the metrics for imported functions.
-  if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) &&
-  (ReportProfileStaleness || PersistProfileStaleness)) {
-// Use top-level nested FS for counting profile mismatch metrics since
-// currently once a callsite is mismatched, all its children profiles are
-// dropped.
-if (const auto *FS = Reader.getSamplesFor(F))
-  countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors);
+// Run profile matching for checksum mismatched profile, currently only
+// support for pseudo-probe.
+if (FunctionSamples::ProfileIsProbeBased &&
+!ProbeManager->profileIsValid(F, *FSFlattened)) {
+  runStaleProfileMatching(F, IR->second, P->second,
+  getIRToProfileLocationMap(F));
+}
   }
 
-  // Run profile matching for checksum mismatched profile, currently only
-  // support for pseudo-probe.
-  if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased &&
-  !ProbeManager->profileIsValid(F, *FSFlattened)) {
-// The matching result will be saved to IRToProfileLocationMap, create a 
new
-// map for each function.
-runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
-  }
+  distributeIRToProfileLocationMap();
 }
 
-void SampleProfileMatcher::runOnModule() {
+void SampleProfileMatcher::findFuncAnchors() {
   ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
+  for (const auto &F : M) {
 if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
   continue;
-runOnFunction(F);
+// We need to use flattened function samples for matching.
+// Unlike IR, which includes all callsites from the source code, the
+// callsites in profile only show up when they are hit by samples, i,e. the
+// profile callsites in one context may differ from those in another
+// context. To get the maximum number of callsites, we merge the function
+// profiles from all contexts, aka, the flattened profile to find profile
+// anchors.
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+
+// Anchors for IR. It's a map from IR location to callee name, callee name
+// is empty for non-call instruction and use a dummy
+// name(UnknownIndirectCallee) for unknown indrect callee name.
+auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap());
+findIRAnchors(F, IR.first->second);
+
+// Anchors for profile. It's a map from callsite location to a set of 
callee
+// name.
+auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap());
+findProfileAnchors(*FSFlattened, P.first->second);
+  }
+}
+
+void SampleProfileMatcher::countMismatchedCallsiteSamples(
+const FunctionSamples &FS,
+StringMap> &FuncToMismatchCallsites,
+uint64_t &FuncMismatchedCallsiteSamples) const {
+  auto It = FuncToMismatchCallsites.find(FS.getFuncName());
+  // Skip it if no mismatched callsite or this is an external function.
+  if (It == FuncToMismatchCallsites.end() || It->second.empty())
+return;
+  const auto &MismatchCallsites = It->second;
+  for (const auto &I : FS.getBodySamples()) {
+if 

[lld] [clang] [clang-tools-extra] [compiler-rt] [libcxxabi] [flang] [llvm] [libc] [libcxx] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-24 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Thanks so much for the feedback. Answering some high-level questions, will 
address others later.

https://github.com/llvm/llvm-project/pull/79090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libcxx] [compiler-rt] [lld] [libc] [llvm] [flang] [clang-tools-extra] [clang] [libcxxabi] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-24 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/79090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-24 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-24 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm commented:

Thanks for the feedback, it's indeed complicated (even than I expected..)

https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-24 Thread Lei Wang via cfe-commits


@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId 
FuncName) {
   return ProfileMap.Create(Context);
 }
 
+// Use a heuristic to determine probe order by checking callsite insertion
+// position relative to the BB probes. Sort all the BB probes is in a list, for
+// each calliste, compute "ratio = insert position / length of the list". For
+// the old order, the probe ids for BB should be all before(smaller than) the
+// probe ids for callsite, this ratio should be equal to or close to 1.
+bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) {
+  // Use flattned profile to maximize the callsites in the profile.
+  SampleProfileMap flattenProfile;
+  ProfileConverter::flattenProfile(Profiles, flattenProfile);
+
+  uint32_t PossibleOldOrderNum = 0;
+  uint32_t PossibleNewOrderNum = 0;
+
+  for (const auto &I : flattenProfile) {
+const FunctionSamples &FS = I.second;
+// Skip small functions whose profile order are likely random.
+if (FS.getBodySamples().size() < 10)
+  continue;
+
+std::set PossibleBBProbeIDs;
+std::set CallsiteIDs;
+for (const auto &I : FS.getBodySamples()) {
+  if (I.second.getCallTargets().empty())
+PossibleBBProbeIDs.insert(I.first.LineOffset);
+  else
+CallsiteIDs.insert(I.first.LineOffset);
+}
+
+if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty())
+  continue;
+
+uint32_t DistanceToBeginSum = 0;
+for (const auto &ID : CallsiteIDs)
+  DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(),
+  PossibleBBProbeIDs.upper_bound(ID));
+uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size();
+
+// Note that PossibleBBProbeIDs could contains some callsite probe id, the

wlei-llvm wrote:

> I'm curious what lead to empty call targets for a call location covered by 
> LBR..

yeah, I need to investigate where the empty call targets comes from.(I'm sure I 
saw a lot of  empty callsite names, the count are all zero)



https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-24 Thread Lei Wang via cfe-commits


@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId 
FuncName) {
   return ProfileMap.Create(Context);
 }
 
+// Use a heuristic to determine probe order by checking callsite insertion
+// position relative to the BB probes. Sort all the BB probes is in a list, for
+// each calliste, compute "ratio = insert position / length of the list". For
+// the old order, the probe ids for BB should be all before(smaller than) the
+// probe ids for callsite, this ratio should be equal to or close to 1.
+bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) {
+  // Use flattned profile to maximize the callsites in the profile.
+  SampleProfileMap flattenProfile;
+  ProfileConverter::flattenProfile(Profiles, flattenProfile);
+
+  uint32_t PossibleOldOrderNum = 0;
+  uint32_t PossibleNewOrderNum = 0;
+
+  for (const auto &I : flattenProfile) {
+const FunctionSamples &FS = I.second;
+// Skip small functions whose profile order are likely random.
+if (FS.getBodySamples().size() < 10)
+  continue;
+
+std::set PossibleBBProbeIDs;
+std::set CallsiteIDs;
+for (const auto &I : FS.getBodySamples()) {
+  if (I.second.getCallTargets().empty())
+PossibleBBProbeIDs.insert(I.first.LineOffset);
+  else
+CallsiteIDs.insert(I.first.LineOffset);
+}
+
+if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty())
+  continue;
+
+uint32_t DistanceToBeginSum = 0;
+for (const auto &ID : CallsiteIDs)
+  DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(),
+  PossibleBBProbeIDs.upper_bound(ID));
+uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size();
+
+// Note that PossibleBBProbeIDs could contains some callsite probe id, the
+// ratio is not exactly 1 for the old order, hence use a smaller threshold
+// to determine.
+if ((float)DistanceToBeginSum / LengthSum > 0.8)
+  PossibleOldOrderNum++;
+else
+  PossibleNewOrderNum++;
+  }
+  return PossibleNewOrderNum >= PossibleOldOrderNum;

wlei-llvm wrote:

Sorry, it was also far complicated than I expected, try to explain more about 
this, we can discuss offline. 

The fact is that, in theory there can be cases that all the calls are after BB 
probes for the new order ID(like all hot profiled calls are in the last BB), so 
even we can decide the probe is call or not, we still can't just use one 
function or one probe as a line to decide the order, that's why here I use this 
complex statistical way.(It appears it works well)

> If we can't decide whether a probe is originally a call probe or a block 
> probe, that makes the checking less reliable

So far sadly no from profile. We can query the type of probe from the original 
profiled binary, like parsing the binary asm, but that seems a lot of infra 
work.(Also as mentioned above, even the case all calls are after all BB also 
can't decide the order) 

> In reality, do we see cases where there are many PossibleNewOrderNum and 
> PossibleOldOrderNum for one profile?

Yep, both are actually a lot in a large production-used profile, that's why I 
use a 0.8 not a 0.9 things, I spent some time tuning this threshold, right now 
it works very well on my large profile test.

> but if that's case we can't even confidently issue error.

To some degree, this heuristic is reliable(as long as there are enough big 
functions). From 
my statistics, using 0.8 as threshold showed it works really good to detect the 
order, the two num are distinguishable. For both side, the `PossibleOrderNum` 
is much larger than the other num, the 
ratio(PossibleNewOrderNum/PossibleOldOrderNum) is almost ~0.9 for new order, a 
large room to swing, and vice versa.  (like PossibleNewOrderNum is 9X more than 
 PossibleOldOrderNum for a new order. Or PossibleOldOrderNum is 9X more than 
PossibleNewOrderNum for an old order). 
But TBH, I'm sure how it works for a small profile, what i tested is all larger 
than 100M profile, I know that for a small function, it's quite random, it just 
has little calls, I have to filter out those small function as showed in the 
code. 



https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[flang] [llvm] [compiler-rt] [libcxx] [libcxxabi] [libc] [clang-tools-extra] [lld] [clang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-25 Thread Lei Wang via cfe-commits


@@ -2422,59 +2342,226 @@ void SampleProfileMatcher::runStaleProfileMatching(
   }
 }
 
-void SampleProfileMatcher::runOnFunction(const Function &F) {
-  // We need to use flattened function samples for matching.
-  // Unlike IR, which includes all callsites from the source code, the 
callsites
-  // in profile only show up when they are hit by samples, i,e. the profile
-  // callsites in one context may differ from those in another context. To get
-  // the maximum number of callsites, we merge the function profiles from all
-  // contexts, aka, the flattened profile to find profile anchors.
-  const auto *FSFlattened = getFlattenedSamplesFor(F);
-  if (!FSFlattened)
-return;
+void SampleProfileMatcher::runStaleProfileMatching() {
+  for (const auto &F : M) {
+if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
+  continue;
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+auto IR = FuncIRAnchors.find(&F);
+auto P = FuncProfileAnchors.find(&F);
+if (IR == FuncIRAnchors.end() || P == FuncProfileAnchors.end())
+  continue;
 
-  // Anchors for IR. It's a map from IR location to callee name, callee name is
-  // empty for non-call instruction and use a dummy name(UnknownIndirectCallee)
-  // for unknown indrect callee name.
-  std::map IRAnchors;
-  findIRAnchors(F, IRAnchors);
-  // Anchors for profile. It's a map from callsite location to a set of callee
-  // name.
-  std::map> ProfileAnchors;
-  findProfileAnchors(*FSFlattened, ProfileAnchors);
-
-  // Detect profile mismatch for profile staleness metrics report.
-  // Skip reporting the metrics for imported functions.
-  if (!GlobalValue::isAvailableExternallyLinkage(F.getLinkage()) &&
-  (ReportProfileStaleness || PersistProfileStaleness)) {
-// Use top-level nested FS for counting profile mismatch metrics since
-// currently once a callsite is mismatched, all its children profiles are
-// dropped.
-if (const auto *FS = Reader.getSamplesFor(F))
-  countProfileMismatches(F, *FS, IRAnchors, ProfileAnchors);
+// Run profile matching for checksum mismatched profile, currently only
+// support for pseudo-probe.
+if (FunctionSamples::ProfileIsProbeBased &&
+!ProbeManager->profileIsValid(F, *FSFlattened)) {
+  runStaleProfileMatching(F, IR->second, P->second,
+  getIRToProfileLocationMap(F));
+}
   }
 
-  // Run profile matching for checksum mismatched profile, currently only
-  // support for pseudo-probe.
-  if (SalvageStaleProfile && FunctionSamples::ProfileIsProbeBased &&
-  !ProbeManager->profileIsValid(F, *FSFlattened)) {
-// The matching result will be saved to IRToProfileLocationMap, create a 
new
-// map for each function.
-runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
-  }
+  distributeIRToProfileLocationMap();
 }
 
-void SampleProfileMatcher::runOnModule() {
+void SampleProfileMatcher::findFuncAnchors() {
   ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
+  for (const auto &F : M) {
 if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
   continue;
-runOnFunction(F);
+// We need to use flattened function samples for matching.
+// Unlike IR, which includes all callsites from the source code, the
+// callsites in profile only show up when they are hit by samples, i,e. the
+// profile callsites in one context may differ from those in another
+// context. To get the maximum number of callsites, we merge the function
+// profiles from all contexts, aka, the flattened profile to find profile
+// anchors.
+const auto *FSFlattened = getFlattenedSamplesFor(F);
+if (!FSFlattened)
+  continue;
+
+// Anchors for IR. It's a map from IR location to callee name, callee name
+// is empty for non-call instruction and use a dummy
+// name(UnknownIndirectCallee) for unknown indrect callee name.
+auto IR = FuncIRAnchors.emplace(&F, IRAnchorMap());
+findIRAnchors(F, IR.first->second);
+
+// Anchors for profile. It's a map from callsite location to a set of 
callee
+// name.
+auto P = FuncProfileAnchors.emplace(&F, ProfileAnchorMap());
+findProfileAnchors(*FSFlattened, P.first->second);
+  }
+}
+
+void SampleProfileMatcher::countMismatchedCallsiteSamples(
+const FunctionSamples &FS,
+StringMap> &FuncToMismatchCallsites,
+uint64_t &FuncMismatchedCallsiteSamples) const {
+  auto It = FuncToMismatchCallsites.find(FS.getFuncName());
+  // Skip it if no mismatched callsite or this is an external function.
+  if (It == FuncToMismatchCallsites.end() || It->second.empty())
+return;
+  const auto &MismatchCallsites = It->second;
+  for (const auto &I : FS.getBodySamples()) {
+if 

[llvm] [clang] [PseudoProbe] Mix and reorder block and call probe ID in lexical order (PR #75092)

2024-01-25 Thread Lei Wang via cfe-commits


@@ -471,6 +471,66 @@ ProfileGenerator::getTopLevelFunctionProfile(FunctionId 
FuncName) {
   return ProfileMap.Create(Context);
 }
 
+// Use a heuristic to determine probe order by checking callsite insertion
+// position relative to the BB probes. Sort all the BB probes is in a list, for
+// each calliste, compute "ratio = insert position / length of the list". For
+// the old order, the probe ids for BB should be all before(smaller than) the
+// probe ids for callsite, this ratio should be equal to or close to 1.
+bool checkProbeIDIsInMixedOrder(const SampleProfileMap &Profiles) {
+  // Use flattned profile to maximize the callsites in the profile.
+  SampleProfileMap flattenProfile;
+  ProfileConverter::flattenProfile(Profiles, flattenProfile);
+
+  uint32_t PossibleOldOrderNum = 0;
+  uint32_t PossibleNewOrderNum = 0;
+
+  for (const auto &I : flattenProfile) {
+const FunctionSamples &FS = I.second;
+// Skip small functions whose profile order are likely random.
+if (FS.getBodySamples().size() < 10)
+  continue;
+
+std::set PossibleBBProbeIDs;
+std::set CallsiteIDs;
+for (const auto &I : FS.getBodySamples()) {
+  if (I.second.getCallTargets().empty())
+PossibleBBProbeIDs.insert(I.first.LineOffset);
+  else
+CallsiteIDs.insert(I.first.LineOffset);
+}
+
+if (PossibleBBProbeIDs.empty() || CallsiteIDs.empty())
+  continue;
+
+uint32_t DistanceToBeginSum = 0;
+for (const auto &ID : CallsiteIDs)
+  DistanceToBeginSum += std::distance(PossibleBBProbeIDs.begin(),
+  PossibleBBProbeIDs.upper_bound(ID));
+uint32_t LengthSum = PossibleBBProbeIDs.size() * CallsiteIDs.size();
+
+// Note that PossibleBBProbeIDs could contains some callsite probe id, the
+// ratio is not exactly 1 for the old order, hence use a smaller threshold
+// to determine.
+if ((float)DistanceToBeginSum / LengthSum > 0.8)
+  PossibleOldOrderNum++;
+else
+  PossibleNewOrderNum++;
+  }
+  return PossibleNewOrderNum >= PossibleOldOrderNum;

wlei-llvm wrote:

> If we only traverse the probe sections, would that be enough to verify?

yes, that should be enough, we can check some large functions if there are 
calls before BBs to decide old order.

> Another idea is like you said, just move the checks to compile time. But 
> maybe we don't need to detect order specifically, but just detect huge 
> staleness and issue error. Say if a profile is >=90% (or maybe 100%?) stale 
> at compile time for a large file or a file with hot functions, issue error. 
> That would be not specific to this order issue, but also generally useful?

Yes, that should also work, it should be useful for catching other mismatch 
cases, like compiler upgrade.



https://github.com/llvm/llvm-project/pull/75092
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [lld] [libcxx] [libcxxabi] [compiler-rt] [llvm] [clang] [flang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-26 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/79090

>From 54784e26f33178efd21b0289a1f673d66ea26cc3 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Mon, 22 Jan 2024 19:16:26 -0800
Subject: [PATCH 1/3] [CSSPGO] Support post-match profile staleness metrics

---
 llvm/lib/Transforms/IPO/SampleProfile.cpp | 440 +++---
 .../Inputs/profile-mismatch.prof  |   7 +-
 .../SampleProfile/profile-mismatch.ll |  12 +-
 .../pseudo-probe-profile-mismatch-thinlto.ll  |   6 +-
 .../pseudo-probe-profile-mismatch.ll  |  76 +--
 5 files changed, 324 insertions(+), 217 deletions(-)

diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp 
b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 2fd8668d15e200f..a7170faa65dc07c 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -433,12 +433,19 @@ using CandidateQueue =
 PriorityQueue,
   CandidateComparer>;
 
+using IRAnchorMap = std::map;
+using ProfileAnchorMap = std::map>;
+
 // Sample profile matching - fuzzy match.
 class SampleProfileMatcher {
   Module &M;
   SampleProfileReader &Reader;
   const PseudoProbeManager *ProbeManager;
   SampleProfileMap FlattenedProfiles;
+
+  std::unordered_map FuncIRAnchors;
+  std::unordered_map FuncProfileAnchors;
+
   // For each function, the matcher generates a map, of which each entry is a
   // mapping from the source location of current build to the source location 
in
   // the profile.
@@ -448,6 +455,8 @@ class SampleProfileMatcher {
   uint64_t TotalProfiledCallsites = 0;
   uint64_t NumMismatchedCallsites = 0;
   uint64_t MismatchedCallsiteSamples = 0;
+  uint64_t PostMatchNumMismatchedCallsites = 0;
+  uint64_t PostMatchMismatchedCallsiteSamples = 0;
   uint64_t TotalCallsiteSamples = 0;
   uint64_t TotalProfiledFunc = 0;
   uint64_t NumMismatchedFuncHash = 0;
@@ -474,24 +483,22 @@ class SampleProfileMatcher {
 return nullptr;
   }
   void runOnFunction(const Function &F);
-  void findIRAnchors(const Function &F,
- std::map &IRAnchors);
-  void findProfileAnchors(
+  void findFuncAnchors();
+  void UpdateIRAnchors();
+  void findIRAnchors(const Function &F, IRAnchorMap &IRAnchors);
+  void findProfileAnchors(const FunctionSamples &FS,
+  ProfileAnchorMap &ProfileAnchors);
+  void countMismatchedHashSamples(const FunctionSamples &FS);
+  void countProfileMismatches(bool IsPreMatch);
+  void countMismatchedHashes(const Function &F, const FunctionSamples &FS);
+  void countMismatchedCallsites(
+  const Function &F,
+  StringMap> &FuncToMismatchCallsites,
+  uint64_t &FuncProfiledCallsites, uint64_t &FuncMismatchedCallsites) 
const;
+  void countMismatchedCallsiteSamples(
   const FunctionSamples &FS,
-  std::map>
-  &ProfileAnchors);
-  void countMismatchedSamples(const FunctionSamples &FS);
-  void countProfileMismatches(
-  const Function &F, const FunctionSamples &FS,
-  const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors);
-  void countProfileCallsiteMismatches(
-  const FunctionSamples &FS,
-  const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors,
-  uint64_t &FuncMismatchedCallsites, uint64_t &FuncProfiledCallsites);
+  StringMap> &FuncToMismatchCallsites,
+  uint64_t &FuncMismatchedCallsiteSamples) const;
   LocToLocMap &getIRToProfileLocationMap(const Function &F) {
 auto Ret = FuncMappings.try_emplace(
 FunctionSamples::getCanonicalFnName(F.getName()), LocToLocMap());
@@ -499,11 +506,10 @@ class SampleProfileMatcher {
   }
   void distributeIRToProfileLocationMap();
   void distributeIRToProfileLocationMap(FunctionSamples &FS);
-  void runStaleProfileMatching(
-  const Function &F, const std::map &IRAnchors,
-  const std::map>
-  &ProfileAnchors,
-  LocToLocMap &IRToProfileLocationMap);
+  void runStaleProfileMatching();
+  void runStaleProfileMatching(const Function &F, const IRAnchorMap &IRAnchors,
+   const ProfileAnchorMap &ProfileAnchors,
+   LocToLocMap &IRToProfileLocationMap);
 };
 
 /// Sample profile pass.
@@ -1129,7 +1135,7 @@ void SampleProfileLoader::findExternalInlineCandidate(
 CalleeSample->getContext().hasAttribute(ContextShouldBeInlined);
 if (!PreInline && CalleeSample->getHeadSamplesEstimate() < Threshold)
   continue;
-
+
 Function *Func = SymbolMap.lookup(CalleeSample->getFunction());
 // Add to the import list only when it's defined out of module.
 if (!Func || Func->isDeclaration())
@@ -2123,8 +2129,8 @@ bool SampleProfileLoader::doInitialization(Module &M,
   return true;
 }
 
-void SampleProfileMatcher::findIRAnchors(
-const Function &F, std::map &IRAnchors) {
+void SampleProfileMatcher::findIRAnchors(const Function &F,
+ IRAnchorMap &IRAnchors) {

[libcxx] [llvm] [lld] [libc] [flang] [compiler-rt] [libcxxabi] [clang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-29 Thread Lei Wang via cfe-commits


@@ -2205,93 +2230,141 @@ void 
SampleProfileMatcher::countMismatchedSamples(const FunctionSamples &FS) {
   countMismatchedSamples(CS.second);
 }
 
-void SampleProfileMatcher::countProfileMismatches(
-const Function &F, const FunctionSamples &FS,
-const std::map &IRAnchors,
+void ProfileMatchStats::countMismatchedCallsites(
+const Function &F, const std::map &IRAnchors,
+const std::map>
+&ProfileAnchors,
+const LocToLocMap &IRToProfileLocationMap) {
+  auto &MismatchedCallsites =
+  
FuncMismatchedCallsites[FunctionSamples::getCanonicalFnName(F.getName())];
+
+  auto MapIRLocToProfileLoc = [&](const LineLocation &IRLoc) {

wlei-llvm wrote:

Yep, the benefit here is pre-match and post-match using the same 
function`countMismatchedCallsites`, no special process to parsing the matching 
results. For pre-match, we just pass a empty `IRToProfileLocationMap` and no 
changes to `RunStaleProfileMatching`.

https://github.com/llvm/llvm-project/pull/79090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [libcxx] [libcxxabi] [clang-tools-extra] [llvm] [compiler-rt] [lld] [clang] [flang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-29 Thread Lei Wang via cfe-commits


@@ -433,6 +433,44 @@ using CandidateQueue =
 PriorityQueue,
   CandidateComparer>;
 
+// Profile matching statstics.
+class ProfileMatchStats {

wlei-llvm wrote:

I was thinking that too until I worked with this 
`StringMap> FuncMismatchedCallsites;` which introduce a 
lot of actions making me think to put them into a new class. But as you 
suggested below, we can move  `FuncMismatchedCallsites` out of the stats, or 
even we don't need a new class.



https://github.com/llvm/llvm-project/pull/79090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[flang] [lld] [llvm] [clang] [libcxxabi] [libc] [libcxx] [clang-tools-extra] [compiler-rt] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-29 Thread Lei Wang via cfe-commits


@@ -433,6 +433,44 @@ using CandidateQueue =
 PriorityQueue,
   CandidateComparer>;
 
+// Profile matching statstics.
+class ProfileMatchStats {
+  const Module &M;
+  SampleProfileReader &Reader;
+  const PseudoProbeManager *ProbeManager;
+
+public:
+  ProfileMatchStats(const Module &M, SampleProfileReader &Reader,
+const PseudoProbeManager *ProbeManager)
+  : M(M), Reader(Reader), ProbeManager(ProbeManager) {}
+
+  uint64_t NumMismatchedCallsites = 0;
+  uint64_t TotalProfiledCallsites = 0;
+  uint64_t MismatchedCallsiteSamples = 0;
+  uint64_t NumMismatchedFuncHash = 0;
+  uint64_t TotalProfiledFunc = 0;
+  uint64_t MismatchedFuncHashSamples = 0;

wlei-llvm wrote:

Sounds good to remove the "Hash"

https://github.com/llvm/llvm-project/pull/79090
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [lld] [flang] [libcxxabi] [compiler-rt] [libcxx] [clang-tools-extra] [llvm] [clang] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-29 Thread Lei Wang via cfe-commits


@@ -2460,63 +2528,108 @@ void SampleProfileMatcher::runOnFunction(const 
Function &F) {
   !ProbeManager->profileIsValid(F, *FSFlattened)) {
 // The matching result will be saved to IRToProfileLocationMap, create a 
new
 // map for each function.
+auto &IRToProfileLocationMap = getIRToProfileLocationMap(F);
 runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
+IRToProfileLocationMap);
+PostMatchStats.countMismatchedCallsites(F, IRAnchors, ProfileAnchors,
+IRToProfileLocationMap);
   }
 }
 
-void SampleProfileMatcher::runOnModule() {
-  ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
-   FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
-if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
-  continue;
-runOnFunction(F);
-  }
-  if (SalvageStaleProfile)
-distributeIRToProfileLocationMap();
-
+void SampleProfileMatcher::reportOrPersistProfileStats() {
   if (ReportProfileStaleness) {
 if (FunctionSamples::ProfileIsProbeBased) {
-  errs() << "(" << NumMismatchedFuncHash << "/" << TotalProfiledFunc << ")"
+  errs() << "(" << PreMatchStats.NumMismatchedFuncHash << "/"
+ << PreMatchStats.TotalProfiledFunc << ")"
  << " of functions' profile are invalid and "
- << " (" << MismatchedFuncHashSamples << "/" << 
TotalFuncHashSamples
- << ")"
+ << " (" << PreMatchStats.MismatchedFuncHashSamples << "/"
+ << PreMatchStats.TotalFunctionSamples << ")"
  << " of samples are discarded due to function hash mismatch.\n";
 }
-errs() << "(" << NumMismatchedCallsites << "/" << TotalProfiledCallsites
-   << ")"
+errs() << "(" << PreMatchStats.NumMismatchedCallsites << "/"
+   << PreMatchStats.TotalProfiledCallsites << ")"
<< " of callsites' profile are invalid and "
-   << "(" << MismatchedCallsiteSamples << "/" << TotalCallsiteSamples
-   << ")"
+   << "(" << PreMatchStats.MismatchedCallsiteSamples << "/"
+   << PreMatchStats.TotalFunctionSamples << ")"
<< " of samples are discarded due to callsite location mismatch.\n";
+if (SalvageStaleProfile) {
+  uint64_t NumRecoveredCallsites = PostMatchStats.TotalProfiledCallsites -
+   PostMatchStats.NumMismatchedCallsites;
+  uint64_t NumMismatchedCallsites =
+  PreMatchStats.NumMismatchedCallsites - NumRecoveredCallsites;
+  errs() << "Out of " << PostMatchStats.TotalProfiledCallsites
+ << " callsites used for profile matching, "
+ << NumRecoveredCallsites
+ << " callsites have been recovered. After the matching, ("
+ << NumMismatchedCallsites << "/"
+ << PreMatchStats.TotalProfiledCallsites
+ << ") of callsites are still invalid ("
+ << PostMatchStats.MismatchedCallsiteSamples << "/"
+ << PreMatchStats.TotalFunctionSamples << ")"
+ << " of samples are still discarded.\n";
+}
   }
 
   if (PersistProfileStaleness) {
 LLVMContext &Ctx = M.getContext();
 MDBuilder MDB(Ctx);
 
 SmallVector> ProfStatsVec;
+ProfStatsVec.emplace_back("NumMismatchedCallsites",
+  PreMatchStats.NumMismatchedCallsites);
+ProfStatsVec.emplace_back("TotalProfiledCallsites",
+  PreMatchStats.TotalProfiledCallsites);
+ProfStatsVec.emplace_back("MismatchedCallsiteSamples",
+  PreMatchStats.MismatchedCallsiteSamples);
+ProfStatsVec.emplace_back("TotalProfiledFunc",
+  PreMatchStats.TotalProfiledFunc);
+ProfStatsVec.emplace_back("TotalFunctionSamples",
+  PreMatchStats.TotalFunctionSamples);
 if (FunctionSamples::ProfileIsProbeBased) {
-  ProfStatsVec.emplace_back("NumMismatchedFuncHash", 
NumMismatchedFuncHash);
-  ProfStatsVec.emplace_back("TotalProfiledFunc", TotalProfiledFunc);
+  ProfStatsVec.emplace_back("NumMismatchedFuncHash",
+PreMatchStats.NumMismatchedFuncHash);
   ProfStatsVec.emplace_back("MismatchedFuncHashSamples",
-MismatchedFuncHashSamples);
-  ProfStatsVec.emplace_back("TotalFuncHashSamples", TotalFuncHashSamples);
+PreMatchStats.MismatchedFuncHashSamples);
+}
+if (SalvageStaleProfile) {
+  ProfStatsVec.emplace_back("PostMatchNumMismatchedCallsites",
+PostMatchStats.NumMismatchedCallsites);
+  ProfStatsVec.emplace_back("NumCallsitesForMatching",
+PostMatchStats.TotalProfiledCallsites);
+  ProfStatsVec.emplace_back("PostMatch

[llvm] [clang] [libcxx] [libc] [compiler-rt] [lld] [libcxxabi] [flang] [clang-tools-extra] [CSSPGO] Compute and report post-match profile staleness (PR #79090)

2024-01-29 Thread Lei Wang via cfe-commits


@@ -2460,63 +2528,108 @@ void SampleProfileMatcher::runOnFunction(const 
Function &F) {
   !ProbeManager->profileIsValid(F, *FSFlattened)) {
 // The matching result will be saved to IRToProfileLocationMap, create a 
new
 // map for each function.
+auto &IRToProfileLocationMap = getIRToProfileLocationMap(F);
 runStaleProfileMatching(F, IRAnchors, ProfileAnchors,
-getIRToProfileLocationMap(F));
+IRToProfileLocationMap);
+PostMatchStats.countMismatchedCallsites(F, IRAnchors, ProfileAnchors,
+IRToProfileLocationMap);
   }
 }
 
-void SampleProfileMatcher::runOnModule() {
-  ProfileConverter::flattenProfile(Reader.getProfiles(), FlattenedProfiles,
-   FunctionSamples::ProfileIsCS);
-  for (auto &F : M) {
-if (F.isDeclaration() || !F.hasFnAttribute("use-sample-profile"))
-  continue;
-runOnFunction(F);
-  }
-  if (SalvageStaleProfile)
-distributeIRToProfileLocationMap();
-
+void SampleProfileMatcher::reportOrPersistProfileStats() {
   if (ReportProfileStaleness) {
 if (FunctionSamples::ProfileIsProbeBased) {
-  errs() << "(" << NumMismatchedFuncHash << "/" << TotalProfiledFunc << ")"
+  errs() << "(" << PreMatchStats.NumMismatchedFuncHash << "/"
+ << PreMatchStats.TotalProfiledFunc << ")"
  << " of functions' profile are invalid and "
- << " (" << MismatchedFuncHashSamples << "/" << 
TotalFuncHashSamples
- << ")"
+ << " (" << PreMatchStats.MismatchedFuncHashSamples << "/"
+ << PreMatchStats.TotalFunctionSamples << ")"
  << " of samples are discarded due to function hash mismatch.\n";
 }
-errs() << "(" << NumMismatchedCallsites << "/" << TotalProfiledCallsites
-   << ")"
+errs() << "(" << PreMatchStats.NumMismatchedCallsites << "/"
+   << PreMatchStats.TotalProfiledCallsites << ")"
<< " of callsites' profile are invalid and "
-   << "(" << MismatchedCallsiteSamples << "/" << TotalCallsiteSamples
-   << ")"
+   << "(" << PreMatchStats.MismatchedCallsiteSamples << "/"
+   << PreMatchStats.TotalFunctionSamples << ")"
<< " of samples are discarded due to callsite location mismatch.\n";
+if (SalvageStaleProfile) {
+  uint64_t NumRecoveredCallsites = PostMatchStats.TotalProfiledCallsites -
+   PostMatchStats.NumMismatchedCallsites;
+  uint64_t NumMismatchedCallsites =
+  PreMatchStats.NumMismatchedCallsites - NumRecoveredCallsites;
+  errs() << "Out of " << PostMatchStats.TotalProfiledCallsites
+ << " callsites used for profile matching, "
+ << NumRecoveredCallsites
+ << " callsites have been recovered. After the matching, ("
+ << NumMismatchedCallsites << "/"
+ << PreMatchStats.TotalProfiledCallsites
+ << ") of callsites are still invalid ("
+ << PostMatchStats.MismatchedCallsiteSamples << "/"
+ << PreMatchStats.TotalFunctionSamples << ")"
+ << " of samples are still discarded.\n";
+}
   }
 
   if (PersistProfileStaleness) {
 LLVMContext &Ctx = M.getContext();
 MDBuilder MDB(Ctx);
 
 SmallVector> ProfStatsVec;
+ProfStatsVec.emplace_back("NumMismatchedCallsites",
+  PreMatchStats.NumMismatchedCallsites);
+ProfStatsVec.emplace_back("TotalProfiledCallsites",
+  PreMatchStats.TotalProfiledCallsites);
+ProfStatsVec.emplace_back("MismatchedCallsiteSamples",
+  PreMatchStats.MismatchedCallsiteSamples);
+ProfStatsVec.emplace_back("TotalProfiledFunc",
+  PreMatchStats.TotalProfiledFunc);
+ProfStatsVec.emplace_back("TotalFunctionSamples",
+  PreMatchStats.TotalFunctionSamples);
 if (FunctionSamples::ProfileIsProbeBased) {
-  ProfStatsVec.emplace_back("NumMismatchedFuncHash", 
NumMismatchedFuncHash);
-  ProfStatsVec.emplace_back("TotalProfiledFunc", TotalProfiledFunc);
+  ProfStatsVec.emplace_back("NumMismatchedFuncHash",
+PreMatchStats.NumMismatchedFuncHash);
   ProfStatsVec.emplace_back("MismatchedFuncHashSamples",
-MismatchedFuncHashSamples);
-  ProfStatsVec.emplace_back("TotalFuncHashSamples", TotalFuncHashSamples);
+PreMatchStats.MismatchedFuncHashSamples);
+}
+if (SalvageStaleProfile) {
+  ProfStatsVec.emplace_back("PostMatchNumMismatchedCallsites",
+PostMatchStats.NumMismatchedCallsites);
+  ProfStatsVec.emplace_back("NumCallsitesForMatching",
+PostMatchStats.TotalProfiledCallsites);
+  ProfStatsVec.emplace_back("PostMatch

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-08 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/5] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-08 Thread Lei Wang via cfe-commits


@@ -1119,6 +1125,18 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.empty() &&
+ "File path is requeired for instrumentation generation");
+  InstrumentColdFunctionCoverage = true;
+  addPreInlinerPasses(MPM, Level, Phase);
+  addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true,
+/* IsCS */ false, /* AtomicCounterUpdate */ false,
+InstrumentSampleColdFuncPath, "",
+IntrusiveRefCntPtr());
+}

wlei-llvm wrote:

Good point, that's more flexible, thanks!

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-09 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/6] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-08 Thread Lei Wang via cfe-commits


@@ -1119,6 +1125,18 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.empty() &&
+ "File path is requeired for instrumentation generation");
+  InstrumentColdFunctionCoverage = true;
+  addPreInlinerPasses(MPM, Level, Phase);
+  addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true,
+/* IsCS */ false, /* AtomicCounterUpdate */ false,
+InstrumentSampleColdFuncPath, "",
+IntrusiveRefCntPtr());
+}

wlei-llvm wrote:

> I don't see how there would be any functional difference between using 
> `IsSampleColdFuncCovGen` and `IsXXXColdFuncCovGen`, but I could be missing 
> something. If we can have a single flag for all cases, I think that would be 
> cleaner and I can also try to use it for my case.

I see, sounds good. 

My previous thought is:  there could be functional difference for using the 
`-instrument-cold-function-coverage` flag with sampling PGO 
flag(`-fprofile-sample-use`) vs without sampling PGO flags.  With the sampling 
PGO flag, if a function symbol shows in the binary but not in the profile, we 
can treat it as cold function(set 0 entry count), we then would instrument 
those function. But without sampling PGO flag(also no other PGO options), the 
entry count is unknown(-1), then we don't instrument them. 

So user needs to be aware of the combination use of those flags, I was then 
trying to prevent the misuse, to disallow the use of 
`--instrument-cold-function-coverage` without (sampling) PGO use options(use 
the assertion in the version). 

But on a second thought, if we want to extend for other PGO options, say if the 
profile annotation and instrumentation are not in the same pipeline, it's 
probably hard to check this locally, we can leave it to the build system. 

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-08 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/4] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/2] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-21 Thread Lei Wang via cfe-commits


@@ -319,6 +319,29 @@ static cl::opt PGOFunctionCriticalEdgeThreshold(
 cl::desc("Do not instrument functions with the number of critical edges "
  " greater than this threshold."));
 
+static cl::opt ColdFuncCoverageMaxEntryCount(
+"cold-function-coverage-max-entry-count", cl::init(0), cl::Hidden,
+cl::desc("When enabling cold function coverage instrumentation, skip "
+ "instrumenting the function whose entry count is above the given "
+ "value"));
+
+static cl::opt InstrumentColdFunctionCoverageMode(
+"instrument-cold-function-coverage-mode",

wlei-llvm wrote:

Agreed, I can't see other modes so far, changed to `pgo-treat-unknown-as-cold`

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-21 Thread Lei Wang via cfe-commits


@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;

wlei-llvm wrote:

thanks for the suggestion!

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-21 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/7] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-21 Thread Lei Wang via cfe-commits


@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-cold-function-coverage-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");

wlei-llvm wrote:

I really like this idea, however, this seems indeed not easy(needs a lot of 
refactoring). 

As you said, the `InstrProfileOutput` filed is shared with other PGO flag(in 
our case, it's the `fprofile-sample-use=`) , see 
[`PGOOptions.ProfileFIle`](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L43).
 One thing we can do is to split this field into two: `ProfileUseFile` and 
`ProfileGenFole`, let all `profile--gererate` go to  `ProfileGenFile` 
and let `sample-use/instr-use go` to `ProfileUseFile`. I think that's doable 
and not a big change(maybe a separate diff). Then in sample-use case, we can 
use the  `ProfileGenFIle` path for cold coverage work. WDYT?

Though it would still not be perfect, the logic here 
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/BackendUtil.cpp#L811-L841
  is complicated/ would still need more refactoring. I was stuck here when 
thinking if `-fprofile-sample-use` and `--fprofile-instrument-path=` are both 
used, which branch(`CodeGenOpts.hasProfileIRInstr()` or 
`!CodeGenOpts.SampleProfileFile.empty()`) should it belongs to.

For this version, I just removed the duplicated 
flag(`--pgo-instrument-cold-function-only`) and added a FIXME comment to 
explain this. 




https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits


@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 

wlei-llvm wrote:

> I meant, why not just use `clang ... -mllvm 
> -instrument-cold-function-coverage`? Is this a clang - only feature (i.e. 
> rust can't use it?) Is it just for symmetry with the current PGO flags?
> 
> (This is not a pushback, mainly curious. Also the patch would be quite 
> smaller if you didn't need to pass through the flags from clang to llvm)

I see, thanks for the suggestion!  We also need to link runtime 
lib(`compiler_rt.profile.a`)
but yeah, I agree it's possible to pass directly to llvm and linker without 
through clang. Then the full command line would be like 
```
clang ... -mllvm -instrument-cold-function-coverage -mllvm 
-instrument-sample-cold-function-path= -mllvm 
--pgo-function-entry-coverage 
ld.lld ... --u__llvm_runtime_variable  
.../lib/x86_64-unknown-linux-gnu/libclang_rt.profile.a
```
So yes, adding the clang driver flag is kind of to mirror current[ IRPGO 
flags](https://fburl.com/na3cp3gn), for easy maintenance purpose(given that 
`-fprofile-generate` doesn't work with `-fprofile-sample-use`) and also to 
centralize the configuration for the convenience. IMO, the `compiler_rt` is 
probably the main blocker here, I didn't find an easy way to bundle it with a 
llvm flag. 

Appreciate any further suggestions!

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Thanks for the context! 
> Why not use the existing `-pgo-function-entry-coverage` 
> (https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113/14?u=ellishg)
>  LLVM flag? It takes advantage of the `llvm.instrprof.cover` intrinsic which 
> has less size and runtime overhead than `llvm.instrprof.increment`.

We do use the  `-pgo-function-entry-coverage` in this PR, see 
[here](https://github.com/llvm/llvm-project/pull/109837/files#diff-bac41c71569f27df21a843bcd74d2e604ed508afbdf14161dfb545c5d228R666-R667).
 but furthermore, we skip instrumenting the functions that are covered by 
sampling PGO profile. 

> It also supports IRPGO and CSIRPGO while it seems that this PR requires a 
> sample PGO input.

Yeah, as I commented above, unfortunately currently the IRPGO main flag is 
incompatible with the Sampling PGO flag(also a bit diverged for the pipeline 
passes), that's why we have to add extra instr passes under sampling PGO 
pipeline and added a new driver flag. 


https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits


@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 

wlei-llvm wrote:

Sorry if my PR description is not clear. Note that there is no use for  
`-fprofile-generate` and  `-fprofile-instr-generate` here, so a driver flag 
here is to configure the instr file path and make linker to link the 
compiler.profile object files (just similar to `-fprofile-[instr]-generate=`).

The reason for not using `-fprofile-[instr]-generate` is because those flags 
conflict with `-fprofile-sample-use`, see 
[PGOOptions](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L27-L43),
 `ProfileFile` is a shared file path which means the two flags are actually 
mutually exclusive. 

Another option is to make `-fprofile-generate` compatible with 
`-fprofile-samle-use`(like refactoring PGOOptions or adding another flag to 
configure the file path things), this seems to me they are more complicated 
than the current one. But I’m open to any suggestions on this. 

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> For your use case, can you use 
> [ProfileSymbolList](https://github.com/llvm/llvm-project/blob/32ffc9fdc2cd422c88c926b862adb3de726e3888/llvm/include/llvm/ProfileData/SampleProf.h#L1509-L1512)
>  to identify very cold functions (executed but unsampled) or are you indeed 
> looking for functions that are never executed?

We are indeed focusing on functions that are never executed. The 
ProfileSymbolList is already used to identify functions with a “0” entry count, 
which is the target of the instrumentation in this work. We then aim to further 
distinguish the dead functions from them.

> 
> Will this be used to guide developers with diagnostics or more aggressive 
> compiler driven optimizations?

We haven’t considered to use them for any compiler optimization. The primary 
goal is to improve the developer experience, just to take the general benefits 
from removing dead code: improving readability, good codebase maintainability, 
reducing build time, etc. Another potential use might be to verify the sampling 
PGO profile coverage/quality, say to check if we missed any functions that are 
actually hot. 


https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)

2024-10-28 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm closed 
https://github.com/llvm/llvm-project/pull/110330
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] specify clang --target to fix breakage on AIX (PR #114127)

2024-10-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/114127

`-fprofile-sample-use` is not supported on AIX, which caused a CI failure.

>From 2c2ddcc9bda277c6da7b51147785c0f1e88b848d Mon Sep 17 00:00:00 2001
From: wlei 
Date: Tue, 29 Oct 2024 13:28:33 -0700
Subject: [PATCH] specify clang --target to fix breakage on AIX

---
 clang/test/CodeGen/pgo-cold-function-coverage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
index fd1e1e7e14cda5..3003cdc3e15e02 100644
--- a/clang/test/CodeGen/pgo-cold-function-coverage.c
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -1,7 +1,7 @@
 // Test -fprofile-generate-cold-function-coverage 
 
 // RUN: rm -rf %t && split-file %s %t
-// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ 
-fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof  -S 
-emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s
+// RUN: %clang --target=x86_64 -O2 
-fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate 
-fprofile-sample-use=%t/pgo-cold-func.prof  -S -emit-llvm -o - 
%t/pgo-cold-func.c | FileCheck %s
 
 // CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00"
 

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)

2024-10-29 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> Hi, this PR is causing a regression on the AIX bot here 
> https://lab.llvm.org/buildbot/#/builders/64/builds/1321/steps/6/logs/FAIL__Clang__pgo-cold-function-coverage_c.
>  Would you be able to take a look? I think it can be resolved by using 
> clang_cc1 in the testcase of marking it unsupported for AIX (see 
> [951919e](https://github.com/llvm/llvm-project/commit/951919e5112cabbd63c7a3bf424736efca81d964))

Sorry for the breakage, looks like we can bypass it by adding `--target=`  like 
https://github.com/llvm/llvm-project/commit/f92551ba6b5aaef28d886ea6b3c4f9774fa8fadd

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)

2024-10-30 Thread Lei Wang via cfe-commits


@@ -1182,8 +1187,13 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   const bool IsCtxProfUse =
   !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink;
 
+  // Enable cold function coverage instrumentation if
+  // InstrumentColdFuncOnlyPath is provided.
+  const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly =

wlei-llvm wrote:

Thanks for pointing out. It's intentional to set the global variable, I wasn't 
aware it will cause data race. I will try to change it to local one. 

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] specify clang --target to fix breakage on AIX (PR #114127)

2024-10-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/114127

>From 2c2ddcc9bda277c6da7b51147785c0f1e88b848d Mon Sep 17 00:00:00 2001
From: wlei 
Date: Tue, 29 Oct 2024 13:28:33 -0700
Subject: [PATCH] specify clang --target to fix breakage on AIX

---
 clang/test/CodeGen/pgo-cold-function-coverage.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
index fd1e1e7e14cda5..3003cdc3e15e02 100644
--- a/clang/test/CodeGen/pgo-cold-function-coverage.c
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -1,7 +1,7 @@
 // Test -fprofile-generate-cold-function-coverage 
 
 // RUN: rm -rf %t && split-file %s %t
-// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ 
-fprofile-sample-accurate -fprofile-sample-use=%t/pgo-cold-func.prof  -S 
-emit-llvm -o - %t/pgo-cold-func.c | FileCheck %s
+// RUN: %clang --target=x86_64 -O2 
-fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate 
-fprofile-sample-use=%t/pgo-cold-func.prof  -S -emit-llvm -o - 
%t/pgo-cold-func.c | FileCheck %s
 
 // CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00"
 

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] specify clang --target to fix breakage on AIX (PR #114127)

2024-10-29 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm closed 
https://github.com/llvm/llvm-project/pull/114127
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/114364

>From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Thu, 31 Oct 2024 09:58:27 -0700
Subject: [PATCH 1/3] Reapply "[InstrPGO] Support cold function coverage
 instrumentation (#109837)"

This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71.
---
 clang/include/clang/Driver/Options.td |  6 
 clang/lib/Driver/ToolChain.cpp|  4 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 20 +++
 .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++
 ...fprofile-generate-cold-function-coverage.c |  8 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 -
 .../Instrumentation/PGOInstrumentation.cpp| 19 ++
 .../PGOProfile/instr-gen-cold-function.ll | 35 +++
 8 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index c8bc2fe377b8ec..2814d2b1bf3733 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into default.profraw file (overridden by '=' form of option or 
LLVM_PROFILE_FILE env var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into /default.profraw (overridden by LLVM_PROFILE_FILE env 
var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 6d3ede40691093..bdf3da0c96adca 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 04b3832327a99c..4c6f508f1f24a6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+// FIXME: Idealy the file path should be passed through
+// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is
+// shared with other profile use path(see PGOOptions), we need to refactor
+// PGOOptions to make it work.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-cold-function-only-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..fd1e1e7e14cda5
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,19 @@
+// Test -fprofile-generate-cold-function-coverage 
+
+// RUN: rm -rf %t && sp

[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-30 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/114364

In https://github.com/llvm/llvm-project/pull/109837, it sets a global 
variable(`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp which 
introduced a data race detected by TSan. To fix this, I decouple the flag 
setting, the flags are now set separately(`instrument-cold-function-only-path` 
is required to be used with `--pgo-instrument-cold-function-only`). 

>From f912c30955541bd18f5fa56eff1c222672c0fa7f Mon Sep 17 00:00:00 2001
From: wlei 
Date: Wed, 30 Oct 2024 22:21:24 -0700
Subject: [PATCH] fix datarace

---
 clang/lib/Driver/ToolChains/Clang.cpp |  2 ++
 .../fprofile-generate-cold-function-coverage.c|  1 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 15 ++-
 3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 4c6f508f1f24a6..04ae99d417d4f7 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,8 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 CmdArgs.push_back(Args.MakeArgString(
 Twine("--instrument-cold-function-only-path=") + Path));
 CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-instrument-cold-function-only");
+CmdArgs.push_back("-mllvm");
 CmdArgs.push_back("--pgo-function-entry-coverage");
   }
 
diff --git a/clang/test/Driver/fprofile-generate-cold-function-coverage.c 
b/clang/test/Driver/fprofile-generate-cold-function-coverage.c
index 9b2f46423f34b1..a8ca69cf224cd0 100644
--- a/clang/test/Driver/fprofile-generate-cold-function-coverage.c
+++ b/clang/test/Driver/fprofile-generate-cold-function-coverage.c
@@ -1,6 +1,7 @@
 // RUN: %clang -### -c -fprofile-generate-cold-function-coverage %s 2>&1 | 
FileCheck %s
 // CHECK: "--instrument-cold-function-only-path=default_%m.profraw" 
 // CHECK: "--pgo-function-entry-coverage"
+// CHECK: "--pgo-instrument-cold-function-only"
 // CHECK-NOT:  "-fprofile-instrument"
 // CHECK-NOT:  "-fprofile-instrument-path=
 
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index c391853c8d0c2b..d5ef7c654c35cb 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -298,7 +298,9 @@ static cl::opt UseLoopVersioningLICM(
 
 static cl::opt InstrumentColdFuncOnlyPath(
 "instrument-cold-function-only-path", cl::init(""),
-cl::desc("File path for cold function only instrumentation"), cl::Hidden);
+cl::desc("File path for cold function only instrumentation(requires use "
+ "with --pgo-instrument-cold-function-only)"),
+cl::Hidden);
 
 extern cl::opt UseCtxProfile;
 extern cl::opt PGOInstrumentColdFunctionOnly;
@@ -1188,10 +1190,13 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   const bool IsCtxProfUse =
   !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink;
 
-  // Enable cold function coverage instrumentation if
-  // InstrumentColdFuncOnlyPath is provided.
-  const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly =
-  IsPGOPreLink && !InstrumentColdFuncOnlyPath.empty();
+  assert(
+  (InstrumentColdFuncOnlyPath.empty() || PGOInstrumentColdFunctionOnly) &&
+  "--instrument-cold-function-only-path is provided but "
+  "--pgo-instrument-cold-function-only is not enabled");
+  const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly &&
+  IsPGOPreLink &&
+  !InstrumentColdFuncOnlyPath.empty();
 
   if (IsPGOInstrGen || IsPGOInstrUse || IsMemprofUse || IsCtxProfGen ||
   IsCtxProfUse || IsColdFuncOnlyInstrGen)

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)

2024-10-28 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)

2024-10-28 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm closed 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/8] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/8] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm ready_for_review 
https://github.com/llvm/llvm-project/pull/114364
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Support cold function coverage instrumentation (PR #109837)

2024-10-31 Thread Lei Wang via cfe-commits


@@ -1182,8 +1187,13 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   const bool IsCtxProfUse =
   !UseCtxProfile.empty() && Phase == ThinOrFullLTOPhase::ThinLTOPreLink;
 
+  // Enable cold function coverage instrumentation if
+  // InstrumentColdFuncOnlyPath is provided.
+  const bool IsColdFuncOnlyInstrGen = PGOInstrumentColdFunctionOnly =

wlei-llvm wrote:

Should be fixed by https://github.com/llvm/llvm-project/pull/114364

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/114364

>From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Thu, 31 Oct 2024 09:58:27 -0700
Subject: [PATCH 1/3] Reapply "[InstrPGO] Support cold function coverage
 instrumentation (#109837)"

This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71.
---
 clang/include/clang/Driver/Options.td |  6 
 clang/lib/Driver/ToolChain.cpp|  4 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 20 +++
 .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++
 ...fprofile-generate-cold-function-coverage.c |  8 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 -
 .../Instrumentation/PGOInstrumentation.cpp| 19 ++
 .../PGOProfile/instr-gen-cold-function.ll | 35 +++
 8 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index c8bc2fe377b8ec..2814d2b1bf3733 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into default.profraw file (overridden by '=' form of option or 
LLVM_PROFILE_FILE env var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into /default.profraw (overridden by LLVM_PROFILE_FILE env 
var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 6d3ede40691093..bdf3da0c96adca 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 04b3832327a99c..4c6f508f1f24a6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+// FIXME: Idealy the file path should be passed through
+// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is
+// shared with other profile use path(see PGOOptions), we need to refactor
+// PGOOptions to make it work.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-cold-function-only-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..fd1e1e7e14cda5
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,19 @@
+// Test -fprofile-generate-cold-function-coverage 
+
+// RUN: rm -rf %t && sp

[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm closed 
https://github.com/llvm/llvm-project/pull/114364
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/114364

>From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Thu, 31 Oct 2024 09:58:27 -0700
Subject: [PATCH 1/4] Reapply "[InstrPGO] Support cold function coverage
 instrumentation (#109837)"

This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71.
---
 clang/include/clang/Driver/Options.td |  6 
 clang/lib/Driver/ToolChain.cpp|  4 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 20 +++
 .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++
 ...fprofile-generate-cold-function-coverage.c |  8 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 -
 .../Instrumentation/PGOInstrumentation.cpp| 19 ++
 .../PGOProfile/instr-gen-cold-function.ll | 35 +++
 8 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index c8bc2fe377b8ec..2814d2b1bf3733 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into default.profraw file (overridden by '=' form of option or 
LLVM_PROFILE_FILE env var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into /default.profraw (overridden by LLVM_PROFILE_FILE env 
var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 6d3ede40691093..bdf3da0c96adca 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 04b3832327a99c..4c6f508f1f24a6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+// FIXME: Idealy the file path should be passed through
+// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is
+// shared with other profile use path(see PGOOptions), we need to refactor
+// PGOOptions to make it work.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-cold-function-only-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..fd1e1e7e14cda5
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,19 @@
+// Test -fprofile-generate-cold-function-coverage 
+
+// RUN: rm -rf %t && sp

[clang] [llvm] [InstrPGO] Avoid using global variable to fix potential data race (PR #114364)

2024-10-31 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/114364

>From a7b444bd75d6f83ed0f5692783990a59f36e8459 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Thu, 31 Oct 2024 09:58:27 -0700
Subject: [PATCH 1/4] Reapply "[InstrPGO] Support cold function coverage
 instrumentation (#109837)"

This reverts commit d924a9ba03a05b417676e84f6c81aac44f907f71.
---
 clang/include/clang/Driver/Options.td |  6 
 clang/lib/Driver/ToolChain.cpp|  4 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 20 +++
 .../test/CodeGen/pgo-cold-function-coverage.c | 19 ++
 ...fprofile-generate-cold-function-coverage.c |  8 +
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 -
 .../Instrumentation/PGOInstrumentation.cpp| 19 ++
 .../PGOProfile/instr-gen-cold-function.ll | 35 +++
 8 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 clang/test/Driver/fprofile-generate-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index c8bc2fe377b8ec..2814d2b1bf3733 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1786,6 +1786,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into default.profraw file (overridden by '=' form of option or 
LLVM_PROFILE_FILE env var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to collect coverage info for cold 
functions into /default.profraw (overridden by LLVM_PROFILE_FILE env 
var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 6d3ede40691093..bdf3da0c96adca 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -899,7 +899,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 04b3832327a99c..4c6f508f1f24a6 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -632,6 +632,26 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+// FIXME: Idealy the file path should be passed through
+// `-fprofile-instrument-path=`(InstrProfileOutput), however, this field is
+// shared with other profile use path(see PGOOptions), we need to refactor
+// PGOOptions to make it work.
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-cold-function-only-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..fd1e1e7e14cda5
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,19 @@
+// Test -fprofile-generate-cold-function-coverage 
+
+// RUN: rm -rf %t && sp

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-24 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/109837

None

>From b693ff3c33dcf21c1343d9d3432b1d7410a8b579 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 14 +++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 16 +
 .../Instrumentation/PGOInstrumentation.cpp| 14 +++
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 6 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..0ac289089839d2 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_sample_cold_function : Flag<["-"], 
"fprofile-sample-cold-function">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into default.profraw file (overridden by '=' form of 
option or LLVM_PROFILE_FILE env var)">;
+def fprofile_sample_cold_function_EQ : Joined<["-"], 
"fprofile-sample-cold-function=">, 
+Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into  (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..5cd821cfe780c4 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..85e9e02ca8a4ee 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,20 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *SampleColdArg =
+  Args.getLastArg(options::OPT_fprofile_sample_cold_function,
+  options::OPT_fprofile_sample_cold_function_EQ)) {
+SmallString<128> Path;
+if 
(SampleColdArg->getOption().matches(options::OPT_fprofile_sample_cold_function_EQ))
+  Path.append(SampleColdArg->getValue());
+else 
+  Path.append("default_%m.profraw");
+
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 8f151a99b11709..5d1d8480b1472d 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM(
 "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
 cl::desc("Enable the experimental Loop Versioning LICM pass"));
 
+static cl::opt InstrumentSampleColdFuncPath(
+"instrument-sample-cold-function-path", cl::init(""),
+cl::desc("File path for instrumenting sampling PGO guided cold functions"),
+cl::Hidden);
+
 extern cl::opt UseCtxProfile;
+extern cl::opt InstrumentColdFunction;
 
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
@@ -1119,6 +1125,16 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() && Phase != 
ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.e

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #110330)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/110330

None

>From d660d3b9a043a3530a735c1b95790116f6366062 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/2] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 13 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 +
 .../Instrumentation/PGOInstrumentation.cpp| 14 +++
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 6 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index aedc4c16d4e9d5..230b9604b8a8aa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1785,6 +1785,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_sample_cold_function : Flag<["-"], 
"fprofile-sample-cold-function">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into default.profraw file (overridden by '=' form of 
option or LLVM_PROFILE_FILE env var)">;
+def fprofile_sample_cold_function_EQ : Joined<["-"], 
"fprofile-sample-cold-function=">, 
+Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into  (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..5cd821cfe780c4 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index b9987288d82d10..fced94833aaf79 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,19 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *SampleColdArg =
+  Args.getLastArg(options::OPT_fprofile_sample_cold_function,
+  options::OPT_fprofile_sample_cold_function_EQ)) {
+SmallString<128> Path(SampleColdArg->getOption().matches(
+  options::OPT_fprofile_sample_cold_function_EQ)
+  ? SampleColdArg->getValue()
+  : "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 8f151a99b11709..0514d17db20721 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM(
 "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
 cl::desc("Enable the experimental Loop Versioning LICM pass"));
 
+static cl::opt InstrumentSampleColdFuncPath(
+"instrument-sample-cold-function-path", cl::init(""),
+cl::desc("File path for instrumenting sampling PGO guided cold functions"),
+cl::Hidden);
+
 extern cl::opt UseCtxProfile;
+extern cl::opt InstrumentColdFunction;
 
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
@@ -1119,6 +1125,17 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrF

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/110330
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 36a747b5ef2123dd10909b91ad963e30af63b3b8 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 +
 .../Instrumentation/PGOInstrumentation.cpp| 14 +++
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 6 files changed, 76 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..0ac289089839d2 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_sample_cold_function : Flag<["-"], 
"fprofile-sample-cold-function">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into default.profraw file (overridden by '=' form of 
option or LLVM_PROFILE_FILE env var)">;
+def fprofile_sample_cold_function_EQ : Joined<["-"], 
"fprofile-sample-cold-function=">, 
+Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into  (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..5cd821cfe780c4 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..6e128dfc888ab4 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,18 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *SampleColdArg =
+  Args.getLastArg(options::OPT_fprofile_sample_cold_function,
+  options::OPT_fprofile_sample_cold_function_EQ)) {
+SmallString<128> Path(SampleColdArg->getOption().matches(
+  options::OPT_fprofile_sample_cold_function_EQ)
+  ? SampleColdArg->getValue()
+  : "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 8f151a99b11709..0514d17db20721 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM(
 "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
 cl::desc("Enable the experimental Loop Versioning LICM pass"));
 
+static cl::opt InstrumentSampleColdFuncPath(
+"instrument-sample-cold-function-path", cl::init(""),
+cl::desc("File path for instrumenting sampling PGO guided cold functions"),
+cl::Hidden);
+
 extern cl::opt UseCtxProfile;
+extern cl::opt InstrumentColdFunction;
 
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
@@ -1119,6 +1125,17 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!Inst

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function v2 (PR #110330)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/110330
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 6a9b51130269975b5832a5439897dcb34b9dc942 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 10 
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 +
 .../Instrumentation/PGOInstrumentation.cpp| 14 +++
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 6 files changed, 74 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..0ac289089839d2 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_sample_cold_function : Flag<["-"], 
"fprofile-sample-cold-function">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into default.profraw file (overridden by '=' form of 
option or LLVM_PROFILE_FILE env var)">;
+def fprofile_sample_cold_function_EQ : Joined<["-"], 
"fprofile-sample-cold-function=">, 
+Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into  (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..5cd821cfe780c4 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..2b3e0047412d3e 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,16 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *SampleColdArg =
+  Args.getLastArg(options::OPT_fprofile_sample_cold_function,
+  options::OPT_fprofile_sample_cold_function_EQ)) {
+SmallString<128> Path (SampleColdArg->getOption().matches(
+options::OPT_fprofile_sample_cold_function_EQ) ? 
SampleColdArg->getValue(): "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 8f151a99b11709..0514d17db20721 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM(
 "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
 cl::desc("Enable the experimental Loop Versioning LICM pass"));
 
+static cl::opt InstrumentSampleColdFuncPath(
+"instrument-sample-cold-function-path", cl::init(""),
+cl::desc("File path for instrumenting sampling PGO guided cold functions"),
+cl::Hidden);
+
 extern cl::opt UseCtxProfile;
+extern cl::opt InstrumentColdFunction;
 
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
@@ -1119,6 +1125,17 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.empty() &&
+ "File path is requeired for ins

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-27 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 436ab3bfb3d7f1978ba1598ec7d3ca006706cbd3 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 13 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 17 +
 .../Instrumentation/PGOInstrumentation.cpp| 14 +++
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 6 files changed, 77 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..0ac289089839d2 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_sample_cold_function : Flag<["-"], 
"fprofile-sample-cold-function">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into default.profraw file (overridden by '=' form of 
option or LLVM_PROFILE_FILE env var)">;
+def fprofile_sample_cold_function_EQ : Joined<["-"], 
"fprofile-sample-cold-function=">, 
+Group, Visibility<[ClangOption, CLOption]>, MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions guided by 
sampling-based profile into  (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..5cd821cfe780c4 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function) ||
+ Args.hasArg(options::OPT_fprofile_sample_cold_function_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..f4a073d2036d12 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,19 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *SampleColdArg =
+  Args.getLastArg(options::OPT_fprofile_sample_cold_function,
+  options::OPT_fprofile_sample_cold_function_EQ)) {
+SmallString<128> Path(SampleColdArg->getOption().matches(
+  options::OPT_fprofile_sample_cold_function_EQ)
+  ? SampleColdArg->getValue()
+  : "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 8f151a99b11709..0514d17db20721 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -296,7 +296,13 @@ static cl::opt UseLoopVersioningLICM(
 "enable-loop-versioning-licm", cl::init(false), cl::Hidden,
 cl::desc("Enable the experimental Loop Versioning LICM pass"));
 
+static cl::opt InstrumentSampleColdFuncPath(
+"instrument-sample-cold-function-path", cl::init(""),
+cl::desc("File path for instrumenting sampling PGO guided cold functions"),
+cl::Hidden);
+
 extern cl::opt UseCtxProfile;
+extern cl::opt InstrumentColdFunction;
 
 namespace llvm {
 extern cl::opt EnableMemProfContextDisambiguation;
@@ -1119,6 +1125,17 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-30 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH] [InstrPGO] Instrument sampling profile based cold function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-09-30 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm ready_for_review 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-01 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/109837

>From 07a2cab3fa5df2965f9f7da9ee2d3603581a47f1 Mon Sep 17 00:00:00 2001
From: wlei 
Date: Sun, 22 Sep 2024 20:23:20 -0700
Subject: [PATCH 1/3] [InstrPGO] Instrument sampling profile based cold
 function

---
 clang/include/clang/Driver/Options.td |  6 +
 clang/lib/Driver/ToolChain.cpp|  4 +++-
 clang/lib/Driver/ToolChains/Clang.cpp | 18 ++
 clang/test/CodeGen/Inputs/pgo-cold-func.prof  |  2 ++
 .../test/CodeGen/pgo-cold-function-coverage.c | 12 ++
 llvm/lib/Passes/PassBuilderPipelines.cpp  | 18 ++
 .../Instrumentation/PGOInstrumentation.cpp| 16 +
 .../PGOProfile/instr-gen-cold-function.ll | 24 +++
 8 files changed, 99 insertions(+), 1 deletion(-)
 create mode 100644 clang/test/CodeGen/Inputs/pgo-cold-func.prof
 create mode 100644 clang/test/CodeGen/pgo-cold-function-coverage.c
 create mode 100644 llvm/test/Transforms/PGOProfile/instr-gen-cold-function.ll

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 002f60350543d9..6bb92427f2d53f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 
+Group, Visibility<[ClangOption, CLOption]>,
+HelpText<"Generate instrumented code to cold functions into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
+def fprofile_generate_cold_function_coverage_EQ : Joined<["-"], 
"fprofile-generate-cold-function-coverage=">, 
+Group, Visibility<[ClangOption, CLOption]>, 
MetaVarName<"">,
+HelpText<"Generate instrumented code to cold functions into 
/default.profraw (overridden by LLVM_PROFILE_FILE env var)">; 
 def fprofile_instr_generate : Flag<["-"], "fprofile-instr-generate">,
 Group, Visibility<[ClangOption, CLOption]>,
 HelpText<"Generate instrumented code to collect execution counts into 
default.profraw file (overridden by '=' form of option or LLVM_PROFILE_FILE env 
var)">;
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 16f9b629fc538c..e56db150c6b47e 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -889,7 +889,9 @@ bool ToolChain::needsProfileRT(const ArgList &Args) {
  Args.hasArg(options::OPT_fprofile_instr_generate) ||
  Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
  Args.hasArg(options::OPT_fcreate_profile) ||
- Args.hasArg(options::OPT_forder_file_instrumentation);
+ Args.hasArg(options::OPT_forder_file_instrumentation) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage) ||
+ Args.hasArg(options::OPT_fprofile_generate_cold_function_coverage_EQ);
 }
 
 bool ToolChain::needsGCovInstrumentation(const llvm::opt::ArgList &Args) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 0bab48caf1a5e2..00602a08232ba2 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -649,6 +649,24 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, 
Compilation &C,
 }
   }
 
+  if (auto *ColdFuncCoverageArg = Args.getLastArg(
+  options::OPT_fprofile_generate_cold_function_coverage,
+  options::OPT_fprofile_generate_cold_function_coverage_EQ)) {
+SmallString<128> Path(
+ColdFuncCoverageArg->getOption().matches(
+options::OPT_fprofile_generate_cold_function_coverage_EQ)
+? ColdFuncCoverageArg->getValue()
+: "");
+llvm::sys::path::append(Path, "default_%m.profraw");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back(Args.MakeArgString(
+Twine("--instrument-sample-cold-function-path=") + Path));
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--instrument-cold-function-coverage");
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("--pgo-function-entry-coverage");
+  }
+
   Arg *PGOGenArg = nullptr;
   if (PGOGenerateArg) {
 assert(!CSPGOGenerateArg);
diff --git a/clang/test/CodeGen/Inputs/pgo-cold-func.prof 
b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
new file mode 100644
index 00..e50be02e0a8545
--- /dev/null
+++ b/clang/test/CodeGen/Inputs/pgo-cold-func.prof
@@ -0,0 +1,2 @@
+foo:1:1
+ 1: 1
diff --git a/clang/test/CodeGen/pgo-cold-function-coverage.c 
b/clang/test/CodeGen/pgo-cold-function-coverage.c
new file mode 100644
index 00..0d2767c022b9f1
--- /dev/null
+++ b/clang/test/CodeGen/pgo-cold-function-coverage.c
@@ -0,0 +1,12 @@
+// Test -fprofile-generate-cold-function-coverage 
+// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx

[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits


@@ -1784,6 +1784,12 @@ defm debug_info_for_profiling : 
BoolFOption<"debug-info-for-profiling",
   PosFlag,
   NegFlag>;
+def fprofile_generate_cold_function_coverage : Flag<["-"], 
"fprofile-generate-cold-function-coverage">, 

wlei-llvm wrote:

>Would -Wl,-lclang_rt.profile work?

Got it, I think it should work, except it requires another linker flag: the 
`--u__llvm_runtime_variable`, we can configure it to linker too, but basically 
[those instr PGO 
flags](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChain.cpp#L885-L893)
 control 
[addProfileRTLibs](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Linux.cpp#L841-L849)
 (which seems not equivalent to `-Wl,-lclang_rt.profile`), I just wanted to 
mirror those flags so that we don't need to maintain if anything changes to 
`addProfileRTLibs` in the future. (Though I feel this code should be very 
stable, should not be a big problem, so mainly for the convenience for compiler 
user to use one flag instead of using/maintaining multiple flags )
Overall, I agree that it's feasible to do it without clang flag. I'm not very 
familiar with the convention for adding driver flags, so if you think this 
doesn't justify it, I will drop it from this patch. Thanks for the discussion!

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits


@@ -1119,6 +1125,18 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.empty() &&
+ "File path is requeired for instrumentation generation");
+  InstrumentColdFunctionCoverage = true;
+  addPreInlinerPasses(MPM, Level, Phase);
+  addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true,
+/* IsCS */ false, /* AtomicCounterUpdate */ false,
+InstrumentSampleColdFuncPath, "",
+IntrusiveRefCntPtr());
+}

wlei-llvm wrote:

> After looking at this a bit closer, I'm not sure why it needs to be tied to 
> closely to SamplePGO. Couldn't we move this out of the `LoadSampleProfile` 
> and move it after `IsPGOInstrUse/IsCtxProfUse`? That way we can use IRPGO, 
> CSIRPGO, and SamplePGO profile counts to block instrumentation hot functions.

Ah, good point, moved them closer to other IRPGO passes. Thanks for the 
suggestion!

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits


@@ -0,0 +1,24 @@
+; RUN: opt < %s  --passes=pgo-instr-gen -instrument-cold-function-coverage -S  
| FileCheck --check-prefixes=COLD %s
+; RUN: opt < %s  --passes=pgo-instr-gen -instrument-cold-function-coverage 
-cold-function-coverage-max-entry-count=1 -S  | FileCheck 
--check-prefixes=ENTRY-COUNT %s
+
+; COLD: call void @llvm.instrprof.increment(ptr @__profn_foo, i64  [[#]], i32 
1, i32 0)

wlei-llvm wrote:

Sounds good!(it should still work without `-pgo-function-entry-coverage`, but 
agree to test the most common case)

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [InstrPGO] Instrument sampling profile based cold function (PR #109837)

2024-10-02 Thread Lei Wang via cfe-commits


@@ -1119,6 +1125,18 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   // removed.
   MPM.addPass(
   PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */));
+
+if (InstrumentSampleColdFuncPath.getNumOccurrences() &&
+Phase != ThinOrFullLTOPhase::ThinLTOPostLink) {
+  assert(!InstrumentSampleColdFuncPath.empty() &&
+ "File path is requeired for instrumentation generation");
+  InstrumentColdFunctionCoverage = true;
+  addPreInlinerPasses(MPM, Level, Phase);
+  addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true,
+/* IsCS */ false, /* AtomicCounterUpdate */ false,
+InstrumentSampleColdFuncPath, "",
+IntrusiveRefCntPtr());
+}

wlei-llvm wrote:

do you mean the `sample` in `InstrumentSampleColdFuncPath`flag or all other 
variables(like `IsSampleColdFuncCovGen`)?  I thought something like:

```
if (IsSampleColdFuncCovGen || IsXXXColdFuncCovGen) {
 addPGOInstrPasses(.., .., InstrumentColdFuncCoveragePath, .. )
}
```
`InstrumentColdFuncCoveragePath` would be a general flag that can be used for 
all cold function coverage case. but currently `IsSampleColdFuncCovGen` only 
represents for sampling PGO cold func. And If we want to extend it in future, 
then we can add more bool flag IsXXXColdFuncCovGen... 

I also added an assertion:
```
  assert((InstrumentColdFuncCoveragePath.empty() || LoadSampleProfile) &&
 "Sampling-based cold functon coverage is not supported without "
 "providing sampling PGO profile");
```
Seems I have to remove that if we want it to be general. (just curious) do you 
plan to extend it for your case? 

https://github.com/llvm/llvm-project/pull/109837
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] Reapply "[Clang][Sema] Always use latest redeclaration of primary template" (PR #114569)

2024-11-07 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Hi, here is the reduced repro (not perfect, but was as far as I can get), 
thanks!

```
namespace std
{
  template
_Tp
__declval(long);
  template
auto declval() noexcept -> decltype(__declval<_Tp>(0));
  template
class __is_convertible_helper
{
  template()))>
 static int
 __test;
  typedef decltype(__test<_From, _To>0) type;
};
  template
struct is_convertible
: public __is_convertible_helper<_From, _To>::type
;
template 
  inline constexpr bool is_same_v = __is_same(_Tp, _Up);
}
  enum _Lock_policy { _S_single, _S_mutex, _S_atomic };
  static const _Lock_policy __default_lock_policy =
  _S_atomic;
  template
class __shared_ptr;
  template
struct __sp_compatible_with
: is_convertible<_Yp*, _Tp*>::type
;
  template
class __shared_ptr
{
 using _SafeConv
   = typename enable_if::type;
  template
 using _Compatible = typename
   enable_if__sp_compatible_with<_Yp*, _Tp*>::value, _Res::type;
public:
 __shared_ptr0 noexcept
 : _M_ptr0, _M_refcount0
  __shared_ptr0 noexcept = default;
  __shared_ptr& operator=0 noexcept = default;
  ~__shared_ptr0 = default;
  template>
 __shared_ptr(const __shared_ptr<_Yp, _Lp>& __r) noexcept
};
namespace storage {
class S ;
}
namespace storage 
template 
concept StatState =
std::is_same_v;
namespace storage 
template 
class Tracker;
template 
std::shared_ptr> createTracker() ;
template 
class Tracker {
 public:
  template 
  std::shared_ptr> friend createTracker();
};
namespace storage 
class Pool {
 public:
  Pool() {
g = createTracker();
  }
 private:
  std::shared_ptr g;
};
void create0  () {
  auto tracker = createTracker();
}

```
please ignore the irrelevant syntax error, the real error is :
```
clang++ -ferror-limit= -std=gnu++20 -c test.cpp 

clang++: 
/home/wlei/local/upstream/llvm-project/clang/lib/AST/ExprConstant.cpp:16601: 
bool clang::Expr::EvaluateAsConstantExpr(EvalResult &, const ASTContext &, 
ConstantExprKind) const: Assertion `!isValueDependent() && "Expression 
evaluator can't be called on a dependent expression."' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and 
include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.  Program arguments: /home/wlei/local/upstream/llvm-release/bin/clang++ 
-ferror-limit= -std=gnu++20 -c test.cpp
1.  test.cpp:78:32: current parser token ')'
2.  test.cpp:77:18: parsing function body 'create0'
3.  test.cpp:77:18: in compound statement ('{}')
 #0 0x7fcf547fb128 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x7fcf547f9040 llvm::sys::RunSignalHandlers() 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x7fcf5473c7b6 (anonymous 
namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:73:5
 #3 0x7fcf5473c7b6 CrashRecoverySignalHandler(int) 
/home/wlei/local/upstream/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:390:51
 #4 0x7fcf53e3e730 __restore_rt (/lib64/libc.so.6+0x3e730)
 #5 0x7fcf53e8bacc __pthread_kill_implementation (/lib64/libc.so.6+0x8bacc)
 #6 0x7fcf53e3e686 gsignal (/lib64/libc.so.6+0x3e686)
 #7 0x7fcf53e28833 abort (/lib64/libc.so.6+0x28833)
 #8 0x7fcf53e2875b _nl_load_domain.cold (/lib64/libc.so.6+0x2875b)
 #9 0x7fcf53e373c6 (/lib64/libc.so.6+0x373c6)
#10 0x7fcf523da5b1 
clang::Expr::EvaluateAsConstantExpr(clang::Expr::EvalResult&, clang::ASTContext 
const&, clang::Expr::ConstantExprKind) const 
/home/wlei/local/upstream/llvm-project/clang/lib/AST/ExprConstant.cpp:0:0
#11 0x7fcf50f9b41d clang::ActionResult 
calculateConstraintSatisfaction(clang::Sema&, clang::Expr 
const*, clang::ConstraintSatisfaction&, 
calculateConstraintSatisfaction(clang::Sema&, clang::NamedDecl const*, 
clang::SourceLocation, clang::MultiLevelTemplateArgumentList const&, 
clang::Expr const*, clang::ConstraintSatisfaction&)::ConstraintEvaluator 
const&) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaConcept.cpp:389:71
#12 0x7fcf50f941eb clang::ActionResult::isInvalid() 
const 
/home/wlei/local/upstream/llvm-project/clang/include/clang/Sema/Ownership.h:199:41
#13 0x7fcf50f941eb CheckConstraintSatisfaction(clang::Sema&, 
clang::NamedDecl const*, llvm::ArrayRef, 
llvm::SmallVectorImpl&, clang::MultiLevelTemplateArgumentList 
const&, clang::SourceRange, clang::ConstraintSatisfaction&) 
/home/wlei/local/upstream/llvm-project/clang/lib/Sema/SemaConcept.cpp:600:13
#14 0x7fcf50f93f6a 
clang::Sema::CheckConstraintSatisfaction(clang::NamedDecl const*, 
llvm::ArrayRef, llvm::SmallVectorImpl&, 
clang::MultiLevelTemplateArgumentList const&, clang::SourceRange, 
clang::ConstraintSatisfaction&) 
/home/wlei/local/upstream/llvm-project/clang/li

[clang] Reapply "[Clang][Sema] Refactor collection of multi-level template argument lists (#106585, #111173)" (PR #111852)

2024-11-05 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

Hi @sdkrystian , FYI, we also hit an error/assertion that bisected to this 
https://github.com/llvm/llvm-project/pull/114569

```
lvm-project/clang/lib/AST/ExprConstant.cpp:16601: bool 
clang::Expr::EvaluateAsConstantExpr(EvalResult &, const ASTContext &, 
ConstantExprKind) const: Assertion `!isValueDependent() && "Expression 
evaluator can't be called on a dependent expression."' failed.
```
I'm trying working on reducing the repro(it seems very slow). 


https://github.com/llvm/llvm-project/pull/111852
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-11-08 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> > > > > > Ping @wlei-llvm
> > > > > 
> > > > > 
> > > > > Sorry for the delay. The new version addressed my last comment (with 
> > > > > just minor nits). However, I didn't fully follow the new features 
> > > > > related to `ProbeMatchSpecs` stuffs. Could you add more descriptions 
> > > > > to the diff summary? Or if it’s not a lot of work, could we split it 
> > > > > into two patches? We could commit the first part, and I will review 
> > > > > the second part separately.
> > > > 
> > > > 
> > > > NVM, I think now I get what `ProbeMatchSpecs` does, it's a vector 
> > > > because a function can have multiple sections(function split)
> > > 
> > > 
> > > Thank you for reviewing and sorry for the delay from my end, was busy 
> > > with profile quality work.
> > > ProbeMatchSpecs is a mechanism to match probes belonging to another 
> > > binary function. I'm going to utilize it in probe-based function matching 
> > > (#100446). For example:
> > > source function:
> > > ```
> > > void foo() {
> > >   bar();
> > > }
> > > ```
> > > 
> > > 
> > > 
> > >   
> > > 
> > > 
> > >   
> > > 
> > > 
> > > 
> > >   
> > > profiled binary: bar is not inlined => have top-level function bar new 
> > > binary (where the profile is applied to): bar is inlined into foo.
> > > Right now, BOLT does 1:1 matching between profile functions and binary 
> > > functions based on the name. #100446 will extend this to N:M where 
> > > multiple profiles can be matched to one binary function (as in the 
> > > example above where binary function foo would use profiles for foo and 
> > > bar), and one profile can be matched to multiple binary functions (eg if 
> > > bar was inlined into multiple functions).
> > > In this diff, ProbeMatchSpecs would only have one BinaryFunctionProfile 
> > > (existing name-based matching).
> > 
> > 
> > Thanks for the explanation! I was confused of the use of `ProbeMatchSpecs`, 
> > it would be great to add more descriptions in the diff summary or somewhere 
> > in the comments, or in the another patch when it's used(IMO if we add a 
> > feature, but we doesn't use it in the patch, it should be in the future 
> > patch when it's used)
> 
> Thank you. Added description of ProbeMatchSpecs to the summary. I decided to 
> introduce it in this diff because there's tight coupling between probe-based 
> block matching and function matching. This way, probe-based function matching 
> would not need to change how block matching works.

Makes sense, thanks for clarifying!

https://github.com/llvm/llvm-project/pull/99891
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-11-08 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> > > > Ping @wlei-llvm
> > > 
> > > 
> > > Sorry for the delay. The new version addressed my last comment (with just 
> > > minor nits). However, I didn't fully follow the new features related to 
> > > `ProbeMatchSpecs` stuffs. Could you add more descriptions to the diff 
> > > summary? Or if it’s not a lot of work, could we split it into two 
> > > patches? We could commit the first part, and I will review the second 
> > > part separately.
> > 
> > 
> > NVM, I think now I get what `ProbeMatchSpecs` does, it's a vector because a 
> > function can have multiple sections(function split)
> 
> Thank you for reviewing and sorry for the delay from my end, was busy with 
> profile quality work.
> 
> ProbeMatchSpecs is a mechanism to match probes belonging to another binary 
> function. I'm going to utilize it in probe-based function matching (#100446). 
> For example:
> 
> source function:
> 
> ```
> void foo() {
>   bar();
> }
> ```
> 
> profiled binary: bar is not inlined => have top-level function bar new binary 
> (where the profile is applied to): bar is inlined into foo.
> 
> Right now, BOLT does 1:1 matching between profile functions and binary 
> functions based on the name. #100446 will extend this to N:M where multiple 
> profiles can be matched to one binary function (as in the example above where 
> binary function foo would use profiles for foo and bar), and one profile can 
> be matched to multiple binary functions (eg if bar was inlined into multiple 
> functions).
> 
> In this diff, ProbeMatchSpecs would only have one BinaryFunctionProfile 
> (existing name-based matching).

Thanks for the explanation!  I was confused of the use of `ProbeMatchSpecs`, it 
would be great to add more descriptions in the diff summary or somewhere in the 
comments, or in the another patch when it's used(IMO if we add a feature, but 
we doesn't use it in the patch, it should be in the future patch when it's used)

https://github.com/llvm/llvm-project/pull/99891
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [compiler-rt] [libc] [lld] [lldb] [llvm] [mlir] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-11-08 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm approved this pull request.

LGTM, thanks for addressing the comments!

https://github.com/llvm/llvm-project/pull/99891
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-04-18 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm created 
https://github.com/llvm/llvm-project/pull/136333

 Add a new instrumentation section type `[coldcov]` to support`-fprofile-list` 
for cold function coverage. Note that the current cold function coverage is 
based on sampling PGO pipeline, which is incompatible with the existing [llvm] 
option(see 
[PGOOptions](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/PGOOptions.h#L27-L43)),
 so we can't reuse the IR-PGO(-fprofile-instrument=llvm) flag.

>From 3d770595ce6568148ddfe2596f1a53dfc2ee751f Mon Sep 17 00:00:00 2001
From: wlei 
Date: Fri, 18 Apr 2025 10:37:30 -0700
Subject: [PATCH] [Inverge] Support -fprofile-list for cold function coveragege

---
 clang/docs/UsersManual.rst | 6 +++---
 clang/include/clang/Basic/CodeGenOptions.def   | 2 +-
 clang/include/clang/Basic/CodeGenOptions.h | 2 ++
 clang/include/clang/Driver/Options.td  | 4 ++--
 clang/lib/Basic/ProfileList.cpp| 2 ++
 clang/lib/Driver/ToolChains/Clang.cpp  | 1 +
 clang/test/CodeGen/profile-filter.c| 7 +++
 .../test/Driver/fprofile-generate-cold-function-coverage.c | 2 +-
 8 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 69256527f40c9..47bd591d20a27 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -3394,9 +3394,9 @@ This can be done using the ``-fprofile-list`` option.
 
 $ clang++ -O2 -fprofile-instr-generate -fcoverage-mapping 
-fprofile-list=fun.list -fprofile-list=code.list code.cc -o code
 
-Supported sections are ``[clang]``, ``[llvm]``, and ``[csllvm]`` representing
-clang PGO, IRPGO, and CSIRPGO, respectively. Supported prefixes are 
``function``
-and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``.
+Supported sections are ``[clang]``, ``[llvm]``, ``[csllvm]``, and 
``[coldcov]`` representing
+clang PGO, IRPGO, CSIRPGO and cold function coverage, respectively. Supported 
prefixes are 
+``function`` and ``source``. Supported categories are ``allow``, ``skip``, and 
``forbid``.
 ``skip`` adds the ``skipprofile`` attribute while ``forbid`` adds the
 ``noprofile`` attribute to the appropriate function. Use
 ``default:`` to specify the default category.
diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index c5990fb248689..fced5d7dcf6b7 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,7 +223,7 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os 
(==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation 
profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone)
 /// Choose profile kind for PGO use compilation.
 ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
 /// Partition functions into N groups and select only functions in group i to 
be
diff --git a/clang/include/clang/Basic/CodeGenOptions.h 
b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..60001cfc62218 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -86,6 +86,8 @@ class CodeGenOptions : public CodeGenOptionsBase {
// to use with PGO.
 ProfileIRInstr,// IR level PGO instrumentation in LLVM.
 ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in 
LLVM.
+ProfileIRColdCov, // IR level cold function coverage instrumentation in
+  // LLVM.
   };
 
   enum EmbedBitcodeKind {
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 830d3459a1320..a76f90efc4a51 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7679,9 +7679,9 @@ def fpatchable_function_entry_section_EQ
   HelpText<"Use Section instead of __patchable_function_entries">,
   MarshallingInfoString>;
 def fprofile_instrument_EQ : Joined<["-"], "fprofile-instrument=">,
-HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm">,
+HelpText<"Enable PGO instrumentation">, 
Values<"none,clang,llvm,csllvm,coldcov">,
 NormalizedValuesScope<"CodeGenOptions">,
-NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", 
"ProfileCSIRInstr"]>,
+NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", 
"ProfileCSIRInstr", "ProfileIRColdCov"]>,
 MarshallingInfoEnum, "ProfileNone">;
 def fprofile_instrument_path_EQ : Joined<["-"], "fprofile-instrument-path=">,
 HelpText<"Generate instrumente

[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-04-29 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

ping:)

https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/136333



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm edited 
https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/136333

>From 3d770595ce6568148ddfe2596f1a53dfc2ee751f Mon Sep 17 00:00:00 2001
From: wlei 
Date: Fri, 18 Apr 2025 10:37:30 -0700
Subject: [PATCH 1/2] [Inverge] Support -fprofile-list for cold function
 coveragege

---
 clang/docs/UsersManual.rst | 6 +++---
 clang/include/clang/Basic/CodeGenOptions.def   | 2 +-
 clang/include/clang/Basic/CodeGenOptions.h | 2 ++
 clang/include/clang/Driver/Options.td  | 4 ++--
 clang/lib/Basic/ProfileList.cpp| 2 ++
 clang/lib/Driver/ToolChains/Clang.cpp  | 1 +
 clang/test/CodeGen/profile-filter.c| 7 +++
 .../test/Driver/fprofile-generate-cold-function-coverage.c | 2 +-
 8 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 69256527f40c9..47bd591d20a27 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -3394,9 +3394,9 @@ This can be done using the ``-fprofile-list`` option.
 
 $ clang++ -O2 -fprofile-instr-generate -fcoverage-mapping 
-fprofile-list=fun.list -fprofile-list=code.list code.cc -o code
 
-Supported sections are ``[clang]``, ``[llvm]``, and ``[csllvm]`` representing
-clang PGO, IRPGO, and CSIRPGO, respectively. Supported prefixes are 
``function``
-and ``source``. Supported categories are ``allow``, ``skip``, and ``forbid``.
+Supported sections are ``[clang]``, ``[llvm]``, ``[csllvm]``, and 
``[coldcov]`` representing
+clang PGO, IRPGO, CSIRPGO and cold function coverage, respectively. Supported 
prefixes are 
+``function`` and ``source``. Supported categories are ``allow``, ``skip``, and 
``forbid``.
 ``skip`` adds the ``skipprofile`` attribute while ``forbid`` adds the
 ``noprofile`` attribute to the appropriate function. Use
 ``default:`` to specify the default category.
diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index c5990fb248689..fced5d7dcf6b7 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -223,7 +223,7 @@ AFFECTING_VALUE_CODEGENOPT(OptimizeSize, 2, 0) ///< If -Os 
(==1) or -Oz (==2) is
 CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(ContinuousProfileSync, 1, 0) ///< Enable continuous instrumentation 
profiling
 /// Choose profile instrumenation kind or no instrumentation.
-ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 4, ProfileNone)
 /// Choose profile kind for PGO use compilation.
 ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
 /// Partition functions into N groups and select only functions in group i to 
be
diff --git a/clang/include/clang/Basic/CodeGenOptions.h 
b/clang/include/clang/Basic/CodeGenOptions.h
index e39a73bdb13ac..60001cfc62218 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -86,6 +86,8 @@ class CodeGenOptions : public CodeGenOptionsBase {
// to use with PGO.
 ProfileIRInstr,// IR level PGO instrumentation in LLVM.
 ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in 
LLVM.
+ProfileIRColdCov, // IR level cold function coverage instrumentation in
+  // LLVM.
   };
 
   enum EmbedBitcodeKind {
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 830d3459a1320..a76f90efc4a51 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7679,9 +7679,9 @@ def fpatchable_function_entry_section_EQ
   HelpText<"Use Section instead of __patchable_function_entries">,
   MarshallingInfoString>;
 def fprofile_instrument_EQ : Joined<["-"], "fprofile-instrument=">,
-HelpText<"Enable PGO instrumentation">, Values<"none,clang,llvm,csllvm">,
+HelpText<"Enable PGO instrumentation">, 
Values<"none,clang,llvm,csllvm,coldcov">,
 NormalizedValuesScope<"CodeGenOptions">,
-NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", 
"ProfileCSIRInstr"]>,
+NormalizedValues<["ProfileNone", "ProfileClangInstr", "ProfileIRInstr", 
"ProfileCSIRInstr", "ProfileIRColdCov"]>,
 MarshallingInfoEnum, "ProfileNone">;
 def fprofile_instrument_path_EQ : Joined<["-"], "fprofile-instrument-path=">,
 HelpText<"Generate instrumented code to collect execution counts into "
diff --git a/clang/lib/Basic/ProfileList.cpp b/clang/lib/Basic/ProfileList.cpp
index 8fa16e2eb069a..0991867e1ac3a 100644
--- a/clang/lib/Basic/ProfileList.cpp
+++ b/clang/lib/Basic/ProfileList.cpp
@@ -81,6 +81,8 @@ static StringRef 
getSectionName(CodeGenOptions::ProfileInstrKind Kind) {
 return "llvm";
   case CodeGenOptions::ProfileCSIRInstr:
 return "csllvm";
+  

[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm updated 
https://github.com/llvm/llvm-project/pull/136333



  



Rate limit · GitHub


  body {
background-color: #f6f8fa;
color: #24292e;
font-family: -apple-system,BlinkMacSystemFont,Segoe 
UI,Helvetica,Arial,sans-serif,Apple Color Emoji,Segoe UI Emoji,Segoe UI Symbol;
font-size: 14px;
line-height: 1.5;
margin: 0;
  }

  .container { margin: 50px auto; max-width: 600px; text-align: center; 
padding: 0 24px; }

  a { color: #0366d6; text-decoration: none; }
  a:hover { text-decoration: underline; }

  h1 { line-height: 60px; font-size: 48px; font-weight: 300; margin: 0px; 
text-shadow: 0 1px 0 #fff; }
  p { color: rgba(0, 0, 0, 0.5); margin: 20px 0 40px; }

  ul { list-style: none; margin: 25px 0; padding: 0; }
  li { display: table-cell; font-weight: bold; width: 1%; }

  .logo { display: inline-block; margin-top: 35px; }
  .logo-img-2x { display: none; }
  @media
  only screen and (-webkit-min-device-pixel-ratio: 2),
  only screen and (   min--moz-device-pixel-ratio: 2),
  only screen and ( -o-min-device-pixel-ratio: 2/1),
  only screen and (min-device-pixel-ratio: 2),
  only screen and (min-resolution: 192dpi),
  only screen and (min-resolution: 2dppx) {
.logo-img-1x { display: none; }
.logo-img-2x { display: inline-block; }
  }

  #suggestions {
margin-top: 35px;
color: #ccc;
  }
  #suggestions a {
color: #66;
font-weight: 200;
font-size: 14px;
margin: 0 10px;
  }


  
  



  Whoa there!
  You have exceeded a secondary rate limit.
Please wait a few minutes before you try again;
in some cases this may take up to an hour.
  
  
https://support.github.com/contact";>Contact Support —
https://githubstatus.com";>GitHub Status —
https://twitter.com/githubstatus";>@githubstatus
  

  

  

  

  

  


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits

wlei-llvm wrote:

> LGTM, but I'd like to bikeshed the name a bit. Since this is only used for 
> the Sample PGO pipeline, should we include Sample PGO in the name?

Sounds good, I will go with `[sample-coldcov]` if that works for you.

https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits


@@ -9,6 +9,9 @@
 // RUN: echo -e "[clang]\nfun:test1\n[llvm]\nfun:test2" > %t-section.list
 // RUN: %clang_cc1 -fprofile-instrument=llvm -fprofile-list=%t-section.list 
-emit-llvm %s -o - | FileCheck %s --check-prefix=SECTION
 
+// RUN: echo -e "[coldcov]\nfun:test*\n!fun:test2" > %t-cold-func.list
+// RUN: %clang_cc1 -fprofile-instrument=coldcov 
-fprofile-list=%t-cold-func.list -emit-llvm %s -o - | FileCheck %s 
--check-prefix=COLDCOV

wlei-llvm wrote:

There is also a explicit check noprofile ("// COLDCOV: noprofile"), so I guess 
`implicit-check-not` doesn't work here.

https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-07 Thread Lei Wang via cfe-commits


@@ -9,6 +9,9 @@
 // RUN: echo -e "[clang]\nfun:test1\n[llvm]\nfun:test2" > %t-section.list
 // RUN: %clang_cc1 -fprofile-instrument=llvm -fprofile-list=%t-section.list 
-emit-llvm %s -o - | FileCheck %s --check-prefix=SECTION
 
+// RUN: echo -e "[coldcov]\nfun:test*\n!fun:test2" > %t-cold-func.list
+// RUN: %clang_cc1 -fprofile-instrument=coldcov 
-fprofile-list=%t-cold-func.list -emit-llvm %s -o - | FileCheck %s 
--check-prefix=COLDCOV
+

wlei-llvm wrote:

improved the test with `split-file`, thanks!

https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Coverage] Support -fprofile-list for cold function coverage (PR #136333)

2025-05-08 Thread Lei Wang via cfe-commits

https://github.com/wlei-llvm closed 
https://github.com/llvm/llvm-project/pull/136333
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits