[llvm-branch-commits] [compiler-rt] [profile] Change __llvm_profile_counter_bias type to match llvm (PR #107362)

2024-09-11 Thread Petr Hosek via llvm-branch-commits

petrhosek wrote:

I agree. While the risk is low, this issue has existed for several releases, so 
it should be fine to delay the fix a bit further.

https://github.com/llvm/llvm-project/pull/107362
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [Clang] Fix crash due to invalid source location in __is_trivially_equality_comparable (#107815) (PR #108147)

2024-09-11 Thread Dimitry Andric via llvm-branch-commits

https://github.com/DimitryAndric approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/108147


[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-11 Thread Luke Lau via llvm-branch-commits

lukel97 wrote:

I collected stats on the number of memcmps that were inlined. It looks like 
we're able to expand a good chunk of them:
```
Program                                       expand-memcmp.NumMemCmpCalls  expand-memcmp.NumMemCmpInlined
                                                  lhs      rhs    diff          lhs      rhs  diff
FP2017rate/510.parest_r/510.parest_r           410.00   468.00   14.1%                104.00  inf%
INT2017speed/602.gcc_s/602.gcc_s                83.00    92.00   10.8%                 36.00  inf%
INT2017rate/502.gcc_r/502.gcc_r                 83.00    92.00   10.8%                 36.00  inf%
INT2017spe...00.perlbench_s/600.perlbench_s    207.00   220.00    6.3%                120.00  inf%
INT2017rat...00.perlbench_r/500.perlbench_r    207.00   220.00    6.3%                120.00  inf%
INT2017spe...ed/620.omnetpp_s/620.omnetpp_s    304.00   306.00    0.7%                 13.00  inf%
INT2017rate/520.omnetpp_r/520.omnetpp_r        304.00   306.00    0.7%                 13.00  inf%
FP2017rate/508.namd_r/508.namd_r                13.00    13.00    0.0%                 13.00  inf%
INT2017rate/541.leela_r/541.leela_r             40.00    40.00    0.0%                  3.00  inf%
INT2017speed/641.leela_s/641.leela_s            40.00    40.00    0.0%                  3.00  inf%
INT2017speed/625.x264_s/625.x264_s               8.00     8.00    0.0%                  6.00  inf%
INT2017spe...23.xalancbmk_s/623.xalancbmk_s      8.00     8.00    0.0%                  6.00  inf%
INT2017rate/557.xz_r/557.xz_r                    6.00     6.00    0.0%                  4.00  inf%
INT2017rat...23.xalancbmk_r/523.xalancbmk_r      8.00     8.00    0.0%                  6.00  inf%
INT2017rate/525.x264_r/525.x264_r                8.00     8.00    0.0%                  6.00  inf%
FP2017speed/644.nab_s/644.nab_s                 77.00    77.00    0.0%                 71.00  inf%
FP2017speed/638.imagick_s/638.imagick_s          3.00     3.00    0.0%
FP2017rate/544.nab_r/544.nab_r                  77.00    77.00    0.0%                 71.00  inf%
FP2017rate/538.imagick_r/538.imagick_r           3.00     3.00    0.0%
FP2017rate/526.blender_r/526.blender_r          41.00    41.00    0.0%                 27.00  inf%
FP2017rate/511.povray_r/511.povray_r             5.00     5.00    0.0%                  5.00  inf%
INT2017speed/657.xz_s/657.xz_s                   6.00     6.00    0.0%                  4.00  inf%
```
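
For reading the diff columns: they appear to be the relative change against
the lhs baseline. Below is a minimal C++ sketch of that arithmetic, assuming
diff = (rhs - lhs) / lhs and that a zero or absent baseline is shown as inf%.
The helper name `relativeDiff` is hypothetical and is not part of any LLVM
tooling.

```cpp
#include <limits>

// Relative change of rhs against the lhs baseline, as a fraction.
// Assumption of this sketch: a zero baseline is reported as +infinity,
// which would correspond to the "inf%" entries. rhs is assumed positive
// in that case.
constexpr double relativeDiff(double lhs, double rhs) {
  return lhs == 0.0 ? std::numeric_limits<double>::infinity()
                    : (rhs - lhs) / lhs;
}
```

Under that assumption, the 510.parest_r row works out as (468 - 410) / 410,
roughly 0.141, which agrees with the 14.1% shown.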

There's a small difference in the number of original memcmp calls. Some merge 
commits in this branch might have changed the codegen slightly in the 
meantime.
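
For context on what "inlined" means for these counters: expanding a memcmp
with a small constant size typically replaces the library call with wide
loads and integer compares. The sketch below illustrates that idea for a
4-byte compare in plain C++. It shows the general technique only, not the
code this PR generates for RISC-V, and all function names are hypothetical.

```cpp
#include <cstdint>

// Assemble four bytes into a big-endian 32-bit value. A backend would
// typically use one 32-bit load plus a byte swap on little-endian targets.
constexpr std::uint32_t loadBE32(const unsigned char *p) {
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
         (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Three-way result with the same sign as memcmp(a, b, 4): comparing the
// big-endian values orders the buffers by their first differing byte.
constexpr int memcmp4(const unsigned char *a, const unsigned char *b) {
  return (loadBE32(a) > loadBE32(b)) - (loadBE32(a) < loadBE32(b));
}

// Equality-only form, for callers that only test memcmp(a, b, 4) == 0.
// Byte order does not matter here, so no swap would be needed.
constexpr bool memeq4(const unsigned char *a, const unsigned char *b) {
  return (loadBE32(a) ^ loadBE32(b)) == 0;
}
```

The equality-only form is usually cheaper, which is one reason a call that is
only compared against zero is an attractive candidate for expansion.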

I'm working on getting some runtime numbers now, sorry for the delay

https://github.com/llvm/llvm-project/pull/107548


[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-11 Thread Pengcheng Wang via llvm-branch-commits

wangpc-pp wrote:

> I'm working on getting some runtime numbers now, sorry for the delay.

No need to say sorry, I really appreciate your help! 😃 

https://github.com/llvm/llvm-project/pull/107548


[llvm-branch-commits] [flang] [flang] Introduce custom loop nest generation for loops in workshare construct (PR #101445)

2024-09-11 Thread via llvm-branch-commits

https://github.com/jeanPerier approved this pull request.

Looks reasonable to me.

https://github.com/llvm/llvm-project/pull/101445


[llvm-branch-commits] [llvm] release/19.x: [Windows SEH] Fix crash on empty seh block (#107031) (PR #107466)

2024-09-11 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/107466


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-11 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/106977


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-11 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> I am unsure what ROCm is in this case and what that would mean for LLVM users 
> if this fix is not included.

ROCm is AMD's GPU compute stack. Current ROCm releases are based on the LLVM 
release branches (in the past they were cut from more arbitrary branch points).

https://github.com/llvm/llvm-project/pull/106977


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-09-11 Thread Ilya Biryukov via llvm-branch-commits

ilya-biryukov wrote:

Just wanted to give a short update: I'm still on it, but it is taking longer 
than I initially anticipated.
I got entangled in a few spaghetti-like dependencies, and untangling them 
takes a bit of time. I should be much closer now and hope to share something 
tomorrow.

https://github.com/llvm/llvm-project/pull/83237


[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/99891

From 36197b175681d07b4704e576fb008cec3cc1e05e Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Wed, 28 Aug 2024 21:10:25 +0200
Subject: [PATCH 1/2] Reworked block probe matching

Use new probe ifaces
Get all function probes at once
Drop ProfileUsePseudoProbes
Unify matchWithBlockPseudoProbes
Distinguish exact and loose probe match
---
 bolt/include/bolt/Core/BinaryContext.h|  20 +-
 bolt/lib/Passes/BinaryPasses.cpp  |  40 ++-
 bolt/lib/Profile/StaleProfileMatching.cpp | 404 ++
 bolt/lib/Rewrite/PseudoProbeRewriter.cpp  |   8 +-
 4 files changed, 237 insertions(+), 235 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h b/bolt/include/bolt/Core/BinaryContext.h
index 3e20cb607e657b..3f7b2ac0bc6cf9 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -724,14 +724,26 @@ class BinaryContext {
 uint32_t NumStaleBlocks{0};
 ///   the number of exactly matched basic blocks
 uint32_t NumExactMatchedBlocks{0};
-///   the number of pseudo probe matched basic blocks
-uint32_t NumPseudoProbeMatchedBlocks{0};
+///   the number of loosely matched basic blocks
+uint32_t NumLooseMatchedBlocks{0};
+///   the number of exactly pseudo probe matched basic blocks
+uint32_t NumPseudoProbeExactMatchedBlocks{0};
+///   the number of loosely pseudo probe matched basic blocks
+uint32_t NumPseudoProbeLooseMatchedBlocks{0};
+///   the number of call matched basic blocks
+uint32_t NumCallMatchedBlocks{0};
 ///   the total count of samples in the profile
 uint64_t StaleSampleCount{0};
 ///   the count of exactly matched samples
 uint64_t ExactMatchedSampleCount{0};
-///   the count of pseudo probe matched samples
-uint64_t PseudoProbeMatchedSampleCount{0};
+///   the count of loosely matched samples
+uint64_t LooseMatchedSampleCount{0};
+///   the count of exactly pseudo probe matched samples
+uint64_t PseudoProbeExactMatchedSampleCount{0};
+///   the count of loosely pseudo probe matched samples
+uint64_t PseudoProbeLooseMatchedSampleCount{0};
+///   the count of call matched samples
+uint64_t CallMatchedSampleCount{0};
 ///   the number of stale functions that have matching number of blocks in
 ///   the profile
 uint64_t NumStaleFuncsWithEqualBlockCount{0};
diff --git a/bolt/lib/Passes/BinaryPasses.cpp b/bolt/lib/Passes/BinaryPasses.cpp
index b786f07a6a6651..8edbd58c3ed3de 100644
--- a/bolt/lib/Passes/BinaryPasses.cpp
+++ b/bolt/lib/Passes/BinaryPasses.cpp
@@ -1524,15 +1524,43 @@ Error PrintProgramStats::runOnFunctions(BinaryContext &BC) {
 100.0 * BC.Stats.ExactMatchedSampleCount / BC.Stats.StaleSampleCount,
 BC.Stats.ExactMatchedSampleCount, BC.Stats.StaleSampleCount);
 BC.outs() << format(
-"BOLT-INFO: inference found a pseudo probe match for %.2f%% of basic "
+"BOLT-INFO: inference found an exact pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeExactMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeExactMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeExactMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeExactMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a loose pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeLooseMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeLooseMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeLooseMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeLooseMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a call match for %.2f%% of basic "
 "blocks"
 " (%zu out of %zu stale) responsible for %.2f%% samples"
 " (%zu out of %zu stale)\n",
-100.0 * BC.Stats.NumPseudoProbeMatchedBlocks / BC.Stats.NumStaleBlocks,
-BC.Stats.NumPseudoProbeMatchedBlocks, BC.Stats.NumStaleBlocks,
-100.0 * BC.Stats.PseudoProbeMatchedSampleCount /
-BC.Stats.StaleSampleCount,
-BC.Stats.PseudoProbeMatchedSampleCount, BC.Stats.StaleSampleCount);
+100.0 * BC.Stats.NumCallMatchedBlocks / BC.Stats.NumStaleBlocks,
+BC.Stats.NumCallMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.CallMatchedSampleCount / BC.Stats.StaleSampleCount,
+BC.Stats.CallMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC
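
Each BOLT-INFO line in the hunk above reports one match category as a share
of the stale blocks and of the stale samples. A minimal standalone sketch of
that arithmetic and message shape follows. `percentOf` and `formatMatchLine`
are hypothetical helpers, plain `snprintf` stands in for the `format` call
used in the patch, and the zero-total guard is an assumption that the quoted
hunk does not show.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Share of `part` in `total`, in percent. Returning 0 for an empty total is
// an assumption of this sketch; the quoted hunk divides unconditionally.
constexpr double percentOf(std::uint64_t part, std::uint64_t total) {
  return total == 0 ? 0.0 : 100.0 * part / total;
}

// Build one statistics line for a match category such as
// "an exact pseudo probe", "a loose pseudo probe" or "a call".
inline std::string formatMatchLine(const char *kind, std::uint64_t blocks,
                                   std::uint64_t staleBlocks,
                                   std::uint64_t samples,
                                   std::uint64_t staleSamples) {
  char buf[256];
  std::snprintf(buf, sizeof(buf),
                "BOLT-INFO: inference found %s match for %.2f%% of basic "
                "blocks (%llu out of %llu stale) responsible for %.2f%% "
                "samples (%llu out of %llu stale)\n",
                kind, percentOf(blocks, staleBlocks),
                static_cast<unsigned long long>(blocks),
                static_cast<unsigned long long>(staleBlocks),
                percentOf(samples, staleSamples),
                static_cast<unsigned long long>(samples),
                static_cast<unsigned long long>(staleSamples));
  return std::string(buf);
}
```

Calling `formatMatchLine("a call", 1, 4, 10, 40)` would report 25.00% for
both blocks and samples.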

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/108288

Backport 866b93e6b33fac9a4bc62bbc32199bd98f434784

Requested by: @jonathonpenix

From e9246648ae352b514bad7570f0509b1337217619 Mon Sep 17 00:00:00 2001
From: Jonathon Penix 
Date: Wed, 11 Sep 2024 09:53:11 -0700
Subject: [PATCH] [RISCV] Don't outline pcrel_lo when the function has a
 section prefix (#107943)

GNU ld will error when encountering a pcrel_lo whose corresponding
pcrel_hi is in a different section. [1] introduced a check to help
prevent this issue by preventing outlining in a few circumstances.
However, we can also hit this same issue when outlining from functions
with prefixes ("hot"/"unlikely"/"unknown" from profile information, for
example) as the outlined function might not have the same prefix,
possibly resulting in a "paired" pcrel_lo and pcrel_hi ending up in
different sections.

To prevent this issue, take a similar approach as [1] and additionally
prevent outlining when we see a pcrel_lo and the function has a prefix.

[1]
https://github.com/llvm/llvm-project/commit/96c85f80f0d615ffde0f85d8270e0a8c9f4e5430

Fixes #107520

(cherry picked from commit 866b93e6b33fac9a4bc62bbc32199bd98f434784)
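
The rule described in the commit message can be modelled outside of LLVM as
a small predicate. This is a sketch only: `FunctionInfo` and
`pcrelLoOutliningIsIllegal` are hypothetical names, and the boolean fields
stand in for the queries made on the real function and target objects.

```cpp
// Properties of the function an instruction would be outlined from.
// Each field models one query from the condition in the patch.
struct FunctionInfo {
  bool functionSections; // -ffunction-sections style placement is enabled
  bool hasComdat;        // function is in a COMDAT group
  bool hasSection;       // function has an explicit section
  bool hasSectionPrefix; // function has a section prefix such as "hot"
};

// An operand carrying a pcrel_lo relocation must stay in the same section
// as its paired pcrel_hi. Outlining is rejected whenever the outlined
// function could be placed in a different section than the original.
constexpr bool pcrelLoOutliningIsIllegal(bool operandIsPcrelLo,
                                         const FunctionInfo &f) {
  return operandIsPcrelLo && (f.functionSections || f.hasComdat ||
                              f.hasSection || f.hasSectionPrefix);
}
```

The last field is the case this backport adds; the other three correspond to
the earlier check referenced as [1].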
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp  |   2 +-
 .../RISCV/machineoutliner-pcrel-lo.mir| 104 +-
 2 files changed, 99 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index ba3b4bd701d634..6c0cbeadebf431 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -2902,7 +2902,7 @@ RISCVInstrInfo::getOutliningTypeImpl(MachineBasicBlock::iterator &MBBI,
 // if any possible.
 if (MO.getTargetFlags() == RISCVII::MO_PCREL_LO &&
 (MI.getMF()->getTarget().getFunctionSections() || F.hasComdat() ||
- F.hasSection()))
+ F.hasSection() || F.getSectionPrefix()))
   return outliner::InstrType::Illegal;
   }
 
diff --git a/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir b/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
index 8a83543b0280fd..fd3630bcfad256 100644
--- a/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
+++ b/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
@@ -18,6 +18,9 @@
   define i32 @foo2(i32 %a, i32 %b) comdat { ret i32 0 }
 
   define i32 @foo3(i32 %a, i32 %b) section ".abc" { ret i32 0 }
+
+  define i32 @foo4(i32 %a, i32 %b) !section_prefix !0 { ret i32 0 }
+  !0 = !{!"function_section_prefix", !"myprefix"}
 ...
 ---
 name:foo
@@ -27,23 +30,24 @@ body: |
   ; CHECK: bb.0:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.2:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.3:
   ; CHECK-NEXT:   PseudoRET
+  ;
   ; CHECK-FS-LABEL: name: foo
   ; CHECK-FS: bb.0:
   ; CHECK-FS-NEXT:   liveins: $x10, $x11, $x13
@@ -109,26 +113,27 @@ body: |
   ; CHECK: bb.0:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x11
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) @OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, implicit-def $x12, implicit $x10, implicit $x1

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/108288


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread via llvm-branch-commits

llvmbot wrote:

@wangpc-pp What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/108288


[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: None (llvmbot)


Changes

Backport 866b93e6b33fac9a4bc62bbc32199bd98f434784

Requested by: @jonathonpenix

---
Full diff: https://github.com/llvm/llvm-project/pull/108288.diff


2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+1-1) 
- (modified) llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir (+98-6) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index ba3b4bd701d634..6c0cbeadebf431 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -2902,7 +2902,7 @@ 
RISCVInstrInfo::getOutliningTypeImpl(MachineBasicBlock::iterator &MBBI,
 // if any possible.
 if (MO.getTargetFlags() == RISCVII::MO_PCREL_LO &&
 (MI.getMF()->getTarget().getFunctionSections() || F.hasComdat() ||
- F.hasSection()))
+ F.hasSection() || F.getSectionPrefix()))
   return outliner::InstrType::Illegal;
   }
 
diff --git a/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir 
b/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
index 8a83543b0280fd..fd3630bcfad256 100644
--- a/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
+++ b/llvm/test/CodeGen/RISCV/machineoutliner-pcrel-lo.mir
@@ -18,6 +18,9 @@
   define i32 @foo2(i32 %a, i32 %b) comdat { ret i32 0 }
 
   define i32 @foo3(i32 %a, i32 %b) section ".abc" { ret i32 0 }
+
+  define i32 @foo4(i32 %a, i32 %b) !section_prefix !0 { ret i32 0 }
+  !0 = !{!"function_section_prefix", !"myprefix"}
 ...
 ---
 name:foo
@@ -27,23 +30,24 @@ body: |
   ; CHECK: bb.0:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.2:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11, implicit $x13
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.3:
   ; CHECK-NEXT:   PseudoRET
+  ;
   ; CHECK-FS-LABEL: name: foo
   ; CHECK-FS: bb.0:
   ; CHECK-FS-NEXT:   liveins: $x10, $x11, $x13
@@ -109,26 +113,27 @@ body: |
   ; CHECK: bb.0:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11
   ; CHECK-NEXT:   $x11 = LW killed renamable $x13, 
target-flags(riscv-pcrel-lo)  :: (dereferenceable load 
(s32) from @bar)
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.1:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_1, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11
+  ; CHECK-NEXT:   $x5 = PseudoCALLReg target-flags(riscv-call) 
@OUTLINED_FUNCTION_0, implicit-def $x5, implicit-def $x10, implicit-def $x11, 
implicit-def $x12, implicit $x10, implicit $x11
   ; CHECK-NEXT:   $x11 = LW killed renamable $x13, 
target-flags(riscv-pcrel-lo)  :: (dereferenceable load 
(s32) from @bar)
   ; CHECK-NEXT:   PseudoBR %bb.3
   ; CHECK-NEXT: {{  $}}
   ; CHECK-NEXT: bb.2:
   ; CHECK-NEXT:   liveins: $x10, $x11, $x13
   ; CHECK-NEXT: {{  $}}
-  ; CHECK-NEXT:   $x5 = PseudoCALLReg tar

[llvm-branch-commits] [BOLT][NFC] Rename profile-use-pseudo-probes (PR #106364)

2024-09-11 Thread Rafael Auler via llvm-branch-commits

https://github.com/rafaelauler approved this pull request.

Makes sense, LG

https://github.com/llvm/llvm-project/pull/106364
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop blocks without profile in BAT YAML (PR #107970)

2024-09-11 Thread Rafael Auler via llvm-branch-commits

https://github.com/rafaelauler approved this pull request.

LG

https://github.com/llvm/llvm-project/pull/107970
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with pseudo probes (PR #100446)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/100446

>From 56b45b104a2ab2dbc4ab8e9643c90092894b579e Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:29:22 -0700
Subject: [PATCH 1/4] Comment

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index 6c00f82302fb92..bc09751fcae75e 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -108,7 +108,7 @@ class YAMLProfileReader : public ProfileReaderBase {
   std::vector YamlProfileToFunction;
 
   using FunctionSet = std::unordered_set;
-  /// To keep track of functions that have a matched profile before the profilez
+  /// To keep track of functions that have a matched profile before the profile
   /// is attributed.
   FunctionSet ProfiledFunctions;
 

>From b851ca65c2bf2a9569315d62722b60a04c8102ee Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:39:48 -0700
Subject: [PATCH 2/4] Was accessing wrong YamlBF Hash, fixed

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 68af95a1cd043e..f5ac0b8e2c56a2 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -614,7 +614,7 @@ size_t 
YAMLProfileReader::matchWithPseudoProbes(BinaryContext &BC) {
 
   uint64_t MatchedWithPseudoProbes = 0;
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
-auto It = PseudoProbeDescHashToBF.find(YamlBF.Hash);
+auto It = PseudoProbeDescHashToBF.find(YamlBF.PseudoProbeDescHash);
 if (It == PseudoProbeDescHashToBF.end())
   continue;
 BinaryFunction *BF = It->second;

>From 39ba7175c9224c3584db7f5f8ca8fbed14da41e5 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:49:54 -0700
Subject: [PATCH 3/4] Changed ordering of matching

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index f5ac0b8e2c56a2..75ec4465856a15 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -770,8 +770,8 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   const size_t MatchedWithHash = matchWithHash(BC);
   const size_t MatchedWithLTOCommonName = matchWithLTOCommonName();
   const size_t MatchedWithCallGraph = matchWithCallGraph(BC);
-  const size_t MatchedWithNameSimilarity = matchWithNameSimilarity(BC);
   const size_t MatchedWithPseudoProbes = matchWithPseudoProbes(BC);
+  const size_t MatchedWithNameSimilarity = matchWithNameSimilarity(BC);
 
   for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs))
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
@@ -792,10 +792,10 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
<< " functions with matching LTO common names\n";
 outs() << "BOLT-INFO: matched " << MatchedWithCallGraph
<< " functions with call graph\n";
-outs() << "BOLT-INFO: matched " << MatchedWithNameSimilarity
-   << " functions with similar names\n";
 outs() << "BOLT-INFO: matched " << MatchedWithPseudoProbes
<< " functions with pseudo probes\n";
+outs() << "BOLT-INFO: matched " << MatchedWithNameSimilarity
+   << " functions with similar names\n";
   }
 
   // Set for parseFunctionProfile().

>From 11af7f19953da7c5ad4eb263be3d38a70b2518e0 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 12:56:56 -0700
Subject: [PATCH 4/4] Added check for YamlBF.Used in pseudo probe function
 matching

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 75ec4465856a15..8dfdf1fb30eb36 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -614,6 +614,8 @@ size_t 
YAMLProfileReader::matchWithPseudoProbes(BinaryContext &BC) {
 
   uint64_t MatchedWithPseudoProbes = 0;
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
 auto It = PseudoProbeDescHashToBF.find(YamlBF.PseudoProbeDescHash);
 if (It == PseudoProbeDescHashToBF.end())
   continue;
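
The combined effect of patches 2/4 and 4/4 on the matching loop can be sketched in isolation. This is a minimal illustration, not BOLT's code: `Function`, `ProfileEntry`, and `matchByProbeHash` are hypothetical stand-ins for `BinaryFunction`, `yaml::bolt::BinaryFunctionProfile`, and `matchWithPseudoProbes`, and the assumption that a successful match marks the entry as used is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a binary function.
struct Function {
  std::string Name;
};

// Hypothetical stand-in for one function entry of a YAML profile.
struct ProfileEntry {
  uint64_t PseudoProbeDescHash = 0;
  bool Used = false;                 // set once any matcher claims the entry
  const Function *Matched = nullptr;
};

// Match still-unclaimed profile entries to functions by descriptor hash.
// Returns the number of entries matched by this pass.
size_t matchByProbeHash(
    std::vector<ProfileEntry> &Profiles,
    const std::unordered_map<uint64_t, const Function *> &HashToFunction) {
  size_t Matched = 0;
  for (ProfileEntry &P : Profiles) {
    if (P.Used) // an earlier matcher already claimed this entry
      continue;
    // Look up by the pseudo probe descriptor hash, not the function hash.
    auto It = HashToFunction.find(P.PseudoProbeDescHash);
    if (It == HashToFunction.end())
      continue;
    P.Matched = It->second;
    P.Used = true; // assumption: a match claims the entry
    ++Matched;
  }
  return Matched;
}
```

Skipping used entries first keeps a later matcher from overriding a match made by an earlier, presumably more precise, one.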

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with pseudo probes (PR #100446)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/100446

>From 56b45b104a2ab2dbc4ab8e9643c90092894b579e Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:29:22 -0700
Subject: [PATCH 1/5] Comment

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index 6c00f82302fb92..bc09751fcae75e 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -108,7 +108,7 @@ class YAMLProfileReader : public ProfileReaderBase {
   std::vector YamlProfileToFunction;
 
   using FunctionSet = std::unordered_set;
-  /// To keep track of functions that have a matched profile before the profilez
+  /// To keep track of functions that have a matched profile before the profile
   /// is attributed.
   FunctionSet ProfiledFunctions;
 

>From b851ca65c2bf2a9569315d62722b60a04c8102ee Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:39:48 -0700
Subject: [PATCH 2/5] Was accessing wrong YamlBF Hash, fixed

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 68af95a1cd043e..f5ac0b8e2c56a2 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -614,7 +614,7 @@ size_t 
YAMLProfileReader::matchWithPseudoProbes(BinaryContext &BC) {
 
   uint64_t MatchedWithPseudoProbes = 0;
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
-auto It = PseudoProbeDescHashToBF.find(YamlBF.Hash);
+auto It = PseudoProbeDescHashToBF.find(YamlBF.PseudoProbeDescHash);
 if (It == PseudoProbeDescHashToBF.end())
   continue;
 BinaryFunction *BF = It->second;

>From 39ba7175c9224c3584db7f5f8ca8fbed14da41e5 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 11:49:54 -0700
Subject: [PATCH 3/5] Changed ordering of matching

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index f5ac0b8e2c56a2..75ec4465856a15 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -770,8 +770,8 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   const size_t MatchedWithHash = matchWithHash(BC);
   const size_t MatchedWithLTOCommonName = matchWithLTOCommonName();
   const size_t MatchedWithCallGraph = matchWithCallGraph(BC);
-  const size_t MatchedWithNameSimilarity = matchWithNameSimilarity(BC);
   const size_t MatchedWithPseudoProbes = matchWithPseudoProbes(BC);
+  const size_t MatchedWithNameSimilarity = matchWithNameSimilarity(BC);
 
   for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs))
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
@@ -792,10 +792,10 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
<< " functions with matching LTO common names\n";
 outs() << "BOLT-INFO: matched " << MatchedWithCallGraph
<< " functions with call graph\n";
-outs() << "BOLT-INFO: matched " << MatchedWithNameSimilarity
-   << " functions with similar names\n";
 outs() << "BOLT-INFO: matched " << MatchedWithPseudoProbes
<< " functions with pseudo probes\n";
+outs() << "BOLT-INFO: matched " << MatchedWithNameSimilarity
+   << " functions with similar names\n";
   }
 
   // Set for parseFunctionProfile().

>From 11af7f19953da7c5ad4eb263be3d38a70b2518e0 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 24 Jul 2024 12:56:56 -0700
Subject: [PATCH 4/5] Added check for YamlBF.Used in pseudo probe function
 matching

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 75ec4465856a15..8dfdf1fb30eb36 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -614,6 +614,8 @@ size_t 
YAMLProfileReader::matchWithPseudoProbes(BinaryContext &BC) {
 
   uint64_t MatchedWithPseudoProbes = 0;
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
 auto It = PseudoProbeDescHashToBF.find(YamlBF.PseudoProbeDescHash);
 if (It == PseudoProbeDescHashToBF.end())
   continue;

>From ad4d98fc4bf3f16d119ddbc5abad11b93641bf99 Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Wed, 11 Sep 2024 15:49:03 -0700
Subject: [PATCH 5/5] Debug logging

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 31 +++---
 1 file changed, 23 ins

[llvm-branch-commits] [BOLT][NFC] Rename profile-use-pseudo-probes (PR #106364)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/106364


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Only parse probes for profiled functions in profile-write-pseudo-probes mode (PR #106365)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/106365


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop blocks without profile in BAT YAML (PR #107970)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/107970


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop blocks without profile in BAT YAML (PR #107970)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/107970
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] 22b522a - Update sanitizer_tls_get_addr.cpp

2024-09-11 Thread via llvm-branch-commits

Author: Vitaly Buka
Date: 2024-09-11T17:29:21-07:00
New Revision: 22b522a32d76c6c471707ceb6b9b4dcb9407db27

URL: 
https://github.com/llvm/llvm-project/commit/22b522a32d76c6c471707ceb6b9b4dcb9407db27
DIFF: 
https://github.com/llvm/llvm-project/commit/22b522a32d76c6c471707ceb6b9b4dcb9407db27.diff

LOG: Update sanitizer_tls_get_addr.cpp

Added: 


Modified: 
compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp

Removed: 




diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp 
b/compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp
index bf84a2ff60c91c..a1107ff7d24737 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_tls_get_addr.cpp
@@ -152,8 +152,8 @@ DTLS::DTV *DTLS_on_tls_get_addr(void *arg_void, void *res,
 tls_size = 0;
   }
   if (tls_size) {
-CHECK_LE(tls_beg, reinterpret_cast<uptr>(res) - kDtvOffset);
-CHECK_LT(reinterpret_cast<uptr>(res) - kDtvOffset, tls_beg + tls_size);
+CHECK_LE(tls_beg, reinterpret_cast<uptr>(res));
+CHECK_LT(reinterpret_cast<uptr>(res), tls_beg + tls_size);
   }
   dtv->beg = tls_beg;
   dtv->size = tls_size;



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][GPU] Plumb range information through the NVVM lowerings (PR #107659)

2024-09-11 Thread Krzysztof Drewniak via llvm-branch-commits


@@ -209,7 +209,12 @@ struct GPULaneIdOpToNVVM : 
ConvertOpToLLVMPattern {
   ConversionPatternRewriter &rewriter) const override {
 auto loc = op->getLoc();
 MLIRContext *context = rewriter.getContext();
-Value newOp = rewriter.create(loc, rewriter.getI32Type());
+LLVM::ConstantRangeAttr bounds = nullptr;
+if (std::optional upperBound = op.getUpperBound())
+  bounds = rewriter.getAttr(
+  /*bitWidth=*/32, /*lower=*/0, upperBound->getZExtValue());
+Value newOp =
+rewriter.create(loc, rewriter.getI32Type(), bounds);

krzysz00 wrote:

Done, thanks for a good observation about the default

https://github.com/llvm/llvm-project/pull/107659
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,
+ const MCDecodedPseudoProbeInlineTree *Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  assert(Root);

wlei-llvm wrote:

nit: add an assert message here, and likewise for the other assertions (to 
comply with the LLVM coding standards :) )
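
For reference, the usual idiom for attaching a message to an assertion looks like the sketch below. `Node` and `getGuid` are hypothetical names used only for illustration, and the message text is made up.

```cpp
#include <cassert>

// Hypothetical node type, standing in for the inline tree root.
struct Node {
  int Guid;
};

// A string literal is never null, so `Cond && "msg"` has the same truth
// value as `Cond`; the text only appears in the failure diagnostic.
int getGuid(const Node *Root) {
  assert(Root && "inline tree root must not be null");
  return Root->Guid;
}
```

When the assertion fires, the printed expression includes the string, which gives the reader a reason rather than just a bare variable name.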

https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -246,10 +270,32 @@ template <> struct 
MappingTraits {
   }
 };
 
+namespace bolt {
+struct PseudoProbeDesc {
+  std::vector GUID;
+  std::vector Hash;
+  std::vector GUIDHash; // Index of hash for that GUID in Hash

wlei-llvm wrote:

Could this get another name? IIUC, it's not a hash but an index.

>// Index of hash for that GUID in Hash

I don't fully get this. Does it mean we can have different hashes for the same 
GUID? Is this a global index, or just an index among the different hashes for 
the same GUID?


https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices
+  // Assume BlockMask == 1 if no other probes are set
+  std::vector BlockProbes;

wlei-llvm wrote:

So this doesn't store all the probes, only the block probes whose IDs are > 
64, right? Maybe add a comment or rename it. IIUC, `CallProbes` does contain 
all the call probes.
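
If that reading is right, the encoding could look roughly like the sketch below. It rests on assumptions not confirmed in this thread: probe IDs are 1-based, ID `n` maps to bit `n - 1`, and anything outside 1..64 goes to an overflow list. `EncodedProbes`, `encodeProbes`, and `decodeProbes` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical compact form: a 64-bit mask plus an overflow list.
struct EncodedProbes {
  uint64_t Mask = 0;              // bit (Id - 1) is set for 1 <= Id <= 64
  std::vector<uint64_t> Overflow; // IDs that do not fit in the mask
};

EncodedProbes encodeProbes(const std::vector<uint64_t> &Ids) {
  EncodedProbes E;
  for (uint64_t Id : Ids) {
    if (Id >= 1 && Id <= 64)
      E.Mask |= uint64_t{1} << (Id - 1);
    else
      E.Overflow.push_back(Id);
  }
  return E;
}

// Returns the masked IDs in ascending order, followed by the overflow IDs.
std::vector<uint64_t> decodeProbes(const EncodedProbes &E) {
  std::vector<uint64_t> Ids;
  for (uint64_t Bit = 0; Bit < 64; ++Bit)
    if (E.Mask & (uint64_t{1} << Bit))
      Ids.push_back(Bit + 1);
  Ids.insert(Ids.end(), E.Overflow.begin(), E.Overflow.end());
  return Ids;
}
```

Under these assumptions a block whose only probe is ID 1 encodes to a mask of exactly `1`, which would make that value a natural default to omit when serializing.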

https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,
+ const MCDecodedPseudoProbeInlineTree *Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  assert(Root);
+  std::vector InlineTree;
+  InlineTreeNode Node{Root, Root->Guid, getHash(*Root), 0, 0};
+  InlineTree.emplace_back(Node);

wlei-llvm wrote:

nit: InlineTree.emplace_back(Root, Root->Guid, ...); same for the one below
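
A small sketch of the suggested form, using a simplified, hypothetical `InlineTreeNode` (the real struct in the patch has different members). One caveat worth noting: before C++20, `emplace_back` cannot initialize a plain aggregate from a parenthesized argument list, so the type needs a constructor for this to compile.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified, hypothetical node type for illustration only.
struct InlineTreeNode {
  uint64_t Guid;
  uint64_t Hash;
  uint32_t ParentId;
  uint32_t CallSiteProbe;
  // emplace_back forwards its arguments to a constructor, so one is
  // declared explicitly here.
  InlineTreeNode(uint64_t Guid, uint64_t Hash, uint32_t ParentId,
                 uint32_t CallSiteProbe)
      : Guid(Guid), Hash(Hash), ParentId(ParentId),
        CallSiteProbe(CallSiteProbe) {}
};

std::vector<InlineTreeNode> makeRootOnlyTree(uint64_t Guid, uint64_t Hash) {
  std::vector<InlineTreeNode> Tree;
  // Constructs the element in place; no named temporary is needed.
  Tree.emplace_back(Guid, Hash, 0u, 0u);
  return Tree;
}
```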

https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -2421,11 +2433,14 @@ std::error_code 
DataAggregator::writeBATYAML(BinaryContext &BC,
 const uint32_t InputOffset = BAT->translate(
 FuncAddr, OutputAddress - FuncAddr, /*IsBranchSrc=*/true);
 const unsigned BlockIndex = getBlock(InputOffset).second;
-YamlBF.Blocks[BlockIndex].PseudoProbes.emplace_back(
-yaml::bolt::PseudoProbeInfo{Probe.getGuid(), Probe.getIndex(),
-Probe.getType()});
+BlockProbes[BlockIndex].emplace_back(Probe);
   }
 }
+
+for (auto &[Block, Probes] : BlockProbes) {
+  YamlBF.Blocks[Block].PseudoProbes =
+  YAMLProfileWriter::writeBlockProbes(Probes, InlineTreeNodeId);

wlei-llvm wrote:

Could you share why the YAML `writeBlockProbes`/`convertBFInlineTree` 
functions are used in two places? One is here in 
`DataAggregator::writeBATYAML`, and the other is in 
`YAMLProfileWriter::writeProfile`. Just to learn about the context.

https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,
+ const MCDecodedPseudoProbeInlineTree *Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  assert(Root);
+  std::vector InlineTree;
+  InlineTreeNode Node{Root, Root->Guid, getHash(*Root), 0, 0};
+  InlineTree.emplace_back(Node);
+  uint32_t ParentId = 0;
+  while (ParentId != InlineTree.size()) {
+const MCDecodedPseudoProbeInlineTree *Cur = 
InlineTree[ParentId].InlineTree;
+for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren()) {
+  InlineTreeNode Node{&Child, Child.Guid, getHash(Child), ParentId,
+  std::get<1>(Child.getInlineSite())};
+  InlineTree.emplace_back(Node);
+}
+++ParentId;
+  }
+
+  return InlineTree;
+}
+
+std::tuple
+YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) 
{
+  yaml::bolt::PseudoProbeDesc Desc;
+  InlineTreeDesc InlineTree;
+
+  for (const MCDecodedPseudoProbeInlineTree &TopLev :
+   Decoder.getDummyInlineRoot().getChildren())
+InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
+
+  for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())
+++InlineTree.HashIdxMap[FuncDesc.FuncHash];

wlei-llvm wrote:

Reserve the size here? It looks like `reserve` is used for all the other maps 
but not for `HashIdxMap`.
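
As a generic illustration of the suggestion, using `std::unordered_map` rather than whatever container `HashIdxMap` actually is (that type is not visible in the quoted hunk), and with the hypothetical helper name `countHashes`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Count how often each hash occurs, sizing the table up front.
std::unordered_map<uint64_t, size_t>
countHashes(const std::vector<uint64_t> &Hashes) {
  std::unordered_map<uint64_t, size_t> Counts;
  // The number of inputs is an upper bound on the number of distinct keys,
  // so reserving it avoids rehashing while the loop inserts.
  Counts.reserve(Hashes.size());
  for (uint64_t H : Hashes)
    ++Counts[H];
  return Counts;
}
```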

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices

wlei-llvm wrote:

How much can using the mask save? If the saving is not significant, just using 
`BlockProbes` might be better for readability.

> Assume BlockMask == 1 if no other probes are set

Could you clarify this? Why do we need a special case for the `BlockMask` value 
`1` when there could still be more probes? Say we have two probes: one basic 
block probe (ID 1) and one call probe (ID 2).



https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,

wlei-llvm wrote:

Is it possible to give this a more explicit name? When I see a getter/setter 
style name, I expect it to simply return the raw data, but here it looks like 
it does some processing (or add more comments to explain it).

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices
+  // Assume BlockMask == 1 if no other probes are set
+  std::vector BlockProbes;
+  std::vector CallProbes;
+  std::vector IndCallProbes;
+  std::vector InlineTreeNodes;
 
   bool operator==(const PseudoProbeInfo &Other) const {
-return GUID == Other.GUID && Index == Other.Index;
-  }
-  bool operator!=(const PseudoProbeInfo &Other) const {
-return !(*this == Other);
+return InlineTreeIndex == Other.InlineTreeIndex &&
+   BlockProbes == Other.BlockProbes && CallProbes == Other.CallProbes 
&&
+   IndCallProbes == Other.IndCallProbes;
   }
 };
 } // end namespace bolt
 
 template <> struct MappingTraits {
   static void mapping(IO &YamlIO, bolt::PseudoProbeInfo &PI) {
-YamlIO.mapRequired("guid", PI.GUID);
-YamlIO.mapRequired("id", PI.Index);
-YamlIO.mapRequired("type", PI.Type);
+YamlIO.mapOptional("blk", PI.BlockMask, 0);

wlei-llvm wrote:

nit: `blk` makes me feel it's the only ID for the block; consider another name?

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -158,15 +164,35 @@ template <> struct 
MappingTraits {
std::vector());
 YamlIO.mapOptional("succ", BBP.Successors,
std::vector());
-YamlIO.mapOptional("pseudo_probes", BBP.PseudoProbes,
+YamlIO.mapOptional("probes", BBP.PseudoProbes,
std::vector());
   }
 };
 
+namespace bolt {
+struct InlineTreeInfo {
+  uint32_t ParentIndexDelta;
+  uint32_t CallSiteProbe;
+  // Index in PseudoProbeDesc.GUID + 1, 0 for same as previous
+  uint32_t GUIDIndex;

wlei-llvm wrote:

Is it possible to use `optional` to represent a value that is the same as the 
previous one?

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Lei Wang via llvm-branch-commits


@@ -158,15 +164,35 @@ template <> struct 
MappingTraits {
std::vector());
 YamlIO.mapOptional("succ", BBP.Successors,
std::vector());
-YamlIO.mapOptional("pseudo_probes", BBP.PseudoProbes,
+YamlIO.mapOptional("probes", BBP.PseudoProbes,
std::vector());
   }
 };
 
+namespace bolt {
+struct InlineTreeInfo {

wlei-llvm wrote:

Does this represent a single node or the whole tree? If it's a node, maybe 
rename it to `InlineTreeNode`? (Or, if that causes a name conflict, maybe 
rename this to `YamlInlineTreeNode`?)

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices

aaupov wrote:

The profile without BlockMask is 3.4% larger: the size goes from 92MB to 95MB 
(same test binary/profile). I think it's worth it, but I'm open to your 
suggestion.
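
For illustration, a minimal sketch of the bitset idea behind `BlockMask`. It assumes probe indices are 1-based and that only indices up to 64 fit in the mask; `encodeBlockMask`, `decodeBlockMask`, and the `Overflow` parameter are hypothetical names for this sketch, not the patch's API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: pack 1-based probe indices into a 64-bit mask.
// Bit (Id - 1) stands for probe Id, so a lone probe 1 gives mask 1.
// Indices that do not fit are appended to Overflow instead.
uint64_t encodeBlockMask(const std::vector<uint64_t> &Probes,
                         std::vector<uint64_t> &Overflow) {
  uint64_t Mask = 0;
  for (uint64_t Id : Probes) {
    assert(Id != 0 && "probe indices are assumed to be 1-based");
    if (Id >= 1 && Id <= 64)
      Mask |= uint64_t{1} << (Id - 1);
    else
      Overflow.push_back(Id);
  }
  return Mask;
}

// Inverse: list the probe indices whose bits are set, in ascending order.
std::vector<uint64_t> decodeBlockMask(uint64_t Mask) {
  std::vector<uint64_t> Probes;
  for (uint64_t Bit = 0; Bit < 64; ++Bit)
    if (Mask & (uint64_t{1} << Bit))
      Probes.push_back(Bit + 1);
  return Probes;
}
```

A single integer in the YAML replaces a list of small indices, which is where a size reduction of this kind would come from.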



https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices

aaupov wrote:

> Could you clarify more on this? why we need a special use for the value 1 
> BlockMask, there still could be more probes? say we can have two probes: 1 
> basic block probe(ID1) and 1 call probe(ID2).

This is a YAML size optimization: it's very common for the first block to have 
only a block probe with id 1 (~entry point block probe). In this case the YAML 
representation is `{ blk: 1 }`. We can elide it and encode it as `{ }`, and 
treat that as `{ blk: 1 }`. 
If the block has other probes, as in your example, we can't elide `blk: 1` 
because we need to distinguish `{ blk: 1, call: [ 2 ] }` from `{ call: [ 2 ] }` 
(no block probes).

This optimization only buys us ~1MB (92MB with -> 93MB without).

I agree the effect is low-ish for a special case, so I'll drop it if you react 
with thumbs up.
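
A minimal sketch of the elision rule described above, under stated assumptions: `ProbeRecord`, `blockMaskToWrite`, and `resolveBlockMask` are hypothetical names, only call probes stand in for "other probes", and a record with no probes at all is assumed never to be written (otherwise `{ }` would be ambiguous).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical in-memory form of one YAML probe record.
struct ProbeRecord {
  uint64_t BlockMask = 0;
  std::vector<uint64_t> CallProbes;
};

// Writer side: value for the `blk` key, or nullopt to omit the key.
std::optional<uint64_t> blockMaskToWrite(const ProbeRecord &R) {
  // Assumption of this sketch: records without any probes are not emitted.
  assert((R.BlockMask != 0 || !R.CallProbes.empty()) && "empty record");
  if (R.BlockMask == 0)
    return std::nullopt; // no block probes, nothing to write
  if (R.BlockMask == 1 && R.CallProbes.empty())
    return std::nullopt; // common entry-block case, serialized as `{ }`
  return R.BlockMask;    // e.g. `{ blk: 1, call: [ 2 ] }` keeps `blk: 1`
}

// Reader side: a missing `blk` key means mask 1 only when the record has no
// other probes; with other probes present it means "no block probes".
uint64_t resolveBlockMask(std::optional<uint64_t> ParsedBlk,
                          const std::vector<uint64_t> &ParsedCalls) {
  if (ParsedBlk)
    return *ParsedBlk;
  return ParsedCalls.empty() ? 1 : 0;
}
```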

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -95,24 +95,30 @@ template <> struct MappingTraits {
 
 namespace bolt {
 struct PseudoProbeInfo {
-  llvm::yaml::Hex64 GUID;
-  uint64_t Index;
-  uint8_t Type;
+  uint32_t InlineTreeIndex = 0;
+  uint64_t BlockMask = 0; // bitset with probe indices
+  // Assume BlockMask == 1 if no other probes are set
+  std::vector BlockProbes;
+  std::vector CallProbes;
+  std::vector IndCallProbes;
+  std::vector InlineTreeNodes;
 
   bool operator==(const PseudoProbeInfo &Other) const {
-return GUID == Other.GUID && Index == Other.Index;
-  }
-  bool operator!=(const PseudoProbeInfo &Other) const {
-return !(*this == Other);
+return InlineTreeIndex == Other.InlineTreeIndex &&
+   BlockProbes == Other.BlockProbes && CallProbes == Other.CallProbes 
&&
+   IndCallProbes == Other.IndCallProbes;
   }
 };
 } // end namespace bolt
 
 template <> struct MappingTraits {
   static void mapping(IO &YamlIO, bolt::PseudoProbeInfo &PI) {
-YamlIO.mapRequired("guid", PI.GUID);
-YamlIO.mapRequired("id", PI.Index);
-YamlIO.mapRequired("type", PI.Type);
+YamlIO.mapOptional("blk", PI.BlockMask, 0);

aaupov wrote:

I think `blx` is a good key name for block mask.

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,
+ const MCDecodedPseudoProbeInlineTree *Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  assert(Root);
+  std::vector InlineTree;
+  InlineTreeNode Node{Root, Root->Guid, getHash(*Root), 0, 0};
+  InlineTree.emplace_back(Node);
+  uint32_t ParentId = 0;
+  while (ParentId != InlineTree.size()) {
+const MCDecodedPseudoProbeInlineTree *Cur = 
InlineTree[ParentId].InlineTree;
+for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren()) {
+  InlineTreeNode Node{&Child, Child.Guid, getHash(Child), ParentId,
+  std::get<1>(Child.getInlineSite())};
+  InlineTree.emplace_back(Node);
+}
+++ParentId;
+  }
+
+  return InlineTree;
+}
+
+std::tuple
+YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) 
{
+  yaml::bolt::PseudoProbeDesc Desc;
+  InlineTreeDesc InlineTree;
+
+  for (const MCDecodedPseudoProbeInlineTree &TopLev :
+   Decoder.getDummyInlineRoot().getChildren())
+InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
+
+  for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())
+++InlineTree.HashIdxMap[FuncDesc.FuncHash];

aaupov wrote:

HashIdxMap stores unique hashes and we don't know their number beforehand. 

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -158,15 +164,35 @@ template <> struct 
MappingTraits {
std::vector());
 YamlIO.mapOptional("succ", BBP.Successors,
std::vector());
-YamlIO.mapOptional("pseudo_probes", BBP.PseudoProbes,
+YamlIO.mapOptional("probes", BBP.PseudoProbes,
std::vector());
   }
 };
 
+namespace bolt {
+struct InlineTreeInfo {
+  uint32_t ParentIndexDelta;
+  uint32_t CallSiteProbe;
+  // Index in PseudoProbeDesc.GUID + 1, 0 for same as previous
+  uint32_t GUIDIndex;

aaupov wrote:

If you mean using `std::optional`, I don't think so. However, we can 
distinguish `0` from "same as previous" and avoid `idx+1` by using a different 
default value (e.g. `UINT32_MAX`) for same as previous.
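
A minimal sketch of that sentinel alternative, not the patch as posted: `SameAsPrevious`, `encodeGUIDIndices`, and `decodeGUIDIndices` are hypothetical names, and the sketch assumes `UINT32_MAX` never occurs as a real index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel meaning "same GUID index as the previous node" (assumption).
constexpr uint32_t SameAsPrevious = std::numeric_limits<uint32_t>::max();

// Replace each index that repeats its predecessor with the sentinel.
// Index 0 stays a regular value, so no `idx + 1` shift is needed.
std::vector<uint32_t> encodeGUIDIndices(const std::vector<uint32_t> &Indices) {
  std::vector<uint32_t> Encoded;
  Encoded.reserve(Indices.size());
  for (std::size_t I = 0; I < Indices.size(); ++I) {
    bool Repeats = I > 0 && Indices[I] == Indices[I - 1];
    Encoded.push_back(Repeats ? SameAsPrevious : Indices[I]);
  }
  return Encoded;
}

// Expand sentinels back into explicit indices.
std::vector<uint32_t> decodeGUIDIndices(const std::vector<uint32_t> &Encoded) {
  std::vector<uint32_t> Indices;
  Indices.reserve(Encoded.size());
  for (uint32_t Value : Encoded) {
    if (Value == SameAsPrevious) {
      assert(!Indices.empty() && "first entry cannot refer to a previous one");
      Indices.push_back(Indices.back());
    } else {
      Indices.push_back(Value);
    }
  }
  return Indices;
}
```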

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -246,10 +270,32 @@ template <> struct 
MappingTraits {
   }
 };
 
+namespace bolt {
+struct PseudoProbeDesc {
+  std::vector GUID;
+  std::vector Hash;
+  std::vector GUIDHash; // Index of hash for that GUID in Hash

aaupov wrote:

> > // Index of hash for that GUID in Hash
> 
> I don't fully get this, does this mean we have different Hashes for the same 
> GUID, is this a global index or just the index for different Hash but the 
> same GUID.

No, there can only be one Hash for the same GUID. 

But different GUIDs can have the same hash (functions with 1 block will share 
0x1 if I understand correctly). Thus the Hash vector can be smaller than the 
GUID vector, and two GUIDs with different indices in the GUID vector can have 
the same index into the Hash vector.

By the way, the most frequent hashes are, in my working example:
```
hs:  [ 0x1, 0x, 0x2, 
0x3, 0x4, 0x5, 0x6, ...
```
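
A minimal sketch of the GUID/Hash/GUIDHash relationship described above. `ProbeDescSketch` and `buildDesc` are hypothetical names, and hashes are stored in first-seen order here, which is an assumption of the sketch rather than the order the patch uses.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical mirror of the three parallel vectors under discussion.
struct ProbeDescSketch {
  std::vector<uint64_t> GUID;
  std::vector<uint64_t> Hash;     // unique hashes only
  std::vector<uint32_t> GUIDHash; // GUIDHash[i] indexes Hash for GUID[i]
};

// Build the descriptor from (GUID, hash) pairs, deduplicating hashes.
ProbeDescSketch
buildDesc(const std::vector<std::pair<uint64_t, uint64_t>> &GuidToHash) {
  ProbeDescSketch Desc;
  std::unordered_map<uint64_t, uint32_t> HashIndex;
  for (const auto &[Guid, FuncHash] : GuidToHash) {
    // The number of unique hashes is not known up front, so the index is
    // assigned lazily the first time a hash is seen.
    auto [It, Inserted] = HashIndex.try_emplace(
        FuncHash, static_cast<uint32_t>(Desc.Hash.size()));
    if (Inserted)
      Desc.Hash.push_back(FuncHash);
    Desc.GUID.push_back(Guid);
    Desc.GUIDHash.push_back(It->second);
  }
  assert(Desc.GUID.size() == Desc.GUIDHash.size());
  return Desc;
}
```

With two single-block functions sharing hash 0x1, `Hash` holds one entry for both and their `GUIDHash` values are equal.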

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -58,8 +60,164 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::getInlineTree(const MCPseudoProbeDecoder &Decoder,

aaupov wrote:

Replaced with `collectInlineTree`
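
For readers of the hunk above, a minimal sketch of the traversal it performs: a breadth-first flattening where the output vector doubles as the work queue and each entry records its parent's index. `TreeNode`, `FlatNode`, and `collectTree` are hypothetical stand-ins, not the BOLT or MC types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a decoded inline tree node.
struct TreeNode {
  uint64_t Guid;
  std::vector<TreeNode> Children;
};

// One flattened entry: the source node plus the index of its parent.
struct FlatNode {
  const TreeNode *Node;
  uint64_t Guid;
  uint32_t ParentId;
};

// Breadth-first flattening; the root is its own parent (index 0).
std::vector<FlatNode> collectTree(const TreeNode *Root) {
  assert(Root && "expected a non-null root");
  std::vector<FlatNode> Flat;
  Flat.push_back({Root, Root->Guid, 0});
  for (uint32_t ParentId = 0; ParentId != Flat.size(); ++ParentId) {
    // Copy the pointer first: push_back below may reallocate Flat.
    const TreeNode *Cur = Flat[ParentId].Node;
    for (const TreeNode &Child : Cur->Children)
      Flat.push_back({&Child, Child.Guid, ParentId});
  }
  return Flat;
}
```

Because parents always precede their children in the result, a parent index can later be stored as a small non-negative delta.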

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits


@@ -2421,11 +2433,14 @@ std::error_code 
DataAggregator::writeBATYAML(BinaryContext &BC,
 const uint32_t InputOffset = BAT->translate(
 FuncAddr, OutputAddress - FuncAddr, /*IsBranchSrc=*/true);
 const unsigned BlockIndex = getBlock(InputOffset).second;
-YamlBF.Blocks[BlockIndex].PseudoProbes.emplace_back(
-yaml::bolt::PseudoProbeInfo{Probe.getGuid(), Probe.getIndex(),
-Probe.getType()});
+BlockProbes[BlockIndex].emplace_back(Probe);
   }
 }
+
+for (auto &[Block, Probes] : BlockProbes) {
+  YamlBF.Blocks[Block].PseudoProbes =
+  YAMLProfileWriter::writeBlockProbes(Probes, InlineTreeNodeId);

aaupov wrote:

We have two places where YAML function profile is constructed: 
- YAMLProfileWriter for the regular case: the profile is dumped from function 
CFG, 
- DataAggregator for BOLTed binaries/functions where we want the profile for 
the original function: the profile is reconstructed using 
[BAT](https://github.com/llvm/llvm-project/blob/8168088f0a9015bc6d930e8bc1c639dee06ca82c/bolt/docs/BAT.md)
 (BOLT Address Translation) tables.

In DataAggregator/BAT case we want to write profile for original/pre-BOLT 
binary including probe information, and there's no explicit way to iterate over 
probes in original/pre-BOLT order while they're encoded in optimized/BOLTed 
order.
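
A minimal sketch of the bucket-then-write shape used in the BAT path: probes are visited in output order, grouped by their translated block index, and only then handed off per block. `Probe`, `BlockOf`, and `groupProbesByBlock` are hypothetical stand-ins; the real address translation is done through BAT and is not modeled here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <vector>

// Hypothetical probe: an output address plus a probe index.
struct Probe {
  uint64_t Address;
  uint32_t Index;
};

// Group probes by the original block they translate to. BlockOf stands in
// for "translate the output address, then look up the input block index".
std::map<unsigned, std::vector<Probe>>
groupProbesByBlock(const std::vector<Probe> &Probes,
                   const std::function<unsigned(uint64_t)> &BlockOf) {
  assert(BlockOf && "translation callback must be set");
  std::map<unsigned, std::vector<Probe>> BlockProbes;
  for (const Probe &P : Probes)
    BlockProbes[BlockOf(P.Address)].push_back(P);
  return BlockProbes;
}
```

Each bucket can then be converted into one block's probe record, regardless of the order in which the probes were encountered.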

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/107137

>From 50c021b09950cf7d6a8f25b1ac0dec246f2325f5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 3 Sep 2024 11:38:04 -0700
Subject: [PATCH 1/4] update pseudoprobe-decoding-inline.test

Created using spr 1.3.4
---
 .../test/X86/pseudoprobe-decoding-inline.test | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test 
b/bolt/test/X86/pseudoprobe-decoding-inline.test
index 1fdd00c7ef6c4b..629dd84ab8e1dc 100644
--- a/bolt/test/X86/pseudoprobe-decoding-inline.test
+++ b/bolt/test/X86/pseudoprobe-decoding-inline.test
@@ -14,29 +14,38 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, 
{ guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT:   - { id: 1, type: 0
+# CHECK-YAML-NEXT:   - { id: 4, type: 0
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 0 }
 #
 # CHECK-YAML: name: foo
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, 
{ guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0x5CF8C24CDB18BDAC
-# CHECK-YAML: pseudo_probe_desc_hash: 0x200205A19C5B4
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 
0 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 1, 
callsite: 8 }
 #
 # CHECK-YAML: name: main
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xDB956436E78DD5FA, id: 1, type: 0 }, 
{ guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 
2, type: 0 } ]
-# CHECK-YAML: guid: 0xDB956436E78DD5FA
-# CHECK-YAML: pseudo_probe_desc_hash: 0x1
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 1, type: 0, inline_tree_id: 1 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0, inline_tree_id: 1 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xDB956436E78DD5FA, hash: 0x1, id: 
0 }
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 
1, callsite: 2 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 2, 
parent: 1, callsite: 8 }
 #
 ## Check that without --profile-write-pseudo-probes option, no pseudo probes 
are
 ## generated
 # RUN: perf2bolt 
%S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin 
-p %t.preagg --pa -w %t.yaml -o %t.fdata
 # RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-NO-OPT
 # CHECK-NO-OPT-NOT: pseudo_probes
-# CHECK-NO-OPT-NOT: guid
-# CHECK-NO-OPT-NOT: pseudo_probe_desc_hash
+# CHECK-NO-OPT-NOT: inline_tree
 
 CHECK: Report of decoding input pseudo probe binaries
 

>From 6ec4cf6bf05551d02cbf17e9edbe8d6931588ff1 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Mon, 9 Sep 2024 21:37:28 -0700
Subject: [PATCH 2/4] clang-format

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileWriter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp 
b/bolt/lib/Profile/YAMLProfileWriter.cpp
index 70e5e09e2920e5..f2609de18ce63c 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -90,7 +90,7 @@ YAMLProfileWriter::convertPseudoProbeDesc(const 
MCPseudoProbeDecoder &Decoder) {
   InlineTreeDesc InlineTree;
 
   for (const MCDecodedPseudoProbeInlineTree &TopLev :
-  Decoder.getDummyInlineRoot().getChildren())
+   Decoder.getDummyInlineRoot().getChildren())
 InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
 
   for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())

>From 852eb07f345dd1d9e77a6faead8bf0f73ff64ba7 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 10 Sep 2024 12:26:11 -0700
Subject: [PATCH 3/4] Make pseudo_probe_desc optional

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/ProfileYAMLMapping.h | 9 -
 bolt/test/X86/pseudoprobe-decoding-inline.test | 5 +++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/bolt/include/bolt/Profile/ProfileYAMLMapping.h 
b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
index 588e2f59d67e01..9cc33264d70718 100644
--- a/bolt/include/bolt/Profile/ProfileYAMLMapping.h
+++ b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
@@ -275,6 +275,12 @@ struct PseudoProbeDesc {
   std::vector GUID;
   std::vector Hash;
   std::vector GUIDHash; // Index of hash for that GUID in Hash
+
+  bool operator==(const PseudoProbeDesc &Ot

[llvm-branch-commits] [llvm] release/19.x: [RISCV] Don't outline pcrel_lo when the function has a section prefix (#107943) (PR #108288)

2024-09-11 Thread Pengcheng Wang via llvm-branch-commits

https://github.com/wangpc-pp approved this pull request.

LGTM.
This fixes a bug that has existed for a long time.

https://github.com/llvm/llvm-project/pull/108288


[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/107137

>From 50c021b09950cf7d6a8f25b1ac0dec246f2325f5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 3 Sep 2024 11:38:04 -0700
Subject: [PATCH 1/5] update pseudoprobe-decoding-inline.test

Created using spr 1.3.4
---
 .../test/X86/pseudoprobe-decoding-inline.test | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test 
b/bolt/test/X86/pseudoprobe-decoding-inline.test
index 1fdd00c7ef6c4b..629dd84ab8e1dc 100644
--- a/bolt/test/X86/pseudoprobe-decoding-inline.test
+++ b/bolt/test/X86/pseudoprobe-decoding-inline.test
@@ -14,29 +14,38 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, 
{ guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT:   - { id: 1, type: 0
+# CHECK-YAML-NEXT:   - { id: 4, type: 0
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 0 }
 #
 # CHECK-YAML: name: foo
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, 
{ guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0x5CF8C24CDB18BDAC
-# CHECK-YAML: pseudo_probe_desc_hash: 0x200205A19C5B4
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 
0 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 1, 
callsite: 8 }
 #
 # CHECK-YAML: name: main
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xDB956436E78DD5FA, id: 1, type: 0 }, 
{ guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 
2, type: 0 } ]
-# CHECK-YAML: guid: 0xDB956436E78DD5FA
-# CHECK-YAML: pseudo_probe_desc_hash: 0x1
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 1, type: 0, inline_tree_id: 1 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0, inline_tree_id: 1 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xDB956436E78DD5FA, hash: 0x1, id: 
0 }
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 
1, callsite: 2 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 2, 
parent: 1, callsite: 8 }
 #
 ## Check that without --profile-write-pseudo-probes option, no pseudo probes 
are
 ## generated
 # RUN: perf2bolt 
%S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin 
-p %t.preagg --pa -w %t.yaml -o %t.fdata
 # RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-NO-OPT
 # CHECK-NO-OPT-NOT: pseudo_probes
-# CHECK-NO-OPT-NOT: guid
-# CHECK-NO-OPT-NOT: pseudo_probe_desc_hash
+# CHECK-NO-OPT-NOT: inline_tree
 
 CHECK: Report of decoding input pseudo probe binaries
 

>From 6ec4cf6bf05551d02cbf17e9edbe8d6931588ff1 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Mon, 9 Sep 2024 21:37:28 -0700
Subject: [PATCH 2/5] clang-format

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileWriter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp 
b/bolt/lib/Profile/YAMLProfileWriter.cpp
index 70e5e09e2920e5..f2609de18ce63c 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -90,7 +90,7 @@ YAMLProfileWriter::convertPseudoProbeDesc(const 
MCPseudoProbeDecoder &Decoder) {
   InlineTreeDesc InlineTree;
 
   for (const MCDecodedPseudoProbeInlineTree &TopLev :
-  Decoder.getDummyInlineRoot().getChildren())
+   Decoder.getDummyInlineRoot().getChildren())
 InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
 
   for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())

>From 852eb07f345dd1d9e77a6faead8bf0f73ff64ba7 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 10 Sep 2024 12:26:11 -0700
Subject: [PATCH 3/5] Make pseudo_probe_desc optional

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/ProfileYAMLMapping.h | 9 -
 bolt/test/X86/pseudoprobe-decoding-inline.test | 5 +++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/bolt/include/bolt/Profile/ProfileYAMLMapping.h 
b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
index 588e2f59d67e01..9cc33264d70718 100644
--- a/bolt/include/bolt/Profile/ProfileYAMLMapping.h
+++ b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
@@ -275,6 +275,12 @@ struct PseudoProbeDesc {
   std::vector GUID;
   std::vector Hash;
   std::vector GUIDHash; // Index of hash for that GUID in Hash
+
+  bool operator==(const PseudoProbeDesc &Ot

[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/99891

>From 36197b175681d07b4704e576fb008cec3cc1e05e Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Wed, 28 Aug 2024 21:10:25 +0200
Subject: [PATCH 1/2] Reworked block probe matching

Use new probe ifaces
Get all function probes at once
Drop ProfileUsePseudoProbes
Unify matchWithBlockPseudoProbes
Distinguish exact and loose probe match
---
 bolt/include/bolt/Core/BinaryContext.h|  20 +-
 bolt/lib/Passes/BinaryPasses.cpp  |  40 ++-
 bolt/lib/Profile/StaleProfileMatching.cpp | 404 ++
 bolt/lib/Rewrite/PseudoProbeRewriter.cpp  |   8 +-
 4 files changed, 237 insertions(+), 235 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h 
b/bolt/include/bolt/Core/BinaryContext.h
index 3e20cb607e657b..3f7b2ac0bc6cf9 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -724,14 +724,26 @@ class BinaryContext {
 uint32_t NumStaleBlocks{0};
 ///   the number of exactly matched basic blocks
 uint32_t NumExactMatchedBlocks{0};
-///   the number of pseudo probe matched basic blocks
-uint32_t NumPseudoProbeMatchedBlocks{0};
+///   the number of loosely matched basic blocks
+uint32_t NumLooseMatchedBlocks{0};
+///   the number of exactly pseudo probe matched basic blocks
+uint32_t NumPseudoProbeExactMatchedBlocks{0};
+///   the number of loosely pseudo probe matched basic blocks
+uint32_t NumPseudoProbeLooseMatchedBlocks{0};
+///   the number of call matched basic blocks
+uint32_t NumCallMatchedBlocks{0};
 ///   the total count of samples in the profile
 uint64_t StaleSampleCount{0};
 ///   the count of exactly matched samples
 uint64_t ExactMatchedSampleCount{0};
-///   the count of pseudo probe matched samples
-uint64_t PseudoProbeMatchedSampleCount{0};
+///   the count of loosely matched samples
+uint64_t LooseMatchedSampleCount{0};
+///   the count of exactly pseudo probe matched samples
+uint64_t PseudoProbeExactMatchedSampleCount{0};
+///   the count of loosely pseudo probe matched samples
+uint64_t PseudoProbeLooseMatchedSampleCount{0};
+///   the count of call matched samples
+uint64_t CallMatchedSampleCount{0};
 ///   the number of stale functions that have matching number of blocks in
 ///   the profile
 uint64_t NumStaleFuncsWithEqualBlockCount{0};
diff --git a/bolt/lib/Passes/BinaryPasses.cpp b/bolt/lib/Passes/BinaryPasses.cpp
index b786f07a6a6651..8edbd58c3ed3de 100644
--- a/bolt/lib/Passes/BinaryPasses.cpp
+++ b/bolt/lib/Passes/BinaryPasses.cpp
@@ -1524,15 +1524,43 @@ Error PrintProgramStats::runOnFunctions(BinaryContext 
&BC) {
 100.0 * BC.Stats.ExactMatchedSampleCount / BC.Stats.StaleSampleCount,
 BC.Stats.ExactMatchedSampleCount, BC.Stats.StaleSampleCount);
 BC.outs() << format(
-"BOLT-INFO: inference found a pseudo probe match for %.2f%% of basic "
+"BOLT-INFO: inference found an exact pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeExactMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeExactMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeExactMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeExactMatchedSampleCount, 
BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a loose pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeLooseMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeLooseMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeLooseMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeLooseMatchedSampleCount, 
BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a call match for %.2f%% of basic "
 "blocks"
 " (%zu out of %zu stale) responsible for %.2f%% samples"
 " (%zu out of %zu stale)\n",
-100.0 * BC.Stats.NumPseudoProbeMatchedBlocks / BC.Stats.NumStaleBlocks,
-BC.Stats.NumPseudoProbeMatchedBlocks, BC.Stats.NumStaleBlocks,
-100.0 * BC.Stats.PseudoProbeMatchedSampleCount /
-BC.Stats.StaleSampleCount,
-BC.Stats.PseudoProbeMatchedSampleCount, BC.Stats.StaleSampleCount);
+100.0 * BC.Stats.NumCallMatchedBlocks / BC.Stats.NumStaleBlocks,
+BC.Stats.NumCallMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.CallMatchedSampleCount / BC.Stats.StaleSampleCount,
+BC.Stats.CallMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC

[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/99891

>From 36197b175681d07b4704e576fb008cec3cc1e05e Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Wed, 28 Aug 2024 21:10:25 +0200
Subject: [PATCH 1/2] Reworked block probe matching

Use new probe ifaces
Get all function probes at once
Drop ProfileUsePseudoProbes
Unify matchWithBlockPseudoProbes
Distinguish exact and loose probe match
---
 bolt/include/bolt/Core/BinaryContext.h|  20 +-
 bolt/lib/Passes/BinaryPasses.cpp  |  40 ++-
 bolt/lib/Profile/StaleProfileMatching.cpp | 404 ++
 bolt/lib/Rewrite/PseudoProbeRewriter.cpp  |   8 +-
 4 files changed, 237 insertions(+), 235 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h 
b/bolt/include/bolt/Core/BinaryContext.h
index 3e20cb607e657b..3f7b2ac0bc6cf9 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -724,14 +724,26 @@ class BinaryContext {
 uint32_t NumStaleBlocks{0};
 ///   the number of exactly matched basic blocks
 uint32_t NumExactMatchedBlocks{0};
-///   the number of pseudo probe matched basic blocks
-uint32_t NumPseudoProbeMatchedBlocks{0};
+///   the number of loosely matched basic blocks
+uint32_t NumLooseMatchedBlocks{0};
+///   the number of exactly pseudo probe matched basic blocks
+uint32_t NumPseudoProbeExactMatchedBlocks{0};
+///   the number of loosely pseudo probe matched basic blocks
+uint32_t NumPseudoProbeLooseMatchedBlocks{0};
+///   the number of call matched basic blocks
+uint32_t NumCallMatchedBlocks{0};
 ///   the total count of samples in the profile
 uint64_t StaleSampleCount{0};
 ///   the count of exactly matched samples
 uint64_t ExactMatchedSampleCount{0};
-///   the count of pseudo probe matched samples
-uint64_t PseudoProbeMatchedSampleCount{0};
+///   the count of loosely matched samples
+uint64_t LooseMatchedSampleCount{0};
+///   the count of exactly pseudo probe matched samples
+uint64_t PseudoProbeExactMatchedSampleCount{0};
+///   the count of loosely pseudo probe matched samples
+uint64_t PseudoProbeLooseMatchedSampleCount{0};
+///   the count of call matched samples
+uint64_t CallMatchedSampleCount{0};
 ///   the number of stale functions that have matching number of blocks in
 ///   the profile
 uint64_t NumStaleFuncsWithEqualBlockCount{0};
diff --git a/bolt/lib/Passes/BinaryPasses.cpp b/bolt/lib/Passes/BinaryPasses.cpp
index b786f07a6a6651..8edbd58c3ed3de 100644
--- a/bolt/lib/Passes/BinaryPasses.cpp
+++ b/bolt/lib/Passes/BinaryPasses.cpp
@@ -1524,15 +1524,43 @@ Error PrintProgramStats::runOnFunctions(BinaryContext &BC) {
 100.0 * BC.Stats.ExactMatchedSampleCount / BC.Stats.StaleSampleCount,
 BC.Stats.ExactMatchedSampleCount, BC.Stats.StaleSampleCount);
 BC.outs() << format(
-"BOLT-INFO: inference found a pseudo probe match for %.2f%% of basic "
+"BOLT-INFO: inference found an exact pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeExactMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeExactMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeExactMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeExactMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a loose pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeLooseMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeLooseMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeLooseMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeLooseMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a call match for %.2f%% of basic "
 "blocks"
 " (%zu out of %zu stale) responsible for %.2f%% samples"
 " (%zu out of %zu stale)\n",
-100.0 * BC.Stats.NumPseudoProbeMatchedBlocks / BC.Stats.NumStaleBlocks,
-BC.Stats.NumPseudoProbeMatchedBlocks, BC.Stats.NumStaleBlocks,
-100.0 * BC.Stats.PseudoProbeMatchedSampleCount /
-BC.Stats.StaleSampleCount,
-BC.Stats.PseudoProbeMatchedSampleCount, BC.Stats.StaleSampleCount);
+100.0 * BC.Stats.NumCallMatchedBlocks / BC.Stats.NumStaleBlocks,
+BC.Stats.NumCallMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.CallMatchedSampleCount / BC.Stats.StaleSampleCount,
+BC.Stats.CallMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC
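The report lines in the hunk above all follow one pattern: a matched-block count and a matched-sample count, each shown as a percentage of the stale total. A minimal standalone sketch of that pattern follows. It uses `std::snprintf` in place of `llvm::format`, and `MatchStats`, `percent`, and `formatMatchLine` are hypothetical names, not part of the patch. The zero-denominator guard is an addition of this sketch; whether the surrounding BOLT code already guarantees a nonzero `NumStaleBlocks` is not visible in the hunk.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the counters the patch adds to BC.Stats.
struct MatchStats {
  uint32_t NumStaleBlocks{0};
  uint32_t NumCallMatchedBlocks{0};
  uint64_t StaleSampleCount{0};
  uint64_t CallMatchedSampleCount{0};
};

// Share of Part in Whole as a percentage. Returning 0 for an empty Whole is
// an assumption of this sketch, not behavior taken from the patch.
static double percent(uint64_t Part, uint64_t Whole) {
  return Whole == 0 ? 0.0
                    : 100.0 * static_cast<double>(Part) /
                          static_cast<double>(Whole);
}

// Builds one report line; Kind is e.g. "a call" or "an exact pseudo probe".
std::string formatMatchLine(const char *Kind, uint64_t Blocks,
                            uint64_t StaleBlocks, uint64_t Samples,
                            uint64_t StaleSamples) {
  char Buf[256];
  int N = std::snprintf(
      Buf, sizeof(Buf),
      "BOLT-INFO: inference found %s match for %.2f%% of basic blocks"
      " (%" PRIu64 " out of %" PRIu64 " stale) responsible for %.2f%% samples"
      " (%" PRIu64 " out of %" PRIu64 " stale)\n",
      Kind, percent(Blocks, StaleBlocks), Blocks, StaleBlocks,
      percent(Samples, StaleSamples), Samples, StaleSamples);
  // The sketch expects the line to fit; a longer Kind would need a bigger buffer.
  assert(N > 0 && static_cast<std::size_t>(N) < sizeof(Buf));
  (void)N;
  return Buf;
}
```

With a `MatchStats S`, the call-match line would be produced by `formatMatchLine("a call", S.NumCallMatchedBlocks, S.NumStaleBlocks, S.CallMatchedSampleCount, S.StaleSampleCount)`. Widening every count to `uint64_t` and printing with `PRIu64` keeps the format specifiers and argument types in agreement.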

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/107137

>From 50c021b09950cf7d6a8f25b1ac0dec246f2325f5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 3 Sep 2024 11:38:04 -0700
Subject: [PATCH 1/6] update pseudoprobe-decoding-inline.test

Created using spr 1.3.4
---
 .../test/X86/pseudoprobe-decoding-inline.test | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test b/bolt/test/X86/pseudoprobe-decoding-inline.test
index 1fdd00c7ef6c4b..629dd84ab8e1dc 100644
--- a/bolt/test/X86/pseudoprobe-decoding-inline.test
+++ b/bolt/test/X86/pseudoprobe-decoding-inline.test
@@ -14,29 +14,38 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT:   - { id: 1, type: 0
+# CHECK-YAML-NEXT:   - { id: 4, type: 0
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 0 }
 #
 # CHECK-YAML: name: foo
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0x5CF8C24CDB18BDAC
-# CHECK-YAML: pseudo_probe_desc_hash: 0x200205A19C5B4
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 0 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 1, callsite: 8 }
 #
 # CHECK-YAML: name: main
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xDB956436E78DD5FA, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0xDB956436E78DD5FA
-# CHECK-YAML: pseudo_probe_desc_hash: 0x1
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 1, type: 0, inline_tree_id: 1 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0, inline_tree_id: 1 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xDB956436E78DD5FA, hash: 0x1, id: 0 }
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 1, callsite: 2 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 2, parent: 1, callsite: 8 }
 #
 ## Check that without --profile-write-pseudo-probes option, no pseudo probes are
 ## generated
 # RUN: perf2bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin -p %t.preagg --pa -w %t.yaml -o %t.fdata
 # RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-NO-OPT
 # CHECK-NO-OPT-NOT: pseudo_probes
-# CHECK-NO-OPT-NOT: guid
-# CHECK-NO-OPT-NOT: pseudo_probe_desc_hash
+# CHECK-NO-OPT-NOT: inline_tree
 
 CHECK: Report of decoding input pseudo probe binaries
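The updated expectations replace a per-probe `guid` with an `inline_tree_id` that points into a per-function `inline_tree` list, whose nodes link to their caller through `parent`. A minimal sketch of resolving a node to its inline context follows. The struct and function names are hypothetical, not BOLT's types. It assumes that nodes are stored in `id` order, that node 0 is the function itself, that an omitted `parent` or `inline_tree_id` means 0, and that `callsite` is the call-site probe id in the parent. These assumptions are inferred from the CHECK lines, not confirmed by the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of one inline_tree entry from the YAML profile.
struct InlineTreeNode {
  uint64_t Guid;
  uint64_t Hash;
  uint32_t Id;
  uint32_t Parent;   // assumed to be 0 when the YAML omits it
  uint32_t CallSite; // assumed: call-site probe id in the parent, 0 for the root
};

// Returns the GUIDs from the outermost function down to node NodeId.
// Assumes Tree[i].Id == i and that node 0 is the root.
std::vector<uint64_t> inlineContext(const std::vector<InlineTreeNode> &Tree,
                                    uint32_t NodeId) {
  assert(NodeId < Tree.size());
  std::vector<uint64_t> Chain;
  uint32_t Cur = NodeId;
  // Bound the walk by the node count so a malformed parent cycle terminates.
  for (std::size_t Step = 0; Step < Tree.size(); ++Step) {
    const InlineTreeNode &Node = Tree.at(Cur);
    Chain.push_back(Node.Guid);
    if (Cur == 0)
      break;
    Cur = Node.Parent;
  }
  std::reverse(Chain.begin(), Chain.end());
  return Chain;
}

// The inline_tree the test above expects for `main`.
std::vector<InlineTreeNode> mainInlineTree() {
  return {
      {0xDB956436E78DD5FAULL, 0x1ULL, 0, 0, 0},
      {0x5CF8C24CDB18BDACULL, 0x200205A19C5B4ULL, 1, 0, 2},
      {0xE413754A191DB537ULL, 0x10E852DA94ULL, 2, 1, 8},
  };
}
```

Under those assumptions, `inlineContext(mainInlineTree(), 1)` yields the GUIDs of `main` and then `foo`, which is the context of the two probes tagged `inline_tree_id: 1`. Storing the GUID and hash once per tree node rather than once per probe is what lets the new `pseudo_probes` entries shrink to `id` and `type`.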
 

>From 6ec4cf6bf05551d02cbf17e9edbe8d6931588ff1 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Mon, 9 Sep 2024 21:37:28 -0700
Subject: [PATCH 2/6] clang-format

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileWriter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp b/bolt/lib/Profile/YAMLProfileWriter.cpp
index 70e5e09e2920e5..f2609de18ce63c 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -90,7 +90,7 @@ YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) {
   InlineTreeDesc InlineTree;
 
   for (const MCDecodedPseudoProbeInlineTree &TopLev :
-  Decoder.getDummyInlineRoot().getChildren())
+   Decoder.getDummyInlineRoot().getChildren())
 InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
 
   for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())

>From 852eb07f345dd1d9e77a6faead8bf0f73ff64ba7 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 10 Sep 2024 12:26:11 -0700
Subject: [PATCH 3/6] Make pseudo_probe_desc optional

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/ProfileYAMLMapping.h | 9 -
 bolt/test/X86/pseudoprobe-decoding-inline.test | 5 +++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/bolt/include/bolt/Profile/ProfileYAMLMapping.h b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
index 588e2f59d67e01..9cc33264d70718 100644
--- a/bolt/include/bolt/Profile/ProfileYAMLMapping.h
+++ b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
@@ -275,6 +275,12 @@ struct PseudoProbeDesc {
   std::vector GUID;
   std::vector Hash;
   std::vector GUIDHash; // Index of hash for that GUID in Hash
+
+  bool operator==(const PseudoProbeDesc &Ot