[llvm-branch-commits] [BOLT] Detect .warm split functions as cold fragments (PR #93759)

2024-05-29 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov created 
https://github.com/llvm/llvm-project/pull/93759

CDSplit splits functions up to three ways: main fragment with no suffix,
and fragments with .cold and .warm suffixes.

Add .warm suffix to the regex used to recognize split fragments.

Test Plan: updated register-fragments-bolt-symbols.s



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][BAT] Add support for three-way split functions (PR #93760)

2024-05-29 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov created 
https://github.com/llvm/llvm-project/pull/93760

In three-way split functions, if only .warm fragment is present, BAT
incorrectly overwrites the map for .warm fragment by empty .cold
fragment.

Test Plan: updated register-fragments-bolt-symbols.s



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Detect .warm split functions as cold fragments (PR #93759)

2024-05-30 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/93759

>From 9ac7c047c6d15082f05e7494b764d50150e3fd27 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 30 May 2024 14:52:45 -0700
Subject: [PATCH] s/ColdFragment/FunctionFragmentTemplate/

Created using spr 1.3.4
---
 bolt/include/bolt/Rewrite/RewriteInstance.h | 2 +-
 bolt/lib/Rewrite/RewriteInstance.cpp| 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/bolt/include/bolt/Rewrite/RewriteInstance.h 
b/bolt/include/bolt/Rewrite/RewriteInstance.h
index d8d62badcc377..a55516d553979 100644
--- a/bolt/include/bolt/Rewrite/RewriteInstance.h
+++ b/bolt/include/bolt/Rewrite/RewriteInstance.h
@@ -598,7 +598,7 @@ class RewriteInstance {
   NameResolver NR;
 
   // Regex object matching split function names.
-  const Regex ColdFragment{"(.*)\\.(cold|warm)(\\.[0-9]+)?"};
+  const Regex FunctionFragmentTemplate{"(.*)\\.(cold|warm)(\\.[0-9]+)?"};
 
   friend class RewriteInstanceDiff;
 };
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp 
b/bolt/lib/Rewrite/RewriteInstance.cpp
index fb920ebbeafc4..e452e956c949e 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -1225,7 +1225,7 @@ void RewriteInstance::discoverFileObjects() {
 }
 
 // Check if it's a cold function fragment.
-if (ColdFragment.match(SymName)) {
+if (FunctionFragmentTemplate.match(SymName)) {
   static bool PrintedWarning = false;
   if (!PrintedWarning) {
 PrintedWarning = true;
@@ -1457,7 +1457,7 @@ void RewriteInstance::registerFragments() {
   StringRef BaseName = NR.restore(Name);
   const bool IsGlobal = BaseName == Name;
   SmallVector Matches;
-  if (!ColdFragment.match(BaseName, &Matches))
+  if (!FunctionFragmentTemplate.match(BaseName, &Matches))
 continue;
   StringRef ParentName = Matches[1];
   const BinaryData *BD = BC->getBinaryDataByName(ParentName);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Detect .warm split functions as cold fragments (PR #93759)

2024-05-30 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/93759
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][BAT] Add support for three-way split functions (PR #93760)

2024-05-31 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/93760
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][BAT] Add support for three-way split functions (PR #93760)

2024-06-04 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/93760


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][BAT] Add support for three-way split functions (PR #93760)

2024-06-04 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/93760


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-12 Thread Amir Ayupov via llvm-branch-commits


@@ -592,10 +599,15 @@ void preprocessUnreachableBlocks(FlowFunction &Func) {
 /// Decide if stale profile matching can be applied for a given function.
 /// Currently we skip inference for (very) large instances and for instances
 /// having "unexpected" control flow (e.g., having no sink basic blocks).
-bool canApplyInference(const FlowFunction &Func) {
+bool canApplyInference(const FlowFunction &Func,
+   const yaml::bolt::BinaryFunctionProfile &YamlBF) {
   if (Func.Blocks.size() > opts::StaleMatchingMaxFuncSize)
 return false;
 
+  if ((double)Func.MatchedExecCount / YamlBF.ExecCount >=
+  opts::MatchedProfileThreshold / 100.0)
+return false;

aaupov wrote:

It's a tricky question how to define the cutoff in terms of sufficient matching.
I first thought of defining a block count based cutoff (if we matched >5% of 
blocks, proceed with matching), but then what if these are cold blocks covering 
<1% of exec count? In this case we'd end up guessing/propagating most samples.

For block-based matching, the threshold should be higher than 5%, perhaps 
closer to a half? For exec count based matching, I'd feel comfortable with 5% 
as threshold.

https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-16 Thread Amir Ayupov via llvm-branch-commits


@@ -575,10 +583,16 @@ void preprocessUnreachableBlocks(FlowFunction &Func) {
 /// Decide if stale profile matching can be applied for a given function.
 /// Currently we skip inference for (very) large instances and for instances
 /// having "unexpected" control flow (e.g., having no sink basic blocks).
-bool canApplyInference(const FlowFunction &Func) {
+bool canApplyInference(const FlowFunction &Func,
+   const yaml::bolt::BinaryFunctionProfile &YamlBF,
+   const uint64_t &MatchedBlocks) {
   if (Func.Blocks.size() > opts::StaleMatchingMaxFuncSize)
 return false;
 
+  if ((double)(MatchedBlocks) / YamlBF.Blocks.size() <=

aaupov wrote:

It's preferred to use integer math when dealing with options, e.g.:
https://github.com/llvm/llvm-project/blob/67285feffd6708e6db36c11faf95eeab449e797a/bolt/lib/Passes/IndirectCallPromotion.cpp#L584-L586


https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-16 Thread Amir Ayupov via llvm-branch-commits


@@ -394,7 +400,7 @@ createFlowFunction(const 
BinaryFunction::BasicBlockOrderType &BlockOrder) {
 void matchWeightsByHashes(BinaryContext &BC,
   const BinaryFunction::BasicBlockOrderType 
&BlockOrder,
   const yaml::bolt::BinaryFunctionProfile &YamlBF,
-  FlowFunction &Func) {
+  FlowFunction &Func, uint64_t &MatchedBlocksCount) {

aaupov wrote:

Why not return MatchedBlocksCount instead?

https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-16 Thread Amir Ayupov via llvm-branch-commits


@@ -394,7 +400,7 @@ createFlowFunction(const 
BinaryFunction::BasicBlockOrderType &BlockOrder) {
 void matchWeightsByHashes(BinaryContext &BC,
   const BinaryFunction::BasicBlockOrderType 
&BlockOrder,
   const yaml::bolt::BinaryFunctionProfile &YamlBF,
-  FlowFunction &Func) {
+  FlowFunction &Func, uint64_t &MatchedBlocksCount) {

aaupov wrote:

I think it makes sense to return the stats as a result of matching. We can 
extend the return type into struct in the future if the need arises.

https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-17 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LGTM with one nit

https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-17 Thread Amir Ayupov via llvm-branch-commits


@@ -391,10 +397,9 @@ createFlowFunction(const 
BinaryFunction::BasicBlockOrderType &BlockOrder) {
 /// of the basic blocks in the binary, the count is "matched" to the block.
 /// Similarly, if both the source and the target of a count in the profile are
 /// matched to a jump in the binary, the count is recorded in CFG.
-void matchWeightsByHashes(BinaryContext &BC,
-  const BinaryFunction::BasicBlockOrderType 
&BlockOrder,
-  const yaml::bolt::BinaryFunctionProfile &YamlBF,
-  FlowFunction &Func) {
+uint64_t matchWeightsByHashes(

aaupov wrote:

```suggestion
size_t matchWeightsByHashes(
```

https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Drop high discrepancy profiles in matching (PR #95156)

2024-06-17 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits


@@ -420,6 +442,7 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name
  << '\n';
 
+

aaupov wrote:

nit: please drop 

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits


@@ -0,0 +1,67 @@
+## Test YAMLProfileReader support for pass-through blocks in non-matching 
edges:
+## match the profile edge A -> C to the CFG with blocks A -> B -> C.
+
+# REQUIRES: system-linux
+# RUN: split-file %s %t
+# RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown %t/main.s -o %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q -nostdlib
+# RUN: llvm-bolt %t.exe -o %t.out --data %t/yaml -v=1 \
+# RUN:   --print-cfg 2>&1 | FileCheck %s

aaupov wrote:

Please add verbosity=2 logging for hash-based matching and explicitly check 
that we matched `main` BF to `main2` profile.

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits


@@ -383,6 +381,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   matchProfileToFunction(YamlBF, Function);
   }
 
+  // Uses the strict hash of profiled and binary functions to match functions

aaupov wrote:

Let's put this under `if (!opts::IgnoreHash)`

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits


@@ -383,6 +381,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   matchProfileToFunction(YamlBF, Function);
   }
 
+  // Uses the strict hash of profiled and binary functions to match functions
+  // that are not matched by name or common name.
+  std::unordered_map StrictBinaryFunctionHashes;
+  StrictBinaryFunctionHashes.reserve(BC.getBinaryFunctions().size());
+
+  for (auto &[_, BF] : BC.getBinaryFunctions()) {
+if (ProfiledFunctions.count(&BF))
+  continue;
+BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);

aaupov wrote:

Looks like we'd need to compute hashes for all functions (unless IgnoreHash is 
used). Let's compute them in a single place, instead of two (here and line 377).

We'd also need a sense of the runtime overhead for that – can you please run 
BOLT on a large binary with `-time-rewrite` and include profile reader wall 
time before and after into the summary?

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Looks very good, but please address some minor issues.

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-18 Thread Amir Ayupov via llvm-branch-commits


@@ -383,6 +381,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   matchProfileToFunction(YamlBF, Function);
   }
 
+  // Uses the strict hash of profiled and binary functions to match functions
+  // that are not matched by name or common name.
+  std::unordered_map StrictBinaryFunctionHashes;

aaupov wrote:

```suggestion
  std::unordered_map StrictHashToBF;
```

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Can you please also note the runtime of the namespace-filtered pairwise checks? 
This would guide us in whether to add a BB count filtering.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -415,6 +422,75 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  // Uses name similarity to match functions that were not matched by name.
+  uint64_t MatchedWithDemangledName = 0;
+
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
+  int Status = 0;
+  char *DemangledName = abi::__cxa_demangle(String,
+nullptr, nullptr, &Status);
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {
+  size_t LParen = std::string(DemangledName).find("(");
+  std::string FunctionName = std::string(DemangledName).substr(0, LParen);
+  size_t ScopeResolutionOperator = std::string(FunctionName).rfind("::");
+  return ScopeResolutionOperator == std::string::npos ? std::string("") : 
std::string(DemangledName).substr(0, ScopeResolutionOperator);
+};
+
+std::unordered_map> 
NamespaceToBFs;
+NamespaceToBFs.reserve(BC.getBinaryFunctions().size());
+
+for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+  std::string* DemangledName = 
DemangleName(BF->getOneName().str().c_str());
+  if (!DemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*DemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+NamespaceToBFs[Namespace] = {BF};
+  else
+It->second.push_back(BF);
+}
+
+for (auto YamlBF : YamlBP.Functions) {
+  if (YamlBF.Used)
+continue;
+  std::string* YamlBFDemangledName = DemangleName(YamlBF.Name.c_str());
+  if (!YamlBFDemangledName)
+continue;
+  std::string Namespace = DeriveNameSpace(*YamlBFDemangledName);
+  auto It = NamespaceToBFs.find(Namespace);
+  if (It == NamespaceToBFs.end())
+continue;
+  std::vector BFs = It->second;
+
+  unsigned MinEditDistance = UINT_MAX;
+  BinaryFunction *ClosestNameBF = nullptr;
+
+  for (BinaryFunction *BF : BFs) {
+if (ProfiledFunctions.count(BF))
+  continue;
+std::string *BFDemangledName = 
DemangleName(BF->getOneName().str().c_str());

aaupov wrote:

```suggestion
std::string *BFDemangledName = BF->getDemangledName();
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -415,6 +422,75 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  // Uses name similarity to match functions that were not matched by name.
+  uint64_t MatchedWithDemangledName = 0;
+
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
+  int Status = 0;
+  char *DemangledName = abi::__cxa_demangle(String,

aaupov wrote:

Let's use LLVM's `demangle` (llvm/Demangle/Demangle.h)

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -415,6 +422,75 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  // Uses name similarity to match functions that were not matched by name.
+  uint64_t MatchedWithDemangledName = 0;
+
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](const char* String) {
+  int Status = 0;
+  char *DemangledName = abi::__cxa_demangle(String,
+nullptr, nullptr, &Status);
+  return Status == 0 ? new std::string(DemangledName) : nullptr;
+};
+
+auto DeriveNameSpace = [&](std::string DemangledName) {

aaupov wrote:

Would be cool to use `getFunctionDeclContextName` from ItaniumPartialDemangler

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -128,6 +128,11 @@ cl::opt
cl::desc("instrument code to generate accurate profile data"),
cl::cat(BoltOptCategory));
 
+cl::opt
+MatchingFunctionsWithHash("stale-matching-matching-functions-with-hash",
+  cl::desc("Matching functions using hash"),

aaupov wrote:

Technically it's independent of stale matching:
```suggestion
MatchProfileWithFunctionHash("match-profile-with-function-hash",
  cl::desc("Match profile with function hash"),
```

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -2982,6 +2983,9 @@ void RewriteInstance::selectFunctionsToProcess() {
 if (mustSkip(Function))
   return false;
 
+if (opts::MatchingFunctionsWithHash)
+  return true;

aaupov wrote:

Since we're forcing the processing of all functions here to construct CFG and 
compute hashes, let's also mark functions without profile as ignored in Lite 
mode. 
This should happen after we assign profile, at the end of 
YAMLProfileReader::readProfile. Under the new option, go over all functions and 
check if the function has profile, if not, mark it ignored.

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -363,9 +364,27 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 return Profile.Hash == static_cast(BF.getHash());
   };
 
-  // We have to do 2 passes since LTO introduces an ambiguity in function
-  // names. The first pass assigns profiles that match 100% by name and
-  // by hash. The second pass allows name ambiguity for LTO private functions.
+  uint64_t MatchedWithExactName = 0;
+  uint64_t MatchedWithHash = 0;
+  uint64_t MatchedWithLTOCommonName = 0;
+
+  // Computes hash for binary functions.
+  if (opts::MatchingFunctionsWithHash) {
+for (auto &[_, BF] : BC.getBinaryFunctions())
+  BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
+  } else {
+for (auto [YamlBF, BF] : llvm::zip_equal(YamlBP.Functions, ProfileBFs)) {
+  if (!BF)
+continue;
+  BinaryFunction &Function = *BF;
+
+  if (!opts::IgnoreHash)
+Function.computeHash(YamlBP.Header.IsDFSOrder,
+ YamlBP.Header.HashFunction);
+}
+  }

aaupov wrote:

```suggestion
  if (opts::MatchingFunctionsWithHash)
for (auto &[_, BF] : BC.getBinaryFunctions())
  BF.computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
  else if (!opts::IgnoreHash)
for (BinaryFunction *BF : ProfileBFs)
  BF->computeHash(YamlBP.Header.IsDFSOrder, YamlBP.Header.HashFunction);
```

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-21 Thread Amir Ayupov via llvm-branch-commits


@@ -1161,4 +1165,4 @@
 
 - `--print-options`
 
-  Print non-default options after command line parsing

aaupov wrote:

nit: please avoid unrelated changes

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-22 Thread Amir Ayupov via llvm-branch-commits


@@ -439,6 +482,11 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 
   BC.setNumUnusedProfiledObjects(NumUnused);
 
+  if (opts::Lite)
+for (BinaryFunction *BF : BC.getAllBinaryFunctions())
+  if (ProfiledFunctions.find(BF) == ProfiledFunctions.end())

aaupov wrote:

At this point we're supposed to have attached the profile to BF:
```suggestion
  if (!BF->hasProfile())
```

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-22 Thread Amir Ayupov via llvm-branch-commits


@@ -0,0 +1,65 @@
+## Test YAMLProfileReader support for pass-through blocks in non-matching 
edges:
+## match the profile edge A -> C to the CFG with blocks A -> B -> C.

aaupov wrote:

Please update the test description.

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-22 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LG overall, with a couple of nits. 

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Hash-based function matching (PR #95821)

2024-06-22 Thread Amir Ayupov via llvm-branch-commits


@@ -374,15 +386,34 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 // the profile.
 Function.setExecutionCount(BinaryFunction::COUNT_NO_PROFILE);
 
-// Recompute hash once per function.
-if (!opts::IgnoreHash)
-  Function.computeHash(YamlBP.Header.IsDFSOrder,
-   YamlBP.Header.HashFunction);
-
-if (profileMatches(YamlBF, Function))
+if (profileMatches(YamlBF, Function)) {
   matchProfileToFunction(YamlBF, Function);
+  ++MatchedWithExactName;
+}
   }
 
+  // Uses the strict hash of profiled and binary functions to match functions
+  // that are not matched by name or common name.
+  if (opts::MatchProfileWithFunctionHash) {
+std::unordered_map StrictHashToBF;
+StrictHashToBF.reserve(BC.getBinaryFunctions().size());
+
+for (auto &[_, BF] : BC.getBinaryFunctions())
+  StrictHashToBF[BF.getHash()] = &BF;
+
+for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+  if (YamlBF.Used)
+continue;
+  auto It = StrictHashToBF.find(YamlBF.Hash);
+  if (It != StrictHashToBF.end() && !ProfiledFunctions.count(It->second)) {
+auto *BF = It->second;

aaupov wrote:

nit: LLVM's guideline is to use explicit type here: [LLVM Coding 
Standards](https://llvm.org/docs/CodingStandards.html#use-auto-type-deduction-to-make-code-more-readable)

https://github.com/llvm/llvm-project/pull/95821
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov requested changes to this pull request.

Thank you for checking the runtime, and it's quite high. We'll need extra 
filtering by block count to keep runtime low - please add it as we discussed.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits


@@ -23,6 +26,10 @@ extern cl::opt Verbosity;
 extern cl::OptionCategory BoltOptCategory;
 extern cl::opt InferStaleProfile;
 
+cl::opt NameSimilarityFunctionMatchingThreshold(
+"name-similarity-function-matching-threshold", cl::desc("edit distance."),

aaupov wrote:

Please expand the description

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits


@@ -415,11 +422,87 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  // Uses name similarity to match functions that were not matched by name.
+  uint64_t MatchedWithNameSimilarity = 0;
+
+  if (opts::NameSimilarityFunctionMatchingThreshold > 0) {
+auto DemangleName = [&](std::string &FunctionName) {
+  StringRef RestoredName = NameResolver::restore(FunctionName);
+  return demangle(RestoredName);
+};
+
+auto DeriveNameSpace = [&](ItaniumPartialDemangler 
&ItaniumPartialDemangler,

aaupov wrote:

You can have ItaniumPartialDemangler defined above, and it will be captured by 
DeriveNameSpace lambda, so no need to pass it explicitly.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Name similarity function matching (PR #95884)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

Please expand the summary of the PR as well.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Function matching with function calls as anchors (PR #96596)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits


@@ -415,11 +495,20 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
 if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
   matchProfileToFunction(YamlBF, *BF);
 
+  uint64_t MatchedWithCallsAsAnchors = 0;
+  if (opts::MatchWithCallsAsAnchors)
+matchWithCallsAsAnchors(BC, MatchedWithCallsAsAnchors);

aaupov wrote:

This new matching needs to be inside `inferStaleProfile` as we discussed, 
conceptually sandwiched between strict and loose block hash matching. 
Logistically it happen in this vicinity:
https://github.com/llvm/llvm-project/blob/e214ed9d7060f6caa0c1bb756edb62643f23aa5b/bolt/lib/Profile/StaleProfileMatching.cpp#L439-L440

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Function matching with function calls as anchors (PR #96596)

2024-06-25 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactoring CallGraph (PR #96922)

2024-06-27 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Please build with shared libraries mode to ensure cross-component dependencies 
are satisfied.

https://github.com/llvm/llvm-project/pull/96922
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactoring CallGraph (PR #96922)

2024-06-27 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96922
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT][NFC] Refactoring CallGraph (PR #96922)

2024-06-27 Thread Amir Ayupov via llvm-branch-commits


@@ -10,7 +10,7 @@
 //
 
//===--===//
 
-#include "bolt/Passes/CallGraph.h"
+#include "bolt/Core/CallGraph.h"

aaupov wrote:

Please also update file headers (first line)

https://github.com/llvm/llvm-project/pull/96922
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactoring CallGraph (PR #96922)

2024-06-27 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LGTM but please ensure that the diff passes NFC checks and shared build work.

https://github.com/llvm/llvm-project/pull/96922
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactoring CallGraph (PR #96922)

2024-06-27 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

Please also retitle as an imperative statement, e.g. "Move CallGraph from 
Passes to Core"

https://github.com/llvm/llvm-project/pull/96922
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop macro-fusion alignment (PR #97358)

2024-07-01 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov created 
https://github.com/llvm/llvm-project/pull/97358

9d0754ada5dbbc0c009bcc2f7824488419cc5530 dropped MC support required for
macro-fusion alignment in BOLT. Remove the support in BOLT.

Test Plan:
macro-fusion alignment was never upstreamed, so no upstream tests are
affected.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop macro-fusion alignment (PR #97358)

2024-07-01 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov ready_for_review 
https://github.com/llvm/llvm-project/pull/97358
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop macro-fusion alignment (PR #97358)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/97358
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop macro-fusion alignment (PR #97358)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/97358
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Drop macro-fusion alignment (PR #97358)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

It's not. I landed to main manually.

https://github.com/llvm/llvm-project/pull/97358
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler ItaniumPartialDemangler;

aaupov wrote:

```suggestion
  ItaniumPartialDemangler Demangler;
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler ItaniumPartialDemangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (ItaniumPartialDemangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace = ItaniumPartialDemangler.getFunctionDeclContextName(
+&Buffer[0], &BufferSize);

aaupov wrote:

What's bothering me here is the use of BufferSize – it's an output parameter 
from `getFunctionDecContextName` and yet unused in the construction of output 
string.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler ItaniumPartialDemangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (ItaniumPartialDemangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace = ItaniumPartialDemangler.getFunctionDeclContextName(
+&Buffer[0], &BufferSize);
+return NameSpace ? std::string(NameSpace) : std::string("");
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivision.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto &YamlBF : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs disincluding binary functions with no equal sized

aaupov wrote:

```suggestion
  // Maps namespaces to BFs excluding binary functions with no equal sized
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler ItaniumPartialDemangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (ItaniumPartialDemangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace = ItaniumPartialDemangler.getFunctionDeclContextName(
+&Buffer[0], &BufferSize);
+return NameSpace ? std::string(NameSpace) : std::string("");
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivision.

aaupov wrote:

```suggestion
  // avoid repeated name demangling/namespace derivation.
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

LG in general with some comments

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler ItaniumPartialDemangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (ItaniumPartialDemangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace = ItaniumPartialDemangler.getFunctionDeclContextName(
+&Buffer[0], &BufferSize);
+return NameSpace ? std::string(NameSpace) : std::string("");

aaupov wrote:

```suggestion
return std::string(NameSpace, BufferSize);
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -23,6 +26,11 @@ extern cl::opt Verbosity;
 extern cl::OptionCategory BoltOptCategory;
 extern cl::opt InferStaleProfile;
 
+cl::opt NameSimilarityFunctionMatchingThreshold(
+"name-similarity-function-matching-threshold",
+cl::desc("Match functions using namespace and edit distance."), 
cl::init(0),

aaupov wrote:

```suggestion
cl::desc("Match functions using namespace and edit distance"), cl::init(0),
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -70,12 +70,16 @@ class YAMLProfileReader : public ProfileReaderBase {
   std::vector ProfileBFs;
 
   /// Populate \p Function profile with the one supplied in YAML format.
-  bool parseFunctionProfile(BinaryFunction &Function,
-const yaml::bolt::BinaryFunctionProfile &YamlBF);
+  bool parseFunctionProfile(
+  const DenseMap &IdToFunctionName,

aaupov wrote:

Define the map as YAMLProfileReader member, with a comment about its use.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-02 Thread Amir Ayupov via llvm-branch-commits


@@ -479,6 +481,11 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   NormalizeByInsnCount = usesEvent("cycles") || usesEvent("instructions");
   NormalizeByCalls = usesEvent("branches");
 
+  // Map profiled function ids to names.
+  DenseMap IdToFunctionName;

aaupov wrote:

Do we need a map and not a vector?

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName(&Buffer[0], &BufferSize);
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto &YamlBF : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
+  continue;
+auto NamespaceToBFsIt = NamespaceToBFs.find(Namespace);
+if (NamespaceToBFsIt == NamespaceToBFs.end())
+  NamespaceToBFs[Namespace] = {BF};
+else
+  NamespaceToBFsIt->second.push_back(BF);
+  }
+
+  // Iterates through all profiled functions and binary functions belonging to
+  // the same namespace and matches based on edit distance thresehold.

aaupov wrote:

```suggestion
  // the same namespace and matches based on edit distance threshold.
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName(&Buffer[0], &BufferSize);
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto &YamlBF : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
+  continue;
+auto NamespaceToBFsIt = NamespaceToBFs.find(Namespace);
+if (NamespaceToBFsIt == NamespaceToBFs.end())
+  NamespaceToBFs[Namespace] = {BF};
+else
+  NamespaceToBFsIt->second.push_back(BF);
+  }
+
+  // Iterates through all profiled functions and binary functions belonging to
+  // the same namespace and matches based on edit distance thresehold.
+  assert(YamlBP.Functions.size() == ProfiledBFNamespaces.size() &&
+ ProfiledBFNamespaces.size() == ProfileBFDemangledNames.size());
+  for (size_t I = 0; I < YamlBP.Functions.size(); ++I) {
+yaml::bolt::BinaryFunctionProfile &YamlBF = YamlBP.Functions[I];
+std::string &YamlBFNamespace = ProfiledBFNamespaces[I];
+if (YamlBF.Used)
+  continue;
+auto It = NamespaceToBFs.find(YamlBFNamespace);

aaupov wrote:

```suggestion
// Skip if there are no BFs in a given \p Namespace.
auto It = NamespaceToBFs.find(YamlBFNamespace);
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName(&Buffer[0], &BufferSize);
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto &YamlBF : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
+  continue;
+if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)

aaupov wrote:

```suggestion
// Skip if there are no ProfileBFs in a given \p Namespace with
// equal number of blocks.
if (NamespaceToProfiledBFSizesIt->second.count(BF->size()) == 0)
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();
+char *NameSpace =
+Demangler.getFunctionDeclContextName(&Buffer[0], &BufferSize);
+return std::string(NameSpace, BufferSize);
+  };
+
+  // Maps namespaces to associated function block counts and gets profile
+  // function names and namespaces to minimize the number of BFs to process and
+  // avoid repeated name demangling/namespace derivation.
+  StringMap> NamespaceToProfiledBFSizes;
+  std::vector ProfileBFDemangledNames;
+  ProfileBFDemangledNames.reserve(YamlBP.Functions.size());
+  std::vector ProfiledBFNamespaces;
+  ProfiledBFNamespaces.reserve(YamlBP.Functions.size());
+
+  for (auto &YamlBF : YamlBP.Functions) {
+std::string YamlBFDemangledName = DemangleName(YamlBF.Name);
+ProfileBFDemangledNames.push_back(YamlBFDemangledName);
+std::string YamlBFNamespace = DeriveNameSpace(YamlBFDemangledName);
+ProfiledBFNamespaces.push_back(YamlBFNamespace);
+NamespaceToProfiledBFSizes[YamlBFNamespace].insert(YamlBF.NumBasicBlocks);
+  }
+
+  StringMap> NamespaceToBFs;
+
+  // Maps namespaces to BFs excluding binary functions with no equal sized
+  // profiled functions belonging to the same namespace.
+  for (BinaryFunction *BF : BC.getAllBinaryFunctions()) {
+std::string DemangledName = BF->getDemangledName();
+std::string Namespace = DeriveNameSpace(DemangledName);
+
+auto NamespaceToProfiledBFSizesIt =
+NamespaceToProfiledBFSizes.find(Namespace);
+if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())

aaupov wrote:

```suggestion
// Skip if there are no ProfileBFs with a given \p Namespace.
if (NamespaceToProfiledBFSizesIt == NamespaceToProfiledBFSizes.end())
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LG with a couple of nits.

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with name similarity (PR #95884)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -342,6 +350,107 @@ bool YAMLProfileReader::mayHaveProfileData(const 
BinaryFunction &BF) {
   return false;
 }
 
+uint64_t YAMLProfileReader::matchWithNameSimilarity(BinaryContext &BC) {
+  uint64_t MatchedWithNameSimilarity = 0;
+  ItaniumPartialDemangler Demangler;
+
+  // Demangle and derive namespace from function name.
+  auto DemangleName = [&](std::string &FunctionName) {
+StringRef RestoredName = NameResolver::restore(FunctionName);
+return demangle(RestoredName);
+  };
+  auto DeriveNameSpace = [&](std::string &DemangledName) {
+if (Demangler.partialDemangle(DemangledName.c_str()))
+  return std::string("");
+std::vector Buffer(DemangledName.begin(), DemangledName.end());
+size_t BufferSize = Buffer.size();

aaupov wrote:

```suggestion
size_t BufferSize;
```

https://github.com/llvm/llvm-project/pull/95884
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -334,6 +334,13 @@ Error YAMLProfileReader::preprocessProfile(BinaryContext 
&BC) {
   return Error::success();
 }
 
+bool YAMLProfileReader::profileMatches(
+const yaml::bolt::BinaryFunctionProfile &Profile, BinaryFunction &BF) {

aaupov wrote:

```suggestion
const yaml::bolt::BinaryFunctionProfile &Profile, const BinaryFunction &BF) 
{
```

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

LG % nit

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT][NFC] Refactor function matching (PR #97502)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -456,6 +435,39 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   ++MatchedWithLTOCommonName;
 }
   }
+  return MatchedWithLTOCommonName;
+}
+
+Error YAMLProfileReader::readProfile(BinaryContext &BC) {
+  if (opts::Verbosity >= 1) {
+outs() << "BOLT-INFO: YAML profile with hash: ";
+switch (YamlBP.Header.HashFunction) {
+case HashFunction::StdHash:
+  outs() << "std::hash\n";

aaupov wrote:

@ayermolo, we didn't switch Profile component to BC logger class. That would be 
a separate effort.

https://github.com/llvm/llvm-project/pull/97502
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Sorry, couple of final comments

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -193,18 +193,43 @@ class StaleMatcher {
 public:
   /// Initialize stale matcher.
   void init(const std::vector &Blocks,
-const std::vector &Hashes) {
+const std::vector &Hashes,
+const std::vector &CallHashes) {
 assert(Blocks.size() == Hashes.size() &&
+   Hashes.size() == CallHashes.size() &&
"incorrect matcher initialization");
 for (size_t I = 0; I < Blocks.size(); I++) {
   FlowBlock *Block = Blocks[I];
   uint16_t OpHash = Hashes[I].OpcodeHash;
   OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+  if (CallHashes[I])
+CallHashToBlocks[CallHashes[I]].push_back(
+std::make_pair(Hashes[I], Block));
 }
   }
 
   /// Find the most similar block for a given hash.
-  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+  const FlowBlock *matchBlock(BlendedBlockHash &BlendedHash,
+  uint64_t &CallHash) const {

aaupov wrote:

```suggestion
  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash,
  uint64_t CallHash) const {
```
BlendedBlockHash is aliased to uint64_t, and integral types should be passed by 
value.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -412,33 +447,62 @@ createFlowFunction(const 
BinaryFunction::BasicBlockOrderType &BlockOrder) {
 /// of the basic blocks in the binary, the count is "matched" to the block.
 /// Similarly, if both the source and the target of a count in the profile are
 /// matched to a jump in the binary, the count is recorded in CFG.
-size_t matchWeightsByHashes(
-BinaryContext &BC, const BinaryFunction::BasicBlockOrderType &BlockOrder,
-const yaml::bolt::BinaryFunctionProfile &YamlBF, FlowFunction &Func) {
+size_t
+matchWeightsByHashes(BinaryContext &BC,
+ const DenseMap &IdToFunctionName,
+ const BinaryFunction::BasicBlockOrderType &BlockOrder,
+ const yaml::bolt::BinaryFunctionProfile &YamlBF,
+ FlowFunction &Func, HashFunction HashFunction) {

aaupov wrote:

```suggestion
size_t
matchWeightsByHashes(BinaryContext &BC,
 const BinaryFunction::BasicBlockOrderType &BlockOrder,
 const yaml::bolt::BinaryFunctionProfile &YamlBF,
 FlowFunction &Func, HashFunction HashFunction,
 const DenseMap &IdToFunctionName) 
{
```
It's customary to add new parameters to the end

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -193,18 +193,43 @@ class StaleMatcher {
 public:
   /// Initialize stale matcher.
   void init(const std::vector &Blocks,
-const std::vector &Hashes) {
+const std::vector &Hashes,
+const std::vector &CallHashes) {
 assert(Blocks.size() == Hashes.size() &&
+   Hashes.size() == CallHashes.size() &&
"incorrect matcher initialization");
 for (size_t I = 0; I < Blocks.size(); I++) {
   FlowBlock *Block = Blocks[I];
   uint16_t OpHash = Hashes[I].OpcodeHash;
   OpHashToBlocks[OpHash].push_back(std::make_pair(Hashes[I], Block));
+  if (CallHashes[I])
+CallHashToBlocks[CallHashes[I]].push_back(
+std::make_pair(Hashes[I], Block));
 }
   }
 
   /// Find the most similar block for a given hash.
-  const FlowBlock *matchBlock(BlendedBlockHash BlendedHash) const {
+  const FlowBlock *matchBlock(BlendedBlockHash &BlendedHash,
+  uint64_t &CallHash) const {
+const FlowBlock *BestBlock = matchWithOpcodes(BlendedHash);
+return BestBlock ? BestBlock : matchWithCalls(BlendedHash, CallHash);
+  }
+
+  /// Returns true if the two basic blocks (in the binary and in the profile)
+  /// corresponding to the given hashes are matched to each other with a high
+  /// confidence.
+  static bool isHighConfidenceMatch(BlendedBlockHash Hash1,
+BlendedBlockHash Hash2) {
+return Hash1.InstrHash == Hash2.InstrHash;
+  }
+
+private:
+  using HashBlockPairType = std::pair;
+  std::unordered_map> OpHashToBlocks;
+  std::unordered_map> 
CallHashToBlocks;
+
+  // Uses OpcodeHash to find the most similar block for a given hash.
+  const FlowBlock *matchWithOpcodes(BlendedBlockHash &BlendedHash) const {

aaupov wrote:

ditto

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits


@@ -220,17 +245,27 @@ class StaleMatcher {
 return BestBlock;
   }
 
-  /// Returns true if the two basic blocks (in the binary and in the profile)
-  /// corresponding to the given hashes are matched to each other with a high
-  /// confidence.
-  static bool isHighConfidenceMatch(BlendedBlockHash Hash1,
-BlendedBlockHash Hash2) {
-return Hash1.InstrHash == Hash2.InstrHash;
+  // Uses CallHash to find the most similar block for a given hash.
+  const FlowBlock *matchWithCalls(BlendedBlockHash &BlendedHash,

aaupov wrote:

ditto

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-03 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH (PR #91667)

2024-07-05 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/91667

>From dd4d0de42048c063d5e5095a0c2594c7cc578df5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 9 May 2024 19:35:26 -0700
Subject: [PATCH] Fix RISCVMCPlusBuilder

Created using spr 1.3.4
---
 bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp 
b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
index 74f2f0aae91e6..020e62463ee2f 100644
--- a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
+++ b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -177,13 +177,14 @@ class RISCVMCPlusBuilder : public MCPlusBuilder {
   MCInst &Instruction, InstructionIterator Begin, InstructionIterator End,
   const unsigned PtrSize, MCInst *&MemLocInstr, unsigned &BaseRegNum,
   unsigned &IndexRegNum, int64_t &DispValue, const MCExpr *&DispExpr,
-  MCInst *&PCRelBaseOut) const override {
+  MCInst *&PCRelBaseOut, MCInst *&FixedEntryLoadInst) const override {
 MemLocInstr = nullptr;
 BaseRegNum = 0;
 IndexRegNum = 0;
 DispValue = 0;
 DispExpr = nullptr;
 PCRelBaseOut = nullptr;
+FixedEntryLoadInst = nullptr;
 
 // Check for the following long tail call sequence:
 // 1: auipc xi, %pcrel_hi(sym)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH (PR #91667)

2024-07-05 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/91667

>From dd4d0de42048c063d5e5095a0c2594c7cc578df5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 9 May 2024 19:35:26 -0700
Subject: [PATCH] Fix RISCVMCPlusBuilder

Created using spr 1.3.4
---
 bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp 
b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
index 74f2f0aae91e6..020e62463ee2f 100644
--- a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
+++ b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -177,13 +177,14 @@ class RISCVMCPlusBuilder : public MCPlusBuilder {
   MCInst &Instruction, InstructionIterator Begin, InstructionIterator End,
   const unsigned PtrSize, MCInst *&MemLocInstr, unsigned &BaseRegNum,
   unsigned &IndexRegNum, int64_t &DispValue, const MCExpr *&DispExpr,
-  MCInst *&PCRelBaseOut) const override {
+  MCInst *&PCRelBaseOut, MCInst *&FixedEntryLoadInst) const override {
 MemLocInstr = nullptr;
 BaseRegNum = 0;
 IndexRegNum = 0;
 DispValue = 0;
 DispExpr = nullptr;
 PCRelBaseOut = nullptr;
+FixedEntryLoadInst = nullptr;
 
 // Check for the following long tail call sequence:
 // 1: auipc xi, %pcrel_hi(sym)

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Eliminate dead jump tables (PR #91666)

2024-07-05 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/91666
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH (PR #91667)

2024-07-05 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/91667

>From dd4d0de42048c063d5e5095a0c2594c7cc578df5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 9 May 2024 19:35:26 -0700
Subject: [PATCH 1/2] Fix RISCVMCPlusBuilder

Created using spr 1.3.4
---
 bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp 
b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
index 74f2f0aae91e66..020e62463ee2f4 100644
--- a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
+++ b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -177,13 +177,14 @@ class RISCVMCPlusBuilder : public MCPlusBuilder {
   MCInst &Instruction, InstructionIterator Begin, InstructionIterator End,
   const unsigned PtrSize, MCInst *&MemLocInstr, unsigned &BaseRegNum,
   unsigned &IndexRegNum, int64_t &DispValue, const MCExpr *&DispExpr,
-  MCInst *&PCRelBaseOut) const override {
+  MCInst *&PCRelBaseOut, MCInst *&FixedEntryLoadInst) const override {
 MemLocInstr = nullptr;
 BaseRegNum = 0;
 IndexRegNum = 0;
 DispValue = 0;
 DispExpr = nullptr;
 PCRelBaseOut = nullptr;
+FixedEntryLoadInst = nullptr;
 
 // Check for the following long tail call sequence:
 // 1: auipc xi, %pcrel_hi(sym)

>From 62391bb5aa01f2b77d4315d1e72a9924eec9ecc0 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Fri, 5 Jul 2024 14:54:51 -0700
Subject: [PATCH 2/2] Drop deregisterJumpTable

Created using spr 1.3.4
---
 bolt/lib/Core/BinaryFunction.cpp | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 09a6ca1d68730c..f587d5a2cadd49 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -899,17 +899,9 @@ BinaryFunction::processIndirectBranch(MCInst &Instruction, 
unsigned Size,
 
 TargetAddress = ArrayStart + *Value;
 
-// Remove spurious JumpTable at EntryAddress caused by PIC reference from
-// the load instruction.
-JumpTable *JT = BC.getJumpTableContainingAddress(EntryAddress);
-assert(JT && "Must have a jump table at fixed entry address");
-BC.deregisterJumpTable(EntryAddress);
-JumpTables.erase(EntryAddress);
-delete JT;
-
 // Replace FixedEntryDispExpr used in target address calculation with outer
 // jump table reference.
-JT = BC.getJumpTableContainingAddress(ArrayStart);
+JumpTable *JT = BC.getJumpTableContainingAddress(ArrayStart);
 assert(JT && "Must have a containing jump table for PIC fixed branch");
 BC.MIB->replaceMemOperandDisp(*FixedEntryLoadInstr, JT->getFirstLabel(),
   EntryAddress - ArrayStart, &*BC.Ctx);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH (PR #91667)

2024-07-06 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/91667

>From dd4d0de42048c063d5e5095a0c2594c7cc578df5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 9 May 2024 19:35:26 -0700
Subject: [PATCH 1/3] Fix RISCVMCPlusBuilder

Created using spr 1.3.4
---
 bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp 
b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
index 74f2f0aae91e66..020e62463ee2f4 100644
--- a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
+++ b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -177,13 +177,14 @@ class RISCVMCPlusBuilder : public MCPlusBuilder {
   MCInst &Instruction, InstructionIterator Begin, InstructionIterator End,
   const unsigned PtrSize, MCInst *&MemLocInstr, unsigned &BaseRegNum,
   unsigned &IndexRegNum, int64_t &DispValue, const MCExpr *&DispExpr,
-  MCInst *&PCRelBaseOut) const override {
+  MCInst *&PCRelBaseOut, MCInst *&FixedEntryLoadInst) const override {
 MemLocInstr = nullptr;
 BaseRegNum = 0;
 IndexRegNum = 0;
 DispValue = 0;
 DispExpr = nullptr;
 PCRelBaseOut = nullptr;
+FixedEntryLoadInst = nullptr;
 
 // Check for the following long tail call sequence:
 // 1: auipc xi, %pcrel_hi(sym)

>From 62391bb5aa01f2b77d4315d1e72a9924eec9ecc0 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Fri, 5 Jul 2024 14:54:51 -0700
Subject: [PATCH 2/3] Drop deregisterJumpTable

Created using spr 1.3.4
---
 bolt/lib/Core/BinaryFunction.cpp | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 09a6ca1d68730c..f587d5a2cadd49 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -899,17 +899,9 @@ BinaryFunction::processIndirectBranch(MCInst &Instruction, 
unsigned Size,
 
 TargetAddress = ArrayStart + *Value;
 
-// Remove spurious JumpTable at EntryAddress caused by PIC reference from
-// the load instruction.
-JumpTable *JT = BC.getJumpTableContainingAddress(EntryAddress);
-assert(JT && "Must have a jump table at fixed entry address");
-BC.deregisterJumpTable(EntryAddress);
-JumpTables.erase(EntryAddress);
-delete JT;
-
 // Replace FixedEntryDispExpr used in target address calculation with outer
 // jump table reference.
-JT = BC.getJumpTableContainingAddress(ArrayStart);
+JumpTable *JT = BC.getJumpTableContainingAddress(ArrayStart);
 assert(JT && "Must have a containing jump table for PIC fixed branch");
 BC.MIB->replaceMemOperandDisp(*FixedEntryLoadInstr, JT->getFirstLabel(),
   EntryAddress - ArrayStart, &*BC.Ctx);

>From 5336879ab68aedb1217e2c6c139d171f31e89e03 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Sat, 6 Jul 2024 22:26:14 -0700
Subject: [PATCH 3/3] Surgically drop spurious jump table

Created using spr 1.3.4
---
 bolt/include/bolt/Core/BinaryContext.h  |  5 +
 bolt/lib/Core/BinaryFunction.cpp| 12 ++--
 bolt/test/X86/jump-table-fixed-ref-pic.test | 11 ---
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h 
b/bolt/include/bolt/Core/BinaryContext.h
index 73932c4ca2fb33..c5e2c6cd02179e 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -431,6 +431,11 @@ class BinaryContext {
 return nullptr;
   }
 
+  /// Deregister JumpTable registered at a given \p Address.
+  bool deregisterJumpTable(uint64_t Address) {
+return JumpTables.erase(Address);
+  }
+
   unsigned getDWARFEncodingSize(unsigned Encoding) {
 if (Encoding == dwarf::DW_EH_PE_omit)
   return 0;
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index f587d5a2cadd49..2ecca32a5985c0 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -899,9 +899,17 @@ BinaryFunction::processIndirectBranch(MCInst &Instruction, 
unsigned Size,
 
 TargetAddress = ArrayStart + *Value;
 
+// Remove spurious JumpTable at EntryAddress caused by PIC reference from
+// the load instruction.
+JumpTable *JT = BC.getJumpTableContainingAddress(EntryAddress);
+assert(JT && "Must have a jump table at fixed entry address");
+BC.deregisterJumpTable(EntryAddress);
+JumpTables.erase(EntryAddress);
+delete JT;
+
 // Replace FixedEntryDispExpr used in target address calculation with outer
 // jump table reference.
-JumpTable *JT = BC.getJumpTableContainingAddress(ArrayStart);
+JT = BC.getJumpTableContainingAddress(ArrayStart);
 assert(JT && "Must have a containing jump table for PIC fixed branch");
 BC.MIB->replaceMemOperandDisp(*FixedEntryLoadInstr, JT->getFirstLabel(),
   EntryAddress - ArrayStart, &*BC.Ctx);
@@ -1158,10 +1166,10 @@ void BinaryF

[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-08 Thread Amir Ayupov via llvm-branch-commits


@@ -35,6 +36,12 @@ std::string hashBlock(BinaryContext &BC, const 
BinaryBasicBlock &BB,
 
 std::string hashBlockLoose(BinaryContext &BC, const BinaryBasicBlock &BB);
 
+std::string hashBlockCalls(BinaryContext &BC, const BinaryBasicBlock &BB);
+
+std::string
+hashBlockCalls(const DenseMap &IdToFunctionName,

aaupov wrote:

Please use StringRef.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-08 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

Thanks. 

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-09 Thread Amir Ayupov via llvm-branch-commits


@@ -40,6 +40,8 @@ class YAMLProfileReader : public ProfileReaderBase {
   /// Check if the file contains YAML.
   static bool isYAML(StringRef Filename);
 
+  using FunctionMap = DenseMap;

aaupov wrote:

```suggestion
  using ProfileLookupMap = DenseMap;
```

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-09 Thread Amir Ayupov via llvm-branch-commits


@@ -181,20 +182,19 @@ std::string hashBlockCalls(BinaryContext &BC, const 
BinaryBasicBlock &BB) {
 
 /// The same as the $hashBlockCalls function, but for profiled functions.
 std::string
-hashBlockCalls(const DenseMap &IdToFunctionName,
+hashBlockCalls(const DenseMap
+   &IdToYamlFunction,
const yaml::bolt::BinaryBasicBlockProfile &YamlBB) {
-  std::multiset FunctionNames;
+  std::vector FunctionNames;
   for (const yaml::bolt::CallSiteInfo &CallSiteInfo : YamlBB.CallSites) {
-auto It = IdToFunctionName.find(CallSiteInfo.DestId);
-if (It == IdToFunctionName.end())
+auto It = IdToYamlFunction.find(CallSiteInfo.DestId);
+if (It == IdToYamlFunction.end())
   continue;
-StringRef Name = It->second;
-const size_t Pos = Name.find("(*");
-if (Pos != StringRef::npos)
-  Name = Name.substr(0, Pos);
-FunctionNames.insert(std::string(Name));
+StringRef Name =
+NameResolver::removeSuffix(It->second->Name, StringRef("(*"));

aaupov wrote:

I think the intent of Maksim's comment was to hide an implementation detail 
(using `(*N)` suffix) in relevant component (NameResolver).

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-10 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-10 Thread Amir Ayupov via llvm-branch-commits


@@ -63,8 +63,8 @@ class NameResolver {
   }
 
   // Removes a suffix from a function name.
-  static StringRef removeSuffix(StringRef Name, StringRef Suffix) {
-const size_t Pos = Name.find(Suffix);
+  static StringRef unify(StringRef Name) {

aaupov wrote:

Let's use a self-descriptive name e.g. `dropNumNames`.
This would follow the convention used in BinaryFunction.h `getPrintName`.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-10 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Thanks, LG with one nit.

https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match blocks with calls as anchors (PR #96596)

2024-07-10 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.


https://github.com/llvm/llvm-project/pull/96596
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody (PR #98448)

2024-07-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

Suggestion: `createReturnInstructionList`

https://github.com/llvm/llvm-project/pull/98448
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Added more details on heatmap docs. (PR #98162)

2024-07-11 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov approved this pull request.

Awesome, thanks for updating the documentation

https://github.com/llvm/llvm-project/pull/98162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Added more details on heatmap docs. (PR #98162)

2024-07-11 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

Please retitle as imperative statement and drop trailing dot, e.g. 
`[BOLT][docs] Expand Heatmaps.md`

https://github.com/llvm/llvm-project/pull/98162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Support POSSIBLE_PIC_FIXED_BRANCH (PR #91667)

2024-07-14 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/91667

>From dd4d0de42048c063d5e5095a0c2594c7cc578df5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Thu, 9 May 2024 19:35:26 -0700
Subject: [PATCH 1/4] Fix RISCVMCPlusBuilder

Created using spr 1.3.4
---
 bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp 
b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
index 74f2f0aae91e6..020e62463ee2f 100644
--- a/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
+++ b/bolt/lib/Target/RISCV/RISCVMCPlusBuilder.cpp
@@ -177,13 +177,14 @@ class RISCVMCPlusBuilder : public MCPlusBuilder {
   MCInst &Instruction, InstructionIterator Begin, InstructionIterator End,
   const unsigned PtrSize, MCInst *&MemLocInstr, unsigned &BaseRegNum,
   unsigned &IndexRegNum, int64_t &DispValue, const MCExpr *&DispExpr,
-  MCInst *&PCRelBaseOut) const override {
+  MCInst *&PCRelBaseOut, MCInst *&FixedEntryLoadInst) const override {
 MemLocInstr = nullptr;
 BaseRegNum = 0;
 IndexRegNum = 0;
 DispValue = 0;
 DispExpr = nullptr;
 PCRelBaseOut = nullptr;
+FixedEntryLoadInst = nullptr;
 
 // Check for the following long tail call sequence:
 // 1: auipc xi, %pcrel_hi(sym)

>From 62391bb5aa01f2b77d4315d1e72a9924eec9ecc0 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Fri, 5 Jul 2024 14:54:51 -0700
Subject: [PATCH 2/4] Drop deregisterJumpTable

Created using spr 1.3.4
---
 bolt/lib/Core/BinaryFunction.cpp | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index 09a6ca1d68730..f587d5a2cadd4 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -899,17 +899,9 @@ BinaryFunction::processIndirectBranch(MCInst &Instruction, 
unsigned Size,
 
 TargetAddress = ArrayStart + *Value;
 
-// Remove spurious JumpTable at EntryAddress caused by PIC reference from
-// the load instruction.
-JumpTable *JT = BC.getJumpTableContainingAddress(EntryAddress);
-assert(JT && "Must have a jump table at fixed entry address");
-BC.deregisterJumpTable(EntryAddress);
-JumpTables.erase(EntryAddress);
-delete JT;
-
 // Replace FixedEntryDispExpr used in target address calculation with outer
 // jump table reference.
-JT = BC.getJumpTableContainingAddress(ArrayStart);
+JumpTable *JT = BC.getJumpTableContainingAddress(ArrayStart);
 assert(JT && "Must have a containing jump table for PIC fixed branch");
 BC.MIB->replaceMemOperandDisp(*FixedEntryLoadInstr, JT->getFirstLabel(),
   EntryAddress - ArrayStart, &*BC.Ctx);

>From 5336879ab68aedb1217e2c6c139d171f31e89e03 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Sat, 6 Jul 2024 22:26:14 -0700
Subject: [PATCH 3/4] Surgically drop spurious jump table

Created using spr 1.3.4
---
 bolt/include/bolt/Core/BinaryContext.h  |  5 +
 bolt/lib/Core/BinaryFunction.cpp| 12 ++--
 bolt/test/X86/jump-table-fixed-ref-pic.test | 11 ---
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h 
b/bolt/include/bolt/Core/BinaryContext.h
index 73932c4ca2fb3..c5e2c6cd02179 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -431,6 +431,11 @@ class BinaryContext {
 return nullptr;
   }
 
+  /// Deregister JumpTable registered at a given \p Address.
+  bool deregisterJumpTable(uint64_t Address) {
+return JumpTables.erase(Address);
+  }
+
   unsigned getDWARFEncodingSize(unsigned Encoding) {
 if (Encoding == dwarf::DW_EH_PE_omit)
   return 0;
diff --git a/bolt/lib/Core/BinaryFunction.cpp b/bolt/lib/Core/BinaryFunction.cpp
index f587d5a2cadd4..2ecca32a5985c 100644
--- a/bolt/lib/Core/BinaryFunction.cpp
+++ b/bolt/lib/Core/BinaryFunction.cpp
@@ -899,9 +899,17 @@ BinaryFunction::processIndirectBranch(MCInst &Instruction, 
unsigned Size,
 
 TargetAddress = ArrayStart + *Value;
 
+// Remove spurious JumpTable at EntryAddress caused by PIC reference from
+// the load instruction.
+JumpTable *JT = BC.getJumpTableContainingAddress(EntryAddress);
+assert(JT && "Must have a jump table at fixed entry address");
+BC.deregisterJumpTable(EntryAddress);
+JumpTables.erase(EntryAddress);
+delete JT;
+
 // Replace FixedEntryDispExpr used in target address calculation with outer
 // jump table reference.
-JumpTable *JT = BC.getJumpTableContainingAddress(ArrayStart);
+JT = BC.getJumpTableContainingAddress(ArrayStart);
 assert(JT && "Must have a containing jump table for PIC fixed branch");
 BC.MIB->replaceMemOperandDisp(*FixedEntryLoadInstr, JT->getFirstLabel(),
   EntryAddress - ArrayStart, &*BC.Ctx);
@@ -1158,10 +1166,10 @@ void BinaryFunction:

[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Thank you for working on this! It looks very good overall, left a couple of 
comments inline. Please run this new matching on a large binary to answer 
questions about runtime and matching quality in ambiguous cases.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits


@@ -568,12 +675,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   }
   YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
 
-  // Computes hash for binary functions.
+  // Map profiled function ids to names.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
+IdToYamLBF[YamlBF.Id] = &YamlBF;
+
+  // Creates a vector of lamdas that preprocess binary functions for function
+  // matching to avoid multiple preprocessing passes over binary functions in
+  // different function matching techniques.
+  std::vector> BFPreprocessingFuncs;

aaupov wrote:

Let's move it into a separate NFC change.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


  1   2   3   4   5   6   7   8   9   10   >