[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy edited https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;
+  v2 = (first2 == last2) ? -1 : *first2;
+
+  if (!(v1 == v2))
+return false;
+}
+return true;
+  };
+
+  for (auto memberData : parentMembers.memberPlacementIndices)
+if (isEqual(memberData.begin(), memberData.end(), memberIndices.begin(),
+memberIndices.end()))
+  return true;
+  return false;
+}

ergawy wrote:

If my above comments make sense, then I think the following should be enough:
```suggestion
bool isDuplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
llvm::SmallVectorImpl &memberIndices) {
  for (auto memberData : parentMembers.memberPlacementIndices)
if (std::equal(memberData.begin(), memberData.end(), memberIndices.begin()))
  return true;
  return false;
}
```

In any case, I suggest renaming the function to better describe its purpose and behavior.

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;
+  v2 = (first2 == last2) ? -1 : *first2;
+
+  if (!(v1 == v2))
+return false;
+}
+return true;
+  };
+
+  for (auto memberData : parentMembers.memberPlacementIndices)
+if (isEqual(memberData.begin(), memberData.end(), memberIndices.begin(),
+memberIndices.end()))
+  return true;
+  return false;
+}
+
+// When mapping members of derived types, there is a chance that one of the
+// members along the way to a mapped member is an descriptor. In which case
+// we have to make sure we generate a map for those along the way otherwise
+// we will be missing a chunk of data required to actually map the member
+// type to device. This function effectively generates these maps and the
+// appropriate data accesses required to generate these maps. It will avoid
+// creating duplicate maps, as duplicates are just as bad as unmapped
+// descriptor data in a lot of cases for the runtime (and unnecessary
+// data movement should be avoided where possible)
+mlir::Value createParentSymAndGenIntermediateMaps(
+mlir::Location clauseLocation, Fortran::lower::AbstractConverter &converter,
+omp::ObjectList &objectList, llvm::SmallVector &indices,
+OmpMapMemberIndicesData &parentMemberIndices, std::string asFortran,
+llvm::omp::OpenMPOffloadMappingFlags mapTypeBits) {
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  Fortran::lower::AddrAndBoundsInfo parentBaseAddr =
+  Fortran::lower::getDataOperandBaseAddr(
+  converter, firOpBuilder, *objectList[0].sym(), clauseLocation);
+  mlir::Value curValue = parentBaseAddr.addr;
+
+  for (size_t i = 0; i < objectList.size(); ++i) {
+mlir::Type unwrappedTy =
+fir::unwrapSequenceType(fir::unwrapPassByRefType(curValue.getType()));
+if (fir::RecordType recordType =
+mlir::dyn_cast_or_null(unwrappedTy)) {

ergawy wrote:

If `i < objectList.size() - 1`, is this ever going to be `false`? I mean, won't we have record types all the way up to the last object in the list (and that last one may or may not be a record type)?

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;

ergawy wrote:

The condition is never going to be true AFAICT.

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits


@@ -51,21 +55,66 @@ class OMPMapInfoFinalizationPass
 : public fir::impl::OMPMapInfoFinalizationPassBase<
   OMPMapInfoFinalizationPass> {
 
-  void genDescriptorMemberMaps(mlir::omp::MapInfoOp op,
-   fir::FirOpBuilder &builder,
-   mlir::Operation *target) {
-mlir::Location loc = op.getLoc();
-mlir::Value descriptor = op.getVarPtr();
+  /// Small helper class tracking a members parent and its

ergawy wrote:

nit: no need to quantify the size of the helper class 😛.
```suggestion
  /// Tracks a member's parent and its
```

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread Kareem Ergawy via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;
+  v2 = (first2 == last2) ? -1 : *first2;
+
+  if (!(v1 == v2))
+return false;
+}
+return true;
+  };
+
+  for (auto memberData : parentMembers.memberPlacementIndices)
+if (isEqual(memberData.begin(), memberData.end(), memberIndices.begin(),
+memberIndices.end()))
+  return true;
+  return false;
+}
+
+// When mapping members of derived types, there is a chance that one of the
+// members along the way to a mapped member is an descriptor. In which case
+// we have to make sure we generate a map for those along the way otherwise
+// we will be missing a chunk of data required to actually map the member
+// type to device. This function effectively generates these maps and the
+// appropriate data accesses required to generate these maps. It will avoid
+// creating duplicate maps, as duplicates are just as bad as unmapped
+// descriptor data in a lot of cases for the runtime (and unnecessary
+// data movement should be avoided where possible)
+mlir::Value createParentSymAndGenIntermediateMaps(
+mlir::Location clauseLocation, Fortran::lower::AbstractConverter &converter,
+omp::ObjectList &objectList, llvm::SmallVector &indices,
+OmpMapMemberIndicesData &parentMemberIndices, std::string asFortran,
+llvm::omp::OpenMPOffloadMappingFlags mapTypeBits) {
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  Fortran::lower::AddrAndBoundsInfo parentBaseAddr =
+  Fortran::lower::getDataOperandBaseAddr(
+  converter, firOpBuilder, *objectList[0].sym(), clauseLocation);
+  mlir::Value curValue = parentBaseAddr.addr;
+
+  for (size_t i = 0; i < objectList.size(); ++i) {
+mlir::Type unwrappedTy =
+fir::unwrapSequenceType(fir::unwrapPassByRefType(curValue.getType()));
+if (fir::RecordType recordType =
+mlir::dyn_cast_or_null(unwrappedTy)) {
+  mlir::Value idxConst = firOpBuilder.createIntegerConstant(
+  clauseLocation, firOpBuilder.getIndexType(), indices[i]);
+  mlir::Type memberTy = recordType.getTypeList().at(indices[i]).second;
+  curValue = firOpBuilder.create(
+  clauseLocation, firOpBuilder.getRefType(memberTy), curValue,
+  idxConst);
+
+  if ((i != indices.size() - 1) && fir::isTypeWithDescriptor(memberTy)) {
+llvm::SmallVector intermIndices = indices;
+std::fill(std::next(intermIndices.begin(), i + 1), intermIndices.end(),
+  -1);
+if (!duplicateMemberMapInfo(parentMemberIndices, intermIndices)) {
+  // TODO: Perhaps generate bounds for these intermediate maps, as it
+  // may be required for cases such as:
+  //dtype(1)%second(3)%array
+  // where second is an allocatable (and dtype may be an allocatable as
+  // well, although in this case I am not sure the fortran syntax would
+  // be legal)
+  mlir::omp::MapInfoOp mapOp = createMapInfoOp(
+  firOpBuilder, clauseLocation, curValue,
+  /*varPtrPtr=*/mlir::Value{}, asFortran,
+  /*bounds=*/llvm::SmallVector{},
+  /*members=*/{},
+  /*membersIndex=*/mlir::DenseIntElementsAttr{},
+  static_cast<
+  
std::underlying_type_t>(
+  mapTypeBits),
+  mlir::omp::VariableCaptureKind::ByRef, curValue.getType());
+
+  parentMemberIndices.memberPlacementIndices.push_back(intermIndices);
+  parentMemberIndices.memberMap.push_back(mapOp);
+}
+
+if (i != indices.size() - 1)

ergawy wrote:

We already know this is true from the enclosing `if`.

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -671,4 +672,51 @@ static inline bool isEqual(const Fortran::lower::SomeExpr *x,
 }
 } // end namespace Fortran::lower
 
+// OpenMP utility functions used in locations outside of the
+// OpenMP lowering.
+namespace Fortran::lower::omp {
+
+[[maybe_unused]] static void fillMemberIndices(

agozillon wrote:

Can do, if it seems reasonable to add them to this file. I wasn't sure this was the best place to add them, but it seemed the most appropriate when I was looking around.

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -30,17 +30,17 @@ subroutine mapType_array
   !$omp end target
 end subroutine mapType_array
 
-!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [3 x i64] [i64 0, i64 24, i64 4]
-!CHECK: @.offload_maptypes{{.*}} = private unnamed_addr constant [3 x i64] [i64 32, i64 281474976710657, i64 281474976711187]
+!CHECK: @.offload_sizes{{.*}} = private unnamed_addr constant [4 x i64] [i64 0, i64 24, i64 8, i64 4]

agozillon wrote:

Can do, but these are generally the data sizes corresponding to the type, so I'm not sure it'd be great to comment each individual one; there's a lot in this test and it may just add excess noise.

It will likely fail if you haven't got the corresponding lowering PR applied as well, which is part of the PR stack. Otherwise, the PR has been sitting in review for about 3-4 weeks and is a little stale, so that might be part of it as well! I'll address it when I update the PR if that is the case, thank you for pointing it out!

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -671,4 +672,51 @@ static inline bool isEqual(const Fortran::lower::SomeExpr *x,
 }
 } // end namespace Fortran::lower
 
+// OpenMP utility functions used in locations outside of the
+// OpenMP lowering.
+namespace Fortran::lower::omp {
+
+[[maybe_unused]] static void fillMemberIndices(
+llvm::SmallVector> &memberPlacementData) {
+  size_t largestIndicesSize =
+  std::max_element(memberPlacementData.begin(), memberPlacementData.end(),
+   [](auto a, auto b) { return a.size() < b.size(); })
+  ->size();
+
+  // DenseElementsAttr expects a rectangular shape for the data, so all
+  // index lists have to be of the same length, this emplaces -1 as filler.

agozillon wrote:

The DenseElementsAttr construct unfortunately needs to be the same size in all dimensions, so in cases where the index data is not the same length in all dimensions we need some filler. -1 is an arbitrary value that makes some sense for marking the end of an index list or a non-index value, as real indices can only be positive. Unfortunately, DenseElementsAttr was the only appropriate MLIR attribute I could find for this case, and it's a little rough around the edges, it seems.
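To illustrate the padding step, here is a condensed sketch of the idea behind the quoted fillMemberIndices (names and the int64_t element type are illustrative assumptions, not the PR's exact code): every index list is extended with -1 up to the longest length so the data forms the rectangular {numLists x maxLen} shape the attribute requires.

```cpp
#include "llvm/ADT/SmallVector.h"
#include <algorithm>
#include <cstdint>

// Pad every member-index list with -1 (the "no index" filler) so all lists
// share the longest length; the padded data can then back a rectangular
// DenseIntElementsAttr of shape {numLists, maxLen}.
static void padMemberIndices(
    llvm::SmallVector<llvm::SmallVector<int64_t>> &memberPlacementData) {
  size_t maxLen = 0;
  for (const auto &indices : memberPlacementData)
    maxLen = std::max(maxLen, indices.size());
  for (auto &indices : memberPlacementData)
    indices.resize(maxLen, /*NV=*/-1); // existing entries are left untouched
}
```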

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -52,12 +52,22 @@ using DeclareTargetCapturePair =
 struct OmpMapMemberIndicesData {
   // The indices representing the component members placement in its derived
   // type parents hierarchy.
-  llvm::SmallVector memberPlacementIndices;
+  llvm::SmallVector> memberPlacementIndices;
 
   // Placement of the member in the member vector.
-  mlir::omp::MapInfoOp memberMap;
+  llvm::SmallVector memberMap;
 };
 
+llvm::SmallVector
+generateMemberPlacementIndices(const Object &object,

agozillon wrote:

Nope! It's a bit of an artifact from merging the last PR stack on top of what I was working on after it had been altered in review, and I apparently missed this one! Thank you for catching it :D

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Shaw Young via llvm-branch-commits

https://github.com/shawbyoung updated 
https://github.com/llvm/llvm-project/pull/98125

>From cf32a43e7c2b04079c6123fe13df4fb7226d771f Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Tue, 9 Jul 2024 10:04:25 -0700
Subject: [PATCH 01/11] Comments

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileReader.cpp | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp 
b/bolt/lib/Profile/YAMLProfileReader.cpp
index 69ea0899c5f2c..6753337c24ea7 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -501,7 +501,6 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext 
&BC) {
 
   // Maps binary functions to adjacent functions in the FCG.
   for (const BinaryFunction *CallerBF : BFs) {
-// Add all call targets to the hash map.
 for (const BinaryBasicBlock &BB : CallerBF->blocks()) {
   for (const MCInst &Inst : BB) {
 if (!BC.MIB->isCall(Instr))
@@ -533,7 +532,8 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext 
&BC) {
 }
   }
 
-  // Create mapping from neighbor hash to BFs.
+  // Using the constructed adjacent function mapping, creates mapping from
+  // neighbor hash to BFs.
   std::unordered_map>
   NeighborHashToBFs;
   for (const BinaryFunction *BF : BFs) {
@@ -552,12 +552,12 @@ size_t 
YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
 .push_back(BF);
   }
 
-  // TODO: change call anchor PR to have this representation - we need it here
+  // TODO: note, this will be introduced in the matching functions with calls
+  // as anchors pr
   DenseMap
   IdToYAMLBF;
-  // TODO: change call anchor PR to have this representation - we need it here
 
-  // Maps hashes to profiled functions.
+  // Maps YAML functions to adjacent functions in the profile FCG.
   std::unordered_map
   YamlBFToHashes(BFs.size());
@@ -590,7 +590,7 @@ size_t YAMLProfileReader::matchWithCallGraph(BinaryContext 
&BC) {
 }
   }
 
-  // Matching YAMLBF with neighbor hashes.
+  // Matches YAMLBF to BFs with neighbor hashes.
   for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
 if (YamlBF.Used)
   continue;

>From ee9049fc4bd3d4203c19c9c0982a78ab3b47666f Mon Sep 17 00:00:00 2001
From: shawbyoung 
Date: Tue, 9 Jul 2024 13:52:05 -0700
Subject: [PATCH 02/11] Moved blended hash definition

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/YAMLProfileReader.h |  69 ++-
 bolt/lib/Profile/StaleProfileMatching.cpp |  65 ---
 bolt/lib/Profile/YAMLProfileReader.cpp| 110 --
 3 files changed, 119 insertions(+), 125 deletions(-)

diff --git a/bolt/include/bolt/Profile/YAMLProfileReader.h 
b/bolt/include/bolt/Profile/YAMLProfileReader.h
index 36e8f8739eee1..e8a34ecad9a08 100644
--- a/bolt/include/bolt/Profile/YAMLProfileReader.h
+++ b/bolt/include/bolt/Profile/YAMLProfileReader.h
@@ -16,6 +16,73 @@
 namespace llvm {
 namespace bolt {
 
+/// An object wrapping several components of a basic block hash. The combined
+/// (blended) hash is represented and stored as one uint64_t, while individual
+/// components are of smaller size (e.g., uint16_t or uint8_t).
+struct BlendedBlockHash {
+private:
+  using ValueOffset = Bitfield::Element;
+  using ValueOpcode = Bitfield::Element;
+  using ValueInstr = Bitfield::Element;
+  using ValuePred = Bitfield::Element;
+  using ValueSucc = Bitfield::Element;
+
+public:
+  explicit BlendedBlockHash() {}
+
+  explicit BlendedBlockHash(uint64_t Hash) {
+Offset = Bitfield::get(Hash);
+OpcodeHash = Bitfield::get(Hash);
+InstrHash = Bitfield::get(Hash);
+PredHash = Bitfield::get(Hash);
+SuccHash = Bitfield::get(Hash);
+  }
+
+  /// Combine the blended hash into uint64_t.
+  uint64_t combine() const {
+uint64_t Hash = 0;
+Bitfield::set(Hash, Offset);
+Bitfield::set(Hash, OpcodeHash);
+Bitfield::set(Hash, InstrHash);
+Bitfield::set(Hash, PredHash);
+Bitfield::set(Hash, SuccHash);
+return Hash;
+  }
+
+  /// Compute a distance between two given blended hashes. The smaller the
+  /// distance, the more similar two blocks are. For identical basic blocks,
+  /// the distance is zero.
+  uint64_t distance(const BlendedBlockHash &BBH) const {
+assert(OpcodeHash == BBH.OpcodeHash &&
+   "incorrect blended hash distance computation");
+uint64_t Dist = 0;
+// Account for NeighborHash
+Dist += SuccHash == BBH.SuccHash ? 0 : 1;
+Dist += PredHash == BBH.PredHash ? 0 : 1;
+Dist <<= 16;
+// Account for InstrHash
+Dist += InstrHash == BBH.InstrHash ? 0 : 1;
+Dist <<= 16;
+// Account for Offset
+Dist += (Offset >= BBH.Offset ? Offset - BBH.Offset : BBH.Offset - Offset);
+return Dist;
+  }
+
+  /// The offset of the basic block from the function start.
+  uint16_t Offset{0};
+  /// (Loose) Hash of the basic block instructions, excluding operands.
+  uint16_t OpcodeHash{0};
+  /// (Str

[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits

https://github.com/agozillon commented:

Tried to reply to some of your initial comments and questions @ergawy ! Hopefully helpful :-) And no question is a newbie question; they're all very appreciated, as they make me re-question myself, which helps spot things I overlooked!

I'll update the PR at some point this week; I'm a little (perhaps over) invested in a current task right now. If you get some free time to review the other components, https://github.com/llvm/llvm-project/pull/96265 and https://github.com/llvm/llvm-project/pull/96265, it would be greatly appreciated!

Thank you very much for your review and help :-)!

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {

agozillon wrote:

It's a slightly modified copy of the std::equal implementation, extended to substitute -1 (the filler element indicating a non-index value) in cases where we over-index the ranges (but as you've stated, the scenario for the first1/last1 range is impossible; thank you for that, you can tell I was pulling my hair out trying to find the UB I'd made in the original implementation :-)).

However, I think you're correct that it would not work perfectly in these scenarios. I believe the pre-condition is met in most scenarios (hence not having issues with it beforehand), but this isn't a guarantee, and I don't believe the behavior of std::equal would reflect equality appropriately in these cases, so I'll adjust it to iterate over the larger of the two ranges, or just use the largest std::distance.

Thank you very much for the catch! :D 


https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;
+  v2 = (first2 == last2) ? -1 : *first2;
+
+  if (!(v1 == v2))
+return false;
+}
+return true;
+  };
+
+  for (auto memberData : parentMembers.memberPlacementIndices)
+if (isEqual(memberData.begin(), memberData.end(), memberIndices.begin(),
+memberIndices.end()))
+  return true;
+  return false;
+}
+
+// When mapping members of derived types, there is a chance that one of the
+// members along the way to a mapped member is an descriptor. In which case
+// we have to make sure we generate a map for those along the way otherwise
+// we will be missing a chunk of data required to actually map the member
+// type to device. This function effectively generates these maps and the
+// appropriate data accesses required to generate these maps. It will avoid
+// creating duplicate maps, as duplicates are just as bad as unmapped
+// descriptor data in a lot of cases for the runtime (and unnecessary
+// data movement should be avoided where possible)
+mlir::Value createParentSymAndGenIntermediateMaps(
+mlir::Location clauseLocation, Fortran::lower::AbstractConverter &converter,
+omp::ObjectList &objectList, llvm::SmallVector &indices,
+OmpMapMemberIndicesData &parentMemberIndices, std::string asFortran,
+llvm::omp::OpenMPOffloadMappingFlags mapTypeBits) {
+  fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
+  Fortran::lower::AddrAndBoundsInfo parentBaseAddr =
+  Fortran::lower::getDataOperandBaseAddr(
+  converter, firOpBuilder, *objectList[0].sym(), clauseLocation);
+  mlir::Value curValue = parentBaseAddr.addr;
+
+  for (size_t i = 0; i < objectList.size(); ++i) {
+mlir::Type unwrappedTy =
+fir::unwrapSequenceType(fir::unwrapPassByRefType(curValue.getType()));
+if (fir::RecordType recordType =
+mlir::dyn_cast_or_null(unwrappedTy)) {

agozillon wrote:

I should likely have an llvm_unreachable at the end of the if stating the type is unsupported, or, more simply, a cast plus a null-check assert, primarily because ClassType is a possibility and I've yet to test it with the explicit member mapping, so it won't work in this iteration of the PR. Although I imagine (hope) it will work almost identically in the future, minus some tweaks to account for the type, at least from my limited understanding of the type so far!

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;
+  v2 = (first2 == last2) ? -1 : *first2;
+
+  if (!(v1 == v2))
+return false;
+}
+return true;
+  };
+
+  for (auto memberData : parentMembers.memberPlacementIndices)
+if (isEqual(memberData.begin(), memberData.end(), memberIndices.begin(),
+memberIndices.end()))
+  return true;
+  return false;
+}

agozillon wrote:

Happy to change the name, but unfortunately std::equal will not work and will result in the previously stated UB; I had this previously :-) The implementation is effectively the same as the above (provided I am looking at the correct overload, at least), minus line 168, which substitutes -1 when we over-index. Without that, the original std::equal implementation over-indexes the second range in certain cases and ends up reading random values, which yields incorrect results. So I'll try to do what I stated in the previous comment and see how that goes!
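For what it's worth, a minimal sketch of that adjusted comparison (illustrative only, not the PR's code; the name equalWithFiller and the ArrayRef<int64_t> element type are assumptions): iterate over the longer of the two ranges and treat missing trailing entries as -1, so neither range is ever dereferenced past its end.

```cpp
#include "llvm/ADT/ArrayRef.h"
#include <algorithm>
#include <cstdint>

// Compare two index lists as if the shorter were padded with -1 (the filler
// marking "no index"), avoiding any out-of-range dereference.
static bool equalWithFiller(llvm::ArrayRef<int64_t> a,
                            llvm::ArrayRef<int64_t> b) {
  size_t maxLen = std::max(a.size(), b.size());
  for (size_t i = 0; i < maxLen; ++i) {
    int64_t va = i < a.size() ? a[i] : -1;
    int64_t vb = i < b.size() ? b[i] : -1;
    if (va != vb)
      return false;
  }
  return true;
}
```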

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits

https://github.com/agozillon edited 
https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -953,6 +954,22 @@ bool ClauseProcessor::processMap(
   if (origSymbol && fir::isTypeWithDescriptor(origSymbol.getType()))
 symAddr = origSymbol;
 
+  if (object.sym()->owner().IsDerivedType()) {
+omp::ObjectList objectList = gatherObjects(object, semaCtx);
+parentObj = objectList[0];
+parentMemberIndices.insert({parentObj.value(), {}});

agozillon wrote:

I believe insert will block subsequent inserts if the key is already registered, so it shouldn't overwrite, but I could be misunderstanding it, and I'll double-check/verify this again when I get around to updating the PR :-)

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [Flang][OpenMP] Derived type explicit allocatable member mapping (PR #96266)

2024-07-15 Thread via llvm-branch-commits


@@ -141,6 +143,110 @@ createMapInfoOp(fir::FirOpBuilder &builder, mlir::Location loc,
   return op;
 }
 
+omp::ObjectList gatherObjects(omp::Object obj,
+  semantics::SemanticsContext &semaCtx) {
+  omp::ObjectList objList;
+  std::optional baseObj = getBaseObject(obj, semaCtx);
+  while (baseObj.has_value()) {
+objList.push_back(baseObj.value());
+baseObj = getBaseObject(baseObj.value(), semaCtx);
+  }
+  return omp::ObjectList{llvm::reverse(objList)};
+}
+
+bool duplicateMemberMapInfo(OmpMapMemberIndicesData &parentMembers,
+llvm::SmallVectorImpl &memberIndices) {
+  // A variation of std:equal that supports non-equal length index lists for 
our
+  // specific use-case, if one is larger than the other, we use -1, the default
+  // filler element in place of the smaller vector, this prevents UB from over
+  // indexing and removes the need for us to do any filling of intermediate
+  // index lists we'll discard.
+  auto isEqual = [](auto first1, auto last1, auto first2, auto last2) {
+int v1, v2;
+for (; first1 != last1; ++first1, ++first2) {
+  v1 = (first1 == last1) ? -1 : *first1;

agozillon wrote:

True, thank you for the catch :-)! 

https://github.com/llvm/llvm-project/pull/96266
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits

https://github.com/dcci commented:

This is heading in the right direction.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits

https://github.com/dcci edited https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits


@@ -16,6 +16,37 @@
 namespace llvm {
 namespace bolt {
 
+/// A class for matching binary functions in functions in the YAML profile.

dcci wrote:

I think it would be useful to add a general description of the algorithm somewhere, in particular a reference to the paper if you have one.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits


@@ -446,6 +503,56 @@ size_t YAMLProfileReader::matchWithLTOCommonName() {
   return MatchedWithLTOCommonName;
 }
 
+size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
+  if (!opts::MatchWithCallGraph)
+return 0;
+
+  size_t MatchedWithCallGraph = 0;
+  CGMatcher.computeBFNeighborHashes(BC);
+  CGMatcher.constructYAMLFCG(YamlBP, IdToYamLBF);
+
+  // Matches YAMLBF to BFs with neighbor hashes.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
+auto It = CGMatcher.YamlBFAdjacencyMap.find(&YamlBF);
+if (It == CGMatcher.YamlBFAdjacencyMap.end())
+  continue;
+// Computes profiled function's neighbor hash.
+std::set &AdjacentFunctions =

dcci wrote:

why is this a set and not a `SmallSet`/`DenseSet`?

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits


@@ -568,12 +675,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   }
   YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
 
-  // Computes hash for binary functions.
+  // Map profiled function ids to names.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
+IdToYamLBF[YamlBF.Id] = &YamlBF;
+
+  // Creates a vector of lamdas that preprocess binary functions for function

dcci wrote:

`lamdas` -> `lambdas`

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits


@@ -50,6 +54,59 @@ llvm::cl::opt ProfileUseDFS("profile-use-dfs",
 namespace llvm {
 namespace bolt {
 
+void CallGraphMatcher::addBFCGEdges(BinaryContext &BC,
+yaml::bolt::BinaryProfile &YamlBP,
+BinaryFunction *BF) {
+  for (const BinaryBasicBlock &BB : BF->blocks()) {
+for (const MCInst &Instr : BB) {
+  if (!BC.MIB->isCall(Instr))
+continue;
+  const MCSymbol *CallSymbol = BC.MIB->getTargetSymbol(Instr);
+  if (!CallSymbol)
+continue;
+  BinaryData *BD = BC.getBinaryDataByName(CallSymbol->getName());
+  if (!BD)
+continue;
+  BinaryFunction *CalleeBF = BC.getFunctionForSymbol(BD->getSymbol());

dcci wrote:

Is it possible to have a caller without a callee?
This sounds like it should be an assertion.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Davide Italiano via llvm-branch-commits


@@ -446,6 +503,56 @@ size_t YAMLProfileReader::matchWithLTOCommonName() {
   return MatchedWithLTOCommonName;
 }
 
+size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
+  if (!opts::MatchWithCallGraph)
+return 0;
+
+  size_t MatchedWithCallGraph = 0;
+  CGMatcher.computeBFNeighborHashes(BC);
+  CGMatcher.constructYAMLFCG(YamlBP, IdToYamLBF);
+
+  // Matches YAMLBF to BFs with neighbor hashes.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
+auto It = CGMatcher.YamlBFAdjacencyMap.find(&YamlBF);
+if (It == CGMatcher.YamlBFAdjacencyMap.end())
+  continue;
+// Computes profiled function's neighbor hash.
+std::set &AdjacentFunctions =
+It->second;
+std::string AdjacentFunctionHashStr;
+for (auto &AdjacentFunction : AdjacentFunctions) {

dcci wrote:

Generally, single-line bodies of conditionals don't have braces in LLVM.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Shaw Young via llvm-branch-commits


@@ -446,6 +503,56 @@ size_t YAMLProfileReader::matchWithLTOCommonName() {
   return MatchedWithLTOCommonName;
 }
 
+size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
+  if (!opts::MatchWithCallGraph)
+return 0;
+
+  size_t MatchedWithCallGraph = 0;
+  CGMatcher.computeBFNeighborHashes(BC);
+  CGMatcher.constructYAMLFCG(YamlBP, IdToYamLBF);
+
+  // Matches YAMLBF to BFs with neighbor hashes.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
+auto It = CGMatcher.YamlBFAdjacencyMap.find(&YamlBF);
+if (It == CGMatcher.YamlBFAdjacencyMap.end())
+  continue;
+// Computes profiled function's neighbor hash.
+std::set &AdjacentFunctions =

shawbyoung wrote:

I'm using std::set so that I can both perform lookups and iterate through the 
BF profiles in order.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Shaw Young via llvm-branch-commits


@@ -446,6 +503,56 @@ size_t YAMLProfileReader::matchWithLTOCommonName() {
   return MatchedWithLTOCommonName;
 }
 
+size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
+  if (!opts::MatchWithCallGraph)
+return 0;
+
+  size_t MatchedWithCallGraph = 0;
+  CGMatcher.computeBFNeighborHashes(BC);
+  CGMatcher.constructYAMLFCG(YamlBP, IdToYamLBF);
+
+  // Matches YAMLBF to BFs with neighbor hashes.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
+auto It = CGMatcher.YamlBFAdjacencyMap.find(&YamlBF);
+if (It == CGMatcher.YamlBFAdjacencyMap.end())
+  continue;
+// Computes profiled function's neighbor hash.
+std::set &AdjacentFunctions =

shawbyoung wrote:

Although, this did make me think to use std::find and a SmallVector instead; I anticipate these sets to generally be quite small. I'll run some small tests and see if that improves performance.
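A minimal sketch of that alternative (illustrative only; the element type is simplified to a name string rather than the profile type used in the PR): keep the adjacent functions in a SmallVector, sort once for deterministic iteration, and use a linear find-style lookup, which is typically fine for small sets.

```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"

int main() {
  llvm::SmallVector<llvm::StringRef, 8> Adjacent = {"foo", "bar", "baz"};
  llvm::sort(Adjacent);                             // deterministic iteration order
  bool Known = llvm::is_contained(Adjacent, "bar"); // wraps std::find, linear scan
  (void)Known;
  return 0;
}
```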

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Shaw Young via llvm-branch-commits


@@ -50,6 +54,59 @@ llvm::cl::opt ProfileUseDFS("profile-use-dfs",
 namespace llvm {
 namespace bolt {
 
+void CallGraphMatcher::addBFCGEdges(BinaryContext &BC,
+yaml::bolt::BinaryProfile &YamlBP,
+BinaryFunction *BF) {
+  for (const BinaryBasicBlock &BB : BF->blocks()) {
+for (const MCInst &Instr : BB) {
+  if (!BC.MIB->isCall(Instr))
+continue;
+  const MCSymbol *CallSymbol = BC.MIB->getTargetSymbol(Instr);
+  if (!CallSymbol)
+continue;
+  BinaryData *BD = BC.getBinaryDataByName(CallSymbol->getName());
+  if (!BD)
+continue;
+  BinaryFunction *CalleeBF = BC.getFunctionForSymbol(BD->getSymbol());

shawbyoung wrote:

I believe it's possible to have calls to function pointers in the profile as well as in the binary.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [tsan] Replace ALIGNED with alignas (PR #98959)

2024-07-15 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay created 
https://github.com/llvm/llvm-project/pull/98959

Similar to #98958.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [tsan] Replace ALIGNED with alignas (PR #98959)

2024-07-15 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Fangrui Song (MaskRay)


Changes

Similar to #98958.


---
Full diff: https://github.com/llvm/llvm-project/pull/98959.diff


9 Files Affected:

- (modified) compiler-rt/lib/tsan/rtl/tsan_defs.h (+1-1) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp (+2-2) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp (+1-1) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_mman.cpp (+2-2) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp (+1-1) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_rtl.cpp (+4-5) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_rtl.h (+4-4) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_suppressions.cpp (+1-1) 
- (modified) compiler-rt/lib/tsan/rtl/tsan_vector_clock.h (+1-1) 


``diff
diff --git a/compiler-rt/lib/tsan/rtl/tsan_defs.h 
b/compiler-rt/lib/tsan/rtl/tsan_defs.h
index 1ffa3d6aec40b..270d441dc90b7 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_defs.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_defs.h
@@ -30,7 +30,7 @@
 #  define __MM_MALLOC_H
 #  include 
 #  include 
-#  define VECTOR_ALIGNED ALIGNED(16)
+#  define VECTOR_ALIGNED alignas(16)
 typedef __m128i m128;
 #else
 #  define VECTOR_ALIGNED
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp 
b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
index 034ae3d322b56..9cab2a3727128 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp
@@ -208,7 +208,7 @@ struct AtExitCtx {
 struct InterceptorContext {
   // The object is 64-byte aligned, because we want hot data to be located
   // in a single cache line if possible (it's accessed in every interceptor).
-  ALIGNED(64) LibIgnore libignore;
+  alignas(64) LibIgnore libignore;
   __sanitizer_sigaction sigactions[kSigCount];
 #if !SANITIZER_APPLE && !SANITIZER_NETBSD
   unsigned finalize_key;
@@ -220,7 +220,7 @@ struct InterceptorContext {
   InterceptorContext() : libignore(LINKER_INITIALIZED), 
atexit_mu(MutexTypeAtExit), AtExitStack() {}
 };
 
-static ALIGNED(64) char interceptor_placeholder[sizeof(InterceptorContext)];
+alignas(64) static char interceptor_placeholder[sizeof(InterceptorContext)];
 InterceptorContext *interceptor_ctx() {
   return reinterpret_cast(&interceptor_placeholder[0]);
 }
diff --git a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp 
b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
index 5154662034c56..befd6a369026d 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp
@@ -76,7 +76,7 @@ struct DynamicAnnContext {
 };
 
 static DynamicAnnContext *dyn_ann_ctx;
-static char dyn_ann_ctx_placeholder[sizeof(DynamicAnnContext)] ALIGNED(64);
+alignas(64) static char dyn_ann_ctx_placeholder[sizeof(DynamicAnnContext)];
 
 static void AddExpectRace(ExpectRace *list,
 char *f, int l, uptr addr, uptr size, char *desc) {
diff --git a/compiler-rt/lib/tsan/rtl/tsan_mman.cpp 
b/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
index e129e9af272f5..0705365d77427 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_mman.cpp
@@ -54,7 +54,7 @@ struct MapUnmapCallback {
   }
 };
 
-static char allocator_placeholder[sizeof(Allocator)] ALIGNED(64);
+alignas(64) static char allocator_placeholder[sizeof(Allocator)];
 Allocator *allocator() {
   return reinterpret_cast(&allocator_placeholder);
 }
@@ -75,7 +75,7 @@ struct GlobalProc {
 internal_alloc_mtx(MutexTypeInternalAlloc) {}
 };
 
-static char global_proc_placeholder[sizeof(GlobalProc)] ALIGNED(64);
+alignas(64) static char global_proc_placeholder[sizeof(GlobalProc)];
 GlobalProc *global_proc() {
   return reinterpret_cast(&global_proc_placeholder);
 }
diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp 
b/compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp
index 07d83e1a9a9ff..c8a66e60a69f1 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_platform_mac.cpp
@@ -46,7 +46,7 @@
 namespace __tsan {
 
 #if !SANITIZER_GO
-static char main_thread_state[sizeof(ThreadState)] ALIGNED(
+static char main_thread_state[sizeof(ThreadState)] alignas(
 SANITIZER_CACHE_LINE_SIZE);
 static ThreadState *dead_thread_state;
 static pthread_key_t thread_state_key;
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp 
b/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
index e5ebb65754b32..b5a4abb87ec0b 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl.cpp
@@ -48,12 +48,11 @@ int (*on_finalize)(int);
 #endif
 
 #if !SANITIZER_GO && !SANITIZER_APPLE
-__attribute__((tls_model("initial-exec")))
-THREADLOCAL char cur_thread_placeholder[sizeof(ThreadState)] ALIGNED(
-SANITIZER_CACHE_LINE_SIZE);
+alignas(SANITIZER_CACHE_LINE_SIZE) THREADLOCAL __attribute__((tls_model(
+"initial-exec"))) char cur_thread_placeholder[sizeof(ThreadState)];
 #endif
-static char ctx_placeholder[sizeof(Context)] 
ALIGNED(SA

[llvm-branch-commits] [tsan] Replace ALIGNED with alignas (PR #98959)

2024-07-15 Thread Thurston Dang via llvm-branch-commits

https://github.com/thurstond approved this pull request.


https://github.com/llvm/llvm-project/pull/98959
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov commented:

Thank you for working on this! It looks very good overall; I left a couple of comments inline. Please run this new matching on a large binary to answer questions about runtime and matching quality in ambiguous cases.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits


@@ -568,12 +675,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
   }
   YamlProfileToFunction.resize(YamlBP.Functions.size() + 1);
 
-  // Computes hash for binary functions.
+  // Map profiled function ids to names.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
+IdToYamLBF[YamlBF.Id] = &YamlBF;
+
+  // Creates a vector of lamdas that preprocess binary functions for function
+  // matching to avoid multiple preprocessing passes over binary functions in
+  // different function matching techniques.
+  std::vector> BFPreprocessingFuncs;

aaupov wrote:

Let's move it into a separate NFC change.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits


@@ -16,6 +16,37 @@
 namespace llvm {
 namespace bolt {
 
+/// A class for matching binary functions in functions in the YAML profile.
+struct CallGraphMatcher {

aaupov wrote:

Let's make it a proper class

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Match functions with call graph (PR #98125)

2024-07-15 Thread Amir Ayupov via llvm-branch-commits


@@ -446,6 +503,56 @@ size_t YAMLProfileReader::matchWithLTOCommonName() {
   return MatchedWithLTOCommonName;
 }
 
+size_t YAMLProfileReader::matchWithCallGraph(BinaryContext &BC) {
+  if (!opts::MatchWithCallGraph)
+return 0;
+
+  size_t MatchedWithCallGraph = 0;
+  CGMatcher.computeBFNeighborHashes(BC);
+  CGMatcher.constructYAMLFCG(YamlBP, IdToYamLBF);
+
+  // Matches YAMLBF to BFs with neighbor hashes.
+  for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions) {
+if (YamlBF.Used)
+  continue;
+auto It = CGMatcher.YamlBFAdjacencyMap.find(&YamlBF);
+if (It == CGMatcher.YamlBFAdjacencyMap.end())
+  continue;
+// Computes profiled function's neighbor hash.
+std::set &AdjacentFunctions =
+It->second;
+std::string AdjacentFunctionHashStr;
+for (auto &AdjacentFunction : AdjacentFunctions) {
+  AdjacentFunctionHashStr += AdjacentFunction->Name;
+}
+uint64_t Hash = std::hash{}(AdjacentFunctionHashStr);
+auto NeighborHashToBFsIt = CGMatcher.NeighborHashToBFs.find(Hash);
+if (NeighborHashToBFsIt == CGMatcher.NeighborHashToBFs.end())
+  continue;
+// Finds the binary function with the closest block size to the profiled

aaupov wrote:

Two concerns here:
- Worst-case runtime: please check the size of the largest NeighborHashToBFs bucket in a large binary.
- Closest block count may not find the best match; we'll need to look at a couple of examples from a real binary + stale profile to find a good proxy.

https://github.com/llvm/llvm-project/pull/98125
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits