date:20240912

[llvm-branch-commits] [llvm] ecd542d - Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" (#108054)…"

2024-09-12 Thread via llvm-branch-commits


Author: Diana Picus
Date: 2024-09-12T09:51:27+02:00
New Revision: ecd542d0e8ee3a37e979ff761ab3c633bcda5baf

URL: 
https://github.com/llvm/llvm-project/commit/ecd542d0e8ee3a37e979ff761ab3c633bcda5baf
DIFF: 
https://github.com/llvm/llvm-project/commit/ecd542d0e8ee3a37e979ff761ab3c633bcda5baf.diff

LOG: Revert "Reland "[amdgpu] Add llvm.amdgcn.init.whole.wave intrinsic" 
(#108054)…"

This reverts commit 703ebca869e1e684147d316b7bdb15437c12206a.

Added: 


Modified: 
llvm/include/llvm/IR/IntrinsicsAMDGPU.td
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h
llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
llvm/lib/Target/AMDGPU/SIInstructions.td
llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.h
llvm/lib/Target/AMDGPU/SIWholeQuadMode.cpp
llvm/test/CodeGen/AMDGPU/pei-amdgpu-cs-chain.mir
llvm/test/CodeGen/MIR/AMDGPU/long-branch-reg-all-sgpr-used.ll
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-after-pei.ll
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg-debug.ll
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-long-branch-reg.ll
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info.ll

Removed: 
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.init.whole.wave-w32.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.init.whole.wave-w64.ll
llvm/test/CodeGen/AMDGPU/si-init-whole-wave.mir



diff  --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td 
b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 4cd32a0502c66d..e20c26eb837875 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -208,16 +208,6 @@ def int_amdgcn_init_exec_from_input : Intrinsic<[],
   [IntrConvergent, IntrHasSideEffects, IntrNoMem, IntrNoCallback,
IntrNoFree, IntrWillReturn, ImmArg>]>;
 
-// Sets the function into whole-wave-mode and returns whether the lane was
-// active when entering the function. A branch depending on this return will
-// revert the EXEC mask to what it was when entering the function, thus
-// resulting in a no-op. This pattern is used to optimize branches when function
-// tails need to be run in whole-wave-mode. It may also have other consequences
-// (mostly related to WWM CSR handling) that differentiate it from using
-// a plain `amdgcn.init.exec -1`.
-def int_amdgcn_init_whole_wave : Intrinsic<[llvm_i1_ty], [], [
-IntrHasSideEffects, IntrNoMem, IntrConvergent]>;
-
 def int_amdgcn_wavefrontsize :
   ClangBuiltin<"__builtin_amdgcn_wavefrontsize">,
   DefaultAttrsIntrinsic<[llvm_i32_ty], [], [NoUndef, IntrNoMem, 
IntrSpeculatable]>;

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index 380dc7d3312f32..0daaf6b6576030 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -2738,11 +2738,6 @@ void AMDGPUDAGToDAGISel::SelectINTRINSIC_W_CHAIN(SDNode 
*N) {
   case Intrinsic::amdgcn_ds_bvh_stack_rtn:
 SelectDSBvhStackIntrinsic(N);
 return;
-  case Intrinsic::amdgcn_init_whole_wave:
-CurDAG->getMachineFunction()
-.getInfo<SIMachineFunctionInfo>()
-->setInitWholeWave();
-break;
   }
 
   SelectCode(N);

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 53085d423cefb8..4dfd3f087c1ae4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -1772,14 +1772,6 @@ bool 
AMDGPUInstructionSelector::selectDSAppendConsume(MachineInstr &MI,
   return constrainSelectedInstRegOperands(*MIB, TII, TRI, RBI);
 }
 
-bool AMDGPUInstructionSelector::selectInitWholeWave(MachineInstr &MI) const {
-  MachineFunction *MF = MI.getParent()->getParent();
-  SIMachineFunctionInfo *MFInfo = MF->getInfo<SIMachineFunctionInfo>();
-
-  MFInfo->setInitWholeWave();
-  return selectImpl(MI, *CoverageInfo);
-}
-
 bool AMDGPUInstructionSelector::selectSBarrier(MachineInstr &MI) const {
   if (TM.getOptLevel() > CodeGenOptLevel::None) {
 unsigned WGSize = STI.getFlatWorkGroupSizes(MF->getFunction()).second;
@@ -2107,8 +2099,6 @@ bool 
AMDGPUInstructionSelector::selectG_INTRINSIC_W_SIDE_EFFECTS(
 return selectDSAppendConsume(I, true);
   case Intrinsic::amdgcn_ds_consume:
 return selectDSAppendConsume(I, false);
-  case Intrinsic::amdgcn_init_whole_wave:
-return selectInitWholeWave(I);
   case Intrinsic::amdgcn_s_barrier:
 return selectSBarrier(I);
   case Intrinsic::amdgcn_raw_buffer_load_lds:

diff  --git a/llvm/lib/Target/AMDGPU/AMDGP

[llvm-branch-commits] Test (PR #108349)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


https://github.com/vitalybuka created 
https://github.com/llvm/llvm-project/pull/108349

None


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


https://github.com/vitalybuka created 
https://github.com/llvm/llvm-project/pull/108348

And rename it into __sanitizer_get_dtls_size.

The test will be in a separate patch, as I
expected reverts of the test.




[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)

2024-09-12 Thread Patryk Wychowaniec via llvm-branch-commits


Patryk27 wrote:

@benshi001 / @aykevl, is there something we can do to push this forward, or is 
it waiting for someone else?

https://github.com/llvm/llvm-project/pull/106993

[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread via llvm-branch-commits


https://github.com/cor3ntin updated 
https://github.com/llvm/llvm-project/pull/107214

>From 8290ce0998788b6a575ed7b4988b093f48c25b3d Mon Sep 17 00:00:00 2001
From: cor3ntin 
Date: Tue, 3 Sep 2024 20:36:15 +0200
Subject: [PATCH] [Clang] Fix handling of placeholder variables name in init
 captures (#107055)

We were incorrectly not deduplicating results when looking up `_` which,
for a lambda init capture, would result in an ambiguous lookup.

The same bug caused some diagnostic notes to be emitted twice.

Fixes #107024
---
 clang/docs/ReleaseNotes.rst   | 1 +
 clang/lib/Sema/SemaLambda.cpp | 1 -
 clang/lib/Sema/SemaLookup.cpp | 2 +-
 clang/test/SemaCXX/cxx2c-placeholder-vars.cpp | 6 --
 4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 53d819c6c44574..8c7a6ba70acd28 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1122,6 +1122,7 @@ Bug Fixes to C++ Support
 - Fixed a crash-on-invalid bug involving extraneous template parameter with 
concept substitution. (#GH73885)
 - Fixed assertion failure by skipping the analysis of an invalid field 
declaration. (#GH99868)
 - Fix an issue with dependent source location expressions (#GH106428), 
(#GH81155), (#GH80210), (#GH85373)
+- Fix handling of ``_`` as the name of a lambda's init capture variable. 
(#GH107024)
 
 
 Bug Fixes to AST Handling
diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp
index 601077e9f3334d..809b94bb7412b9 100644
--- a/clang/lib/Sema/SemaLambda.cpp
+++ b/clang/lib/Sema/SemaLambda.cpp
@@ -1318,7 +1318,6 @@ void 
Sema::ActOnLambdaExpressionAfterIntroducer(LambdaIntroducer &Intro,
 
 if (C->Init.isUsable()) {
   addInitCapture(LSI, cast<VarDecl>(Var), C->Kind == LCK_ByRef);
-  PushOnScopeChains(Var, CurScope, false);
 } else {
   TryCaptureKind Kind = C->Kind == LCK_ByRef ? TryCapture_ExplicitByRef
  : TryCapture_ExplicitByVal;
diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp
index 7a6a64529f52ec..d3d4bf27ae7283 100644
--- a/clang/lib/Sema/SemaLookup.cpp
+++ b/clang/lib/Sema/SemaLookup.cpp
@@ -570,7 +570,7 @@ void LookupResult::resolveKind() {
 
 // For non-type declarations, check for a prior lookup result naming this
 // canonical declaration.
-if (!D->isPlaceholderVar(getSema().getLangOpts()) && !ExistingI) {
+if (!ExistingI) {
   auto UniqueResult = Unique.insert(std::make_pair(D, I));
   if (!UniqueResult.second) {
 // We've seen this entity before.
diff --git a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp 
b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
index 5cf66b48784e91..29ca3b5ef3df72 100644
--- a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
+++ b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
@@ -50,14 +50,16 @@ void f() {
 
 void lambda() {
 (void)[_ = 0, _ = 1] { // expected-warning {{placeholder variables are 
incompatible with C++ standards before C++2c}} \
-   // expected-note 4{{placeholder declared here}}
+   // expected-note 2{{placeholder declared here}}
 (void)_++; // expected-error {{ambiguous reference to placeholder '_', 
which is defined multiple times}}
 };
 
 {
 int _ = 12;
-(void)[_ = 0]{}; // no warning (different scope)
+(void)[_ = 0]{ return _;}; // no warning (different scope)
 }
+
+auto GH107024 = [_ = 42]() { return _; }();
 }
 
 namespace global_var {

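For reference, the pattern behind #107024 can be reduced to a lambda whose init capture is named `_`. The sketch below is a hypothetical reduction, not the test from the patch; the function names `singlePlaceholderCapture` and `shadowedPlaceholderCapture` are made up for illustration. With a single capture named `_` there is only one declaration to find, so the lookup should not be ambiguous.

```cpp
#include <cassert>

// Hypothetical reduction: a single init capture named `_`.
// Only one declaration named `_` is visible inside the lambda body,
// so `return _;` should resolve to the capture.
int singlePlaceholderCapture() {
  return [_ = 42]() { return _; }();
}

// An outer `_` lives in a different scope than the init capture,
// so the lambda body should still see only the captured `_`.
int shadowedPlaceholderCapture() {
  int _ = 12;
  (void)_; // silence unused-variable warnings for the outer `_`
  return [_ = 0] { return _; }();
}
```

Two captures with the same name, as in `[_ = 0, _ = 1]`, remain a different case: there, a use of `_` in the body is expected to be diagnosed as ambiguous.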

[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread via llvm-branch-commits


https://github.com/cor3ntin closed 
https://github.com/llvm/llvm-project/pull/107214

[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread via llvm-branch-commits


cor3ntin wrote:

@tru done

https://github.com/llvm/llvm-project/pull/107214

[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread via llvm-branch-commits


https://github.com/cor3ntin reopened 
https://github.com/llvm/llvm-project/pull/107214

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)

2024-09-12 Thread Matthias Springer via llvm-branch-commits


https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/108359

The dialect conversion maintains a set of unresolved materializations 
(`UnrealizedConversionCastOp`). Turn that set into a `DenseMap` that maps from 
ops to `UnresolvedMaterializationRewrite *`. This improves efficiency a bit, 
because an iteration over `ConversionPatternRewriterImpl::rewrites` can be 
avoided.

Also delete some dead code.
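
The caching idea described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the MLIR code: `Op`, `RewriterState` and `MaterializationRewrite` are hypothetical stand-ins, and `std::unordered_map` is used in place of `DenseMap` so that the snippet is self-contained. The point is that each rewrite object registers itself in a map when it is constructed, so finding the rewrite for an op no longer needs a scan over the whole rewrite list.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an operation; not an MLIR class.
struct Op {
  int id;
};

struct RewriterState;

// Hypothetical stand-in for a rewrite object attached to one op.
struct MaterializationRewrite {
  MaterializationRewrite(RewriterState &state, Op *op);
  Op *op;
};

struct RewriterState {
  // All rewrites, in creation order (owning).
  std::vector<std::unique_ptr<MaterializationRewrite>> rewrites;
  // Cache: op -> its rewrite object (non-owning).
  std::unordered_map<Op *, MaterializationRewrite *> unresolved;

  void append(Op *op) {
    rewrites.push_back(std::make_unique<MaterializationRewrite>(*this, op));
  }

  // Hash lookup instead of a linear scan over `rewrites`.
  MaterializationRewrite *lookup(Op *op) const {
    auto it = unresolved.find(op);
    return it == unresolved.end() ? nullptr : it->second;
  }
};

// The constructor registers the new object in the cache.
inline MaterializationRewrite::MaterializationRewrite(RewriterState &state,
                                                      Op *op)
    : op(op) {
  state.unresolved[op] = this;
}
```

Registering in the constructor keeps the cache and the rewrite list in step without a separate insert call at every creation site; the trade-off is that the constructor needs the full definition of the state object, which is why it is defined out of line here.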

>From a9c69d1733662b3299bd3f4d41982422640dc034 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Thu, 12 Sep 2024 12:45:44 +0200
Subject: [PATCH] [mlir][Transforms][NFC] Dialect conversion: Cache
 `UnresolvedMaterializationRewrite`

The dialect conversion already maintains a set of unresolved materializations 
(`UnrealizedConversionCastOp`). Turn that set into a map that maps from ops to 
`UnresolvedMaterializationRewrite *`. This improves efficiency a bit, because 
an iteration over `ConversionPatternRewriterImpl::rewrites` can be avoided.

Also delete some dead code.
---
 .../Transforms/Utils/DialectConversion.cpp| 60 +++
 1 file changed, 20 insertions(+), 40 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index b58a95c3baf70a..ed15b571f01883 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -688,9 +688,7 @@ class UnresolvedMaterializationRewrite : public 
OperationRewrite {
   UnresolvedMaterializationRewrite(
   ConversionPatternRewriterImpl &rewriterImpl,
   UnrealizedConversionCastOp op, const TypeConverter *converter = nullptr,
-  MaterializationKind kind = MaterializationKind::Target)
-  : OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op),
-converterAndKind(converter, kind) {}
+  MaterializationKind kind = MaterializationKind::Target);
 
   static bool classof(const IRRewrite *rewrite) {
 return rewrite->getKind() == Kind::UnresolvedMaterialization;
@@ -730,26 +728,6 @@ static bool hasRewrite(R &&rewrites, Operation *op) {
   });
 }
 
-/// Find the single rewrite object of the specified type and block among the
-/// given rewrites. In debug mode, asserts that there is mo more than one such
-/// object. Return "nullptr" if no object was found.
-template <typename RewriteTy, typename R>
-static RewriteTy *findSingleRewrite(R &&rewrites, Block *block) {
-  RewriteTy *result = nullptr;
-  for (auto &rewrite : rewrites) {
-auto *rewriteTy = dyn_cast<RewriteTy>(rewrite.get());
-if (rewriteTy && rewriteTy->getBlock() == block) {
-#ifndef NDEBUG
-  assert(!result && "expected single matching rewrite");
-  result = rewriteTy;
-#else
-  return rewriteTy;
-#endif // NDEBUG
-}
-  }
-  return result;
-}
-
 
//===--===//
 // ConversionPatternRewriterImpl
 
//===--===//
@@ -892,10 +870,6 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
 
 bool wasErased(void *ptr) const { return erased.contains(ptr); }
 
-bool wasErased(OperationRewrite *rewrite) const {
-  return wasErased(rewrite->getOperation());
-}
-
 void notifyOperationErased(Operation *op) override { erased.insert(op); }
 
 void notifyBlockErased(Block *block) override { erased.insert(block); }
@@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
   /// to modify/access them is invalid rewriter API usage.
   SetVector replacedOps;
 
-  /// A set of all unresolved materializations.
-  DenseSet unresolvedMaterializations;
+  /// A mapping of all unresolved materializations (UnrealizedConversionCastOp)
+  /// to the corresponding rewrite objects.
+  DenseMap
+  unresolvedMaterializations;
 
   /// The current type converter, or nullptr if no type converter is currently
   /// active.
@@ -1058,6 +1034,14 @@ void CreateOperationRewrite::rollback() {
   op->erase();
 }
 
+UnresolvedMaterializationRewrite::UnresolvedMaterializationRewrite(
+ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op,
+const TypeConverter *converter, MaterializationKind kind)
+: OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op),
+  converterAndKind(converter, kind) {
+  rewriterImpl.unresolvedMaterializations[op] = this;
+}
+
 void UnresolvedMaterializationRewrite::rollback() {
   if (getMaterializationKind() == MaterializationKind::Target) {
 for (Value input : op->getOperands())
@@ -1345,7 +1329,6 @@ Value 
ConversionPatternRewriterImpl::buildUnresolvedMaterialization(
   builder.setInsertionPoint(ip.getBlock(), ip.getPoint());
   auto convertOp =
   builder.create(loc, outputType, inputs);
-  unresolvedMaterializations.insert(convertOp);
   appendRewrite(convertOp, converter, kind);
   return convertOp.getResult(0);
 }
@@ -2499,15 +2482,12 @@ LogicalResult 
Operati

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-mlir-core

Author: Matthias Springer (matthias-springer)


Changes

The dialect conversion maintains a set of unresolved materializations 
(`UnrealizedConversionCastOp`). Turn that set into a `DenseMap` that maps from 
ops to `UnresolvedMaterializationRewrite *`. This improves efficiency a bit, 
because an iteration over `ConversionPatternRewriterImpl::rewrites` can be 
avoided.

Also delete some dead code.

---
Full diff: https://github.com/llvm/llvm-project/pull/108359.diff


1 Files Affected:

- (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+20-40) 


``diff
diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp 
b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index b58a95c3baf70a..ed15b571f01883 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -688,9 +688,7 @@ class UnresolvedMaterializationRewrite : public 
OperationRewrite {
   UnresolvedMaterializationRewrite(
   ConversionPatternRewriterImpl &rewriterImpl,
   UnrealizedConversionCastOp op, const TypeConverter *converter = nullptr,
-  MaterializationKind kind = MaterializationKind::Target)
-  : OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op),
-converterAndKind(converter, kind) {}
+  MaterializationKind kind = MaterializationKind::Target);
 
   static bool classof(const IRRewrite *rewrite) {
 return rewrite->getKind() == Kind::UnresolvedMaterialization;
@@ -730,26 +728,6 @@ static bool hasRewrite(R &&rewrites, Operation *op) {
   });
 }
 
-/// Find the single rewrite object of the specified type and block among the
-/// given rewrites. In debug mode, asserts that there is mo more than one such
-/// object. Return "nullptr" if no object was found.
-template <typename RewriteTy, typename R>
-static RewriteTy *findSingleRewrite(R &&rewrites, Block *block) {
-  RewriteTy *result = nullptr;
-  for (auto &rewrite : rewrites) {
-auto *rewriteTy = dyn_cast<RewriteTy>(rewrite.get());
-if (rewriteTy && rewriteTy->getBlock() == block) {
-#ifndef NDEBUG
-  assert(!result && "expected single matching rewrite");
-  result = rewriteTy;
-#else
-  return rewriteTy;
-#endif // NDEBUG
-}
-  }
-  return result;
-}
-
 
//===--===//
 // ConversionPatternRewriterImpl
 
//===--===//
@@ -892,10 +870,6 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
 
 bool wasErased(void *ptr) const { return erased.contains(ptr); }
 
-bool wasErased(OperationRewrite *rewrite) const {
-  return wasErased(rewrite->getOperation());
-}
-
 void notifyOperationErased(Operation *op) override { erased.insert(op); }
 
 void notifyBlockErased(Block *block) override { erased.insert(block); }
@@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
   /// to modify/access them is invalid rewriter API usage.
   SetVector replacedOps;
 
-  /// A set of all unresolved materializations.
-  DenseSet unresolvedMaterializations;
+  /// A mapping of all unresolved materializations (UnrealizedConversionCastOp)
+  /// to the corresponding rewrite objects.
+  DenseMap
+  unresolvedMaterializations;
 
   /// The current type converter, or nullptr if no type converter is currently
   /// active.
@@ -1058,6 +1034,14 @@ void CreateOperationRewrite::rollback() {
   op->erase();
 }
 
+UnresolvedMaterializationRewrite::UnresolvedMaterializationRewrite(
+ConversionPatternRewriterImpl &rewriterImpl, UnrealizedConversionCastOp op,
+const TypeConverter *converter, MaterializationKind kind)
+: OperationRewrite(Kind::UnresolvedMaterialization, rewriterImpl, op),
+  converterAndKind(converter, kind) {
+  rewriterImpl.unresolvedMaterializations[op] = this;
+}
+
 void UnresolvedMaterializationRewrite::rollback() {
   if (getMaterializationKind() == MaterializationKind::Target) {
 for (Value input : op->getOperands())
@@ -1345,7 +1329,6 @@ Value 
ConversionPatternRewriterImpl::buildUnresolvedMaterialization(
   builder.setInsertionPoint(ip.getBlock(), ip.getPoint());
   auto convertOp =
   builder.create(loc, outputType, inputs);
-  unresolvedMaterializations.insert(convertOp);
   appendRewrite(convertOp, converter, kind);
   return convertOp.getResult(0);
 }
@@ -2499,15 +2482,12 @@ LogicalResult 
OperationConverter::convertOperations(ArrayRef ops) {
 
   // Gather all unresolved materializations.
   SmallVector allCastOps;
-  DenseMap rewriteMap;
-  for (std::unique_ptr &rewrite : rewriterImpl.rewrites) {
-auto *mat = dyn_cast(rewrite.get());
-if (!mat)
-  continue;
-if (rewriterImpl.eraseRewriter.wasErased(mat))
+  const DenseMap
+  &materializations = rewriterImpl.unresolvedMaterializations;
+  for (auto it : materializations) {
+if (rewriterImpl.eraseRewriter.wasErased(it.first))

[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/95387

>From 8099113d68bd7c47c29f635bb10a048ddb99833b Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Fri, 28 Jun 2024 16:12:31 +
Subject: [PATCH 1/2] [TySan] Fixed false positive when accessing global
 object's member variables

---
 compiler-rt/lib/tysan/tysan.cpp   | 19 +++-
 .../test/tysan/global-struct-members.c| 31 +++
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 compiler-rt/test/tysan/global-struct-members.c

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..8235b0ec2b55e7 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+// When shadow memory is set for global objects, the entire object is tagged with the struct type
+// This means that when you access a member variable, tysan reads that as you accessing a struct midway
+// through, with 'i' being the offset
+// Therefore, if you are accessing a struct, we need to find the member type. We can go through the
+// members of the struct type and see if there is a member at the offset you are accessing the struct by.
+// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality
+// with that type. If there isn't, we run alias checking on the struct with will give us the correct error.
+tysan_type_descriptor *InternalMember = OldTD;
+if (OldTD->Tag == TYSAN_STRUCT_TD) {
+  for (int j = 0; j < OldTD->Struct.MemberCount; j++) {
+if (OldTD->Struct.Members[j].Offset == i) {
+  InternalMember = OldTD->Struct.Members[j].Type;
+  break;
+}
+  }
+}
+
+if (!isAliasingLegal(td, InternalMember))
   reportError(addr, size, td, OldTD, AccessStr,
   "accesses part of an existing object", -i, pc, bp, sp);
 
diff --git a/compiler-rt/test/tysan/global-struct-members.c 
b/compiler-rt/test/tysan/global-struct-members.c
new file mode 100644
index 00..76ea3c431dd7bc
--- /dev/null
+++ b/compiler-rt/test/tysan/global-struct-members.c
@@ -0,0 +1,31 @@
+// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include <stdio.h>
+
+struct X {
+  int a, b, c;
+} x;
+
+static struct X xArray[2];
+
+int main() {
+  x.a = 1;
+  x.b = 2;
+  x.c = 3;
+
+  printf("%d %d %d\n", x.a, x.b, x.c);
+  // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation
+
+  for (size_t i = 0; i < 2; i++) {
+xArray[i].a = 1;
+xArray[i].b = 1;
+xArray[i].c = 1;
+  }
+
+  struct X *xPtr = (struct X *)&(xArray[0].c);
+  xPtr->a = 1;
+  // CHECK: ERROR: TypeSanitizer: type-aliasing-violation
+  // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) 
accesses an existing object of type int (in X at offset 8)
+  // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]]
+}

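The member lookup described in the comment of the first patch can be sketched on its own. This is a simplified illustration, not the TySan runtime: `TypeDesc`, `Member` and `resolveAccessedType` are hypothetical names, and the real `tysan_type_descriptor` layout differs. The sketch assumes the same rule as the comment: if the old descriptor is a struct and one of its members starts exactly at the access offset, check aliasing against that member's type, otherwise fall back to the struct itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified type descriptors for illustration only.
struct TypeDesc;

struct Member {
  std::size_t offset;   // byte offset of the member inside the struct
  const TypeDesc *type; // descriptor of the member's type
};

struct TypeDesc {
  bool isStruct;
  std::vector<Member> members; // empty for non-struct types
};

// Returns the type that an access at `offset` into `outer` should be
// checked against: the member starting at exactly that offset if there
// is one, otherwise `outer` itself.
const TypeDesc *resolveAccessedType(const TypeDesc *outer,
                                    std::size_t offset) {
  if (!outer->isStruct)
    return outer;
  for (const Member &m : outer->members)
    if (m.offset == offset)
      return m.type;
  return outer;
}
```

For a struct with three `int` members at offsets 0, 4 and 8, an access at offset 4 resolves to the `int` descriptor, while an access at offset 2 matches no member and resolves to the struct descriptor, so the later aliasing check still reports the mismatch.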
>From 83a368867533e316b4272c19d0bf61da842c5b4b Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Thu, 12 Sep 2024 10:52:19 +
Subject: [PATCH 2/2] Fix more member offset bugs

---
 compiler-rt/lib/tysan/tysan.cpp   | 25 +--
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 47 insertions(+), 9 deletions(-)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index 8235b0ec2b55e7..abad429de7ed9b 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,8 +128,13 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
-  OffsetA -= TDA->Struct.Members[Idx].Offset;
-  TDA = TDA->Struct.Members[Idx].Type;
+  if (TDA->Struct.Members[Idx].Offset > OffsetA) {
+OffsetA = TDA->Struct.Members[Idx].Offset - OffsetA;
+TDA = TDA->Struct.Members[Idx - 1].Type;
+  } else {
+OffsetA -= TDA->Struct.Members[Idx].Offset;
+TDA = TDA->Struct.Members[Idx].Type;
+  }
 } else {
   DCHECK(0);
   break;
@@ -221,13 +226,15 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-// When shadow memory is set for global objects, the entire object is 
tagged with the struct type
-// This means that when you access a member variable, tysan reads that as 
you accessing a struct midway
-// through, with 'i' being the offset
-// Therefore, if you are accessing a struct, we need to find the member 
type. We can go through the
-// members of the struct type and see if there is a member at the offset 
you are accessing the struct by.
-// If there is indeed a m

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)

2024-09-12 Thread Mehdi Amini via llvm-branch-commits



@@ -935,8 +909,10 @@ struct ConversionPatternRewriterImpl : public 
RewriterBase::Listener {
   /// to modify/access them is invalid rewriter API usage.
   SetVector replacedOps;
 
-  /// A set of all unresolved materializations.
-  DenseSet unresolvedMaterializations;
+  /// A mapping of all unresolved materializations (UnrealizedConversionCastOp)
+  /// to the corresponding rewrite objects.
+  DenseMap

joker-eph wrote:

Can the key be directly `UnrealizedConversionCastOp` ?

https://github.com/llvm/llvm-project/pull/108359

[llvm-branch-commits] [mlir] [mlir][Transforms][NFC] Dialect conversion: Cache `UnresolvedMaterializationRewrite` (PR #108359)

2024-09-12 Thread Mehdi Amini via llvm-branch-commits


https://github.com/joker-eph approved this pull request.


https://github.com/llvm/llvm-project/pull/108359

[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/95387

>From 8099113d68bd7c47c29f635bb10a048ddb99833b Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Fri, 28 Jun 2024 16:12:31 +
Subject: [PATCH] [TySan] Fixed false positive when accessing global object's
 member variables

---
 compiler-rt/lib/tysan/tysan.cpp   | 19 +++-
 .../test/tysan/global-struct-members.c| 31 +++
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 compiler-rt/test/tysan/global-struct-members.c

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..8235b0ec2b55e7 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -221,7 +221,24 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+// When shadow memory is set for global objects, the entire object is tagged with the struct type
+// This means that when you access a member variable, tysan reads that as you accessing a struct midway
+// through, with 'i' being the offset
+// Therefore, if you are accessing a struct, we need to find the member type. We can go through the
+// members of the struct type and see if there is a member at the offset you are accessing the struct by.
+// If there is indeed a member starting at offset 'i' in the struct, we should check aliasing legality
+// with that type. If there isn't, we run alias checking on the struct with will give us the correct error.
+tysan_type_descriptor *InternalMember = OldTD;
+if (OldTD->Tag == TYSAN_STRUCT_TD) {
+  for (int j = 0; j < OldTD->Struct.MemberCount; j++) {
+if (OldTD->Struct.Members[j].Offset == i) {
+  InternalMember = OldTD->Struct.Members[j].Type;
+  break;
+}
+  }
+}
+
+if (!isAliasingLegal(td, InternalMember))
   reportError(addr, size, td, OldTD, AccessStr,
   "accesses part of an existing object", -i, pc, bp, sp);
 
diff --git a/compiler-rt/test/tysan/global-struct-members.c 
b/compiler-rt/test/tysan/global-struct-members.c
new file mode 100644
index 00..76ea3c431dd7bc
--- /dev/null
+++ b/compiler-rt/test/tysan/global-struct-members.c
@@ -0,0 +1,31 @@
+// RUN: %clang_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include <stdio.h>
+
+struct X {
+  int a, b, c;
+} x;
+
+static struct X xArray[2];
+
+int main() {
+  x.a = 1;
+  x.b = 2;
+  x.c = 3;
+
+  printf("%d %d %d\n", x.a, x.b, x.c);
+  // CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation
+
+  for (size_t i = 0; i < 2; i++) {
+xArray[i].a = 1;
+xArray[i].b = 1;
+xArray[i].c = 1;
+  }
+
+  struct X *xPtr = (struct X *)&(xArray[0].c);
+  xPtr->a = 1;
+  // CHECK: ERROR: TypeSanitizer: type-aliasing-violation
+  // CHECK: WRITE of size 4 at {{.*}} with type int (in X at offset 0) 
accesses an existing object of type int (in X at offset 8)
+  // CHECK: {{#0 0x.* in main .*struct-members.c:}}[[@LINE-3]]
+}


[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)

2024-09-12 Thread via llvm-branch-commits


aykevl wrote:

I think it's up to the release managers now to merge this PR.

https://github.com/llvm/llvm-project/pull/106993

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Matthias Springer via llvm-branch-commits


https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/108381

PR #106760 aligned the handling of dropped block arguments and dropped op 
results. The two helper functions that insert source materializations for uses 
of replaced block arguments / op results that survived the conversion are now 
almost identical (`legalizeConvertedArgumentTypes` and 
`legalizeConvertedOpResultTypes`). This PR merges the two functions and moves 
the implementation directly into `finalize`.

This PR simplifies the code base and improves the efficiency a bit: previously, 
`finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, 
only one iteration is needed.

>From 1f215ac7861a76f653c9911a31bf484a5fd6dac4 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Thu, 12 Sep 2024 14:49:23 +0200
Subject: [PATCH] [mlir][Transforms] Dialect conversion: Unify materialization
 of value replacements

PR #106760 aligned the handling of dropped block arguments and dropped op 
results. The two helper functions that insert source materializations for uses 
of replaced block arguments / op results that survived the conversion are now 
almost identical (`legalizeConvertedArgumentTypes` and 
`legalizeConvertedOpResultTypes`). This PR merges the two functions and moves 
the implementation directly into `finalize`.

This PR simplifies the code base and improves the efficiency a bit: previously, 
`finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, 
only one iteration is needed.
---
 .../Transforms/Utils/DialectConversion.cpp| 134 ++
 .../VectorToSPIRV/vector-to-spirv.mlir|   4 +-
 2 files changed, 44 insertions(+), 94 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index ed15b571f01883..0556b4ab833c30 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -2336,17 +2336,6 @@ struct OperationConverter {
   /// remaining artifacts and complete the conversion.
   LogicalResult finalize(ConversionPatternRewriter &rewriter);
 
-  /// Legalize the types of converted block arguments.
-  LogicalResult
-  legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter,
- ConversionPatternRewriterImpl &rewriterImpl);
-
-  /// Legalize the types of converted op results.
-  LogicalResult legalizeConvertedOpResultTypes(
-  ConversionPatternRewriter &rewriter,
-  ConversionPatternRewriterImpl &rewriterImpl,
-  DenseMap> &inverseMapping);
-
   /// Dialect conversion configuration.
   ConversionConfig config;
 
@@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) {
   return success();
 }
 
-LogicalResult
-OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
-  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
-  if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl)))
-return failure();
-  DenseMap> inverseMapping =
-  rewriterImpl.mapping.getInverse();
-  if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl,
-inverseMapping)))
-return failure();
-  return success();
-}
-
 /// Finds a user of the given value, or of any other value that the given value
 /// replaced, that was not replaced in the conversion process.
 static Operation *findLiveUserOfReplaced(
@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-ConversionPatternRewriter &rewriter,
-ConversionPatternRewriterImpl &rewriterImpl,
-DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-auto *opReplacement =
-dyn_cast(rewriterImpl.rewrites[i].get());
-if (!opReplacement)
-  continue;
-Operation *op = opReplacement->getOperation();
-for (OpResult result : op->getResults()) {
-  // If the type of this op result changed and the result is still live,
-  // we need to materialize a conversion.
-  if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+return std::make_pair(opRewrite->getOperation()->getResults(),
+  opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+  blockRewri

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:



@llvm/pr-subscribers-mlir-spirv
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-core

Author: Matthias Springer (matthias-springer)


Changes

PR #106760 aligned the handling of dropped block arguments and dropped 
op results. The two helper functions that insert source materializations for 
uses of replaced block arguments / op results that survived the conversion are 
now almost identical (`legalizeConvertedArgumentTypes` and 
`legalizeConvertedOpResultTypes`). This PR merges the two functions and moves 
the implementation directly into `finalize`.

This PR simplifies the code base and improves the efficiency a bit: previously, 
`finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, 
only one iteration is needed.

---
Full diff: https://github.com/llvm/llvm-project/pull/108381.diff


2 Files Affected:

- (modified) mlir/lib/Transforms/Utils/DialectConversion.cpp (+42-92) 
- (modified) mlir/test/Conversion/VectorToSPIRV/vector-to-spirv.mlir (+2-2) 


diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index ed15b571f01883..0556b4ab833c30 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -2336,17 +2336,6 @@ struct OperationConverter {
   /// remaining artifacts and complete the conversion.
   LogicalResult finalize(ConversionPatternRewriter &rewriter);
 
-  /// Legalize the types of converted block arguments.
-  LogicalResult
-  legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter,
- ConversionPatternRewriterImpl &rewriterImpl);
-
-  /// Legalize the types of converted op results.
-  LogicalResult legalizeConvertedOpResultTypes(
-  ConversionPatternRewriter &rewriter,
-  ConversionPatternRewriterImpl &rewriterImpl,
-  DenseMap> &inverseMapping);
-
   /// Dialect conversion configuration.
   ConversionConfig config;
 
@@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) {
   return success();
 }
 
-LogicalResult
-OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
-  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
-  if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl)))
-return failure();
-  DenseMap> inverseMapping =
-  rewriterImpl.mapping.getInverse();
-  if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl,
-inverseMapping)))
-return failure();
-  return success();
-}
-
 /// Finds a user of the given value, or of any other value that the given value
 /// replaced, that was not replaced in the conversion process.
 static Operation *findLiveUserOfReplaced(
@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-ConversionPatternRewriter &rewriter,
-ConversionPatternRewriterImpl &rewriterImpl,
-DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-auto *opReplacement =
-dyn_cast(rewriterImpl.rewrites[i].get());
-if (!opReplacement)
-  continue;
-Operation *op = opReplacement->getOperation();
-for (OpResult result : op->getResults()) {
-  // If the type of this op result changed and the result is still live,
-  // we need to materialize a conversion.
-  if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+return std::make_pair(opRewrite->getOperation()->getResults(),
+  opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+  blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);
+}
+
+LogicalResult
+OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
+  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
+  DenseMap> inverseMapping =
+  rewriterImpl.mapping.getInverse();
+
+  // Process requested value replacements.
+  for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) {
+ValueRange replacedValues;
+const TypeConverter *converter;
+std::tie(replacedValues, converter) =
+getReplacedValues(rewriterImpl.rewrites[i].get());
+for (Value originalValue : replacedValues) {
+  // If the type of this value changed and the value is st

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Mehdi Amini via llvm-branch-commits


https://github.com/joker-eph approved this pull request.


https://github.com/llvm/llvm-project/pull/108381

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Mehdi Amini via llvm-branch-commits



@@ -558,8 +558,8 @@ func.func @deinterleave(%a: vector<4xf32>) -> (vector<2xf32>, vector<2xf32>) {
 
 // CHECK-LABEL: func @deinterleave_scalar
 // CHECK-SAME: (%[[ARG0:.+]]: vector<2xf32>)
-//   CHECK: %[[EXTRACT0:.*]] = spirv.CompositeExtract %[[ARG0]][0 : i32] : vector<2xf32>
-//   CHECK: %[[EXTRACT1:.*]] = spirv.CompositeExtract %[[ARG0]][1 : i32] : vector<2xf32>
+//   CHECK-DAG: %[[EXTRACT0:.*]] = spirv.CompositeExtract %[[ARG0]][0 : i32] : vector<2xf32>
+//   CHECK-DAG: %[[EXTRACT1:.*]] = spirv.CompositeExtract %[[ARG0]][1 : i32] : vector<2xf32>

joker-eph wrote:

Can you just push this separately ahead as its own commit?

https://github.com/llvm/llvm-project/pull/108381

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN created 
https://github.com/llvm/llvm-project/pull/108385

Fixes issue #105960

If a member of a struct is itself a struct, accessing a member partway through 
this inner struct currently causes a false positive. When checking aliasing, 
the access offset is greater than the starting offset of the inner struct, so 
the loop advances one more iteration and concludes that we are accessing the 
member after the inner struct. 

The next member's offset is greater than the offset we are looking for, so when 
we subtract the next member's offset from what we are looking for, the offset 
underflows.

To fix this, we check if the member we think we are accessing has a greater 
offset than the offset we are looking for. If so, we take a step back. We 
cannot do this in the loop, since the loop does not check the final member. 
This means the penultimate member would still cause false positives.
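
To make the failure mode concrete, here is a minimal standalone sketch of the member lookup with the step back applied. It is not the TySan runtime code: `MemberInfo`, `Lookup` and `findContainingMember` are hypothetical names, and the loop shape (stop at the first member whose offset is not below the access offset, never test the final member) is an assumption based on the description above. The `idx > 0` guard is an addition of this sketch and is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one member entry of a struct type descriptor.
struct MemberInfo {
  std::size_t offset; // byte offset of the member within its parent struct
};

struct Lookup {
  std::size_t index;    // index of the member that contains the access
  std::size_t residual; // access offset relative to the start of that member
};

Lookup findContainingMember(const std::vector<MemberInfo> &members,
                            std::size_t accessOffset) {
  assert(!members.empty() && "struct must have at least one member");

  // Assumed loop shape: stop at the first member whose offset is not below
  // the access offset. The final member is never tested by the loop.
  std::size_t idx = 0;
  for (; idx + 1 < members.size(); ++idx)
    if (members[idx].offset >= accessOffset)
      break;

  // The step back: if the chosen member starts after the access, the access
  // lies partway through the previous member. Without this, the subtraction
  // below would wrap around because the offsets are unsigned.
  if (idx > 0 && members[idx].offset > accessOffset)
    --idx;

  return {idx, accessOffset - members[idx].offset};
}
```

With members at offsets 0 and 8 (matching `outer` in the test, assuming a 4-byte `int`), an access at offset 4 resolves to member 0 with a residual offset of 4, rather than to member 1 with a wrapped-around offset.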

From 2dffe46bc8af4ccd5627478ba9546647907104cc Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Thu, 12 Sep 2024 12:36:57 +
Subject: [PATCH] [TySan] Fix struct access with different bases

---
 compiler-rt/lib/tysan/tysan.cpp   |  4 +++
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 35 insertions(+)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
   OffsetA -= TDA->Struct.Members[Idx].Offset;
   TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..c1ef5f8669c280
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *list) {
+   list->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+   outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+   printf("%d\n", access_offsets_with_different_base);
+
+   return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


github-actions[bot] wrote:



Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this 
page.

If this is not working for you, it is probably because you do not have write 
permissions for the repository. In which case you can instead tag reviewers by 
name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a 
review by "ping"ing the PR by adding a comment “Ping”. The common courtesy 
"ping" rate is once a week. Please remember that you are asking for valuable 
time from other developers.

If you have further questions, they may be answered by the [LLVM GitHub User 
Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM 
Discord](https://discord.com/invite/xS7Z362) or on the 
[forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (gbMattN)


Changes

Fixes issue #105960

If a member of a struct is itself a struct, accessing a member partway through 
this inner struct currently causes a false positive. When checking aliasing, 
the access offset is greater than the starting offset of the inner struct, so 
the loop advances one more iteration and concludes that we are accessing the 
member after the inner struct. 

The next member's offset is greater than the offset we are looking for, so when 
we subtract the next member's offset from what we are looking for, the offset 
underflows.

To fix this, we check if the member we think we are accessing has a greater 
offset than the offset we are looking for. If so, we take a step back. We 
cannot do this in the loop, since the loop does not check the final member. 
This means the penultimate member would still cause false positives.

---
Full diff: https://github.com/llvm/llvm-project/pull/108385.diff


2 Files Affected:

- (modified) compiler-rt/lib/tysan/tysan.cpp (+4) 
- (added) compiler-rt/test/tysan/struct-offset-different-base.cpp (+31) 


diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
   OffsetA -= TDA->Struct.Members[Idx].Offset;
   TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..c1ef5f8669c280
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *list) {
+   list->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+   outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+   printf("%d\n", access_offsets_with_different_base);
+
+   return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation





https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/108385

From d312bd99486dc7fbff79b880026512e949e7d212 Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Thu, 12 Sep 2024 12:36:57 +
Subject: [PATCH] [TySan] Fix struct access with different bases

---
 compiler-rt/lib/tysan/tysan.cpp   |  4 +++
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 35 insertions(+)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
   OffsetA -= TDA->Struct.Members[Idx].Offset;
   TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..c1ef5f8669c280
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *list) {
+   list->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+   outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+   printf("%d\n", access_offsets_with_different_base);
+
+   return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


gbMattN wrote:

(Manually pinging potential reviewers) @tavianator @fhahn 

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/108385

From 91f560d69a6dd21cf177f8969422b478cd4e5f5e Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Thu, 12 Sep 2024 12:36:57 +
Subject: [PATCH] [TySan] Fix struct access with different bases

---
 compiler-rt/lib/tysan/tysan.cpp   |  4 +++
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 35 insertions(+)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
   OffsetA -= TDA->Struct.Members[Idx].Offset;
   TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..c091975c956d24
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *iPtr) {
+   iPtr->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+   outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+   printf("%d\n", access_offsets_with_different_base);
+
+   return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation


[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread via llvm-branch-commits


https://github.com/gbMattN updated 
https://github.com/llvm/llvm-project/pull/108385

From 65f1e1fce67bfc9ae60f83abe6d3b487a174c6b1 Mon Sep 17 00:00:00 2001
From: Matthew Nagy 
Date: Thu, 12 Sep 2024 12:36:57 +
Subject: [PATCH] [TySan] Fix struct access with different bases

---
 compiler-rt/lib/tysan/tysan.cpp   |  4 +++
 .../tysan/struct-offset-different-base.cpp| 31 +++
 2 files changed, 35 insertions(+)
 create mode 100644 compiler-rt/test/tysan/struct-offset-different-base.cpp

diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
index f627851d049e6a..f2cb6faddf45ac 100644
--- a/compiler-rt/lib/tysan/tysan.cpp
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+
   OffsetA -= TDA->Struct.Members[Idx].Offset;
   TDA = TDA->Struct.Members[Idx].Type;
 } else {
diff --git a/compiler-rt/test/tysan/struct-offset-different-base.cpp b/compiler-rt/test/tysan/struct-offset-different-base.cpp
new file mode 100644
index 00..716d21f844f96c
--- /dev/null
+++ b/compiler-rt/test/tysan/struct-offset-different-base.cpp
@@ -0,0 +1,31 @@
+// RUN: %clangxx_tysan -O0 %s -o %t && %run %t >%t.out 2>&1
+// RUN: FileCheck %s < %t.out
+
+#include 
+
+struct inner {
+   char buffer;
+   int i;
+};
+
+void init_inner(inner *iPtr) {
+   iPtr->i = 0;
+}
+
+struct outer {
+   inner foo;
+char buffer;
+};
+
+int main(void) {
+outer *l = new outer();
+
+init_inner(&l->foo);
+
+int access_offsets_with_different_base = l->foo.i;
+printf("%d\n", access_offsets_with_different_base);
+
+return 0;
+}
+
+// CHECK-NOT: ERROR: TypeSanitizer: type-aliasing-violation


[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Matthias Springer via llvm-branch-commits


https://github.com/matthias-springer updated 
https://github.com/llvm/llvm-project/pull/108381

From 0fd4cb81dbf6e9c2766a086e2e3fdffd3cf67510 Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Thu, 12 Sep 2024 14:49:23 +0200
Subject: [PATCH] [mlir][Transforms] Dialect conversion: Unify materialization
 of value replacements

PR #106760 aligned the handling of dropped block arguments and dropped op 
results. The two helper functions that insert source materializations for uses 
of replaced block arguments / op results that survived the conversion are now 
almost identical (`legalizeConvertedArgumentTypes` and 
`legalizeConvertedOpResultTypes`). This PR merges the two functions and moves 
the implementation directly into `finalize`.

This PR simplifies the code base and improves the efficiency a bit: previously, 
`finalize` iterated over `ConversionPatternRewriterImpl::rewrites` twice. Now, 
only one iteration is needed.
---
 .../Transforms/Utils/DialectConversion.cpp| 134 ++
 1 file changed, 42 insertions(+), 92 deletions(-)

diff --git a/mlir/lib/Transforms/Utils/DialectConversion.cpp b/mlir/lib/Transforms/Utils/DialectConversion.cpp
index ed15b571f01883..0556b4ab833c30 100644
--- a/mlir/lib/Transforms/Utils/DialectConversion.cpp
+++ b/mlir/lib/Transforms/Utils/DialectConversion.cpp
@@ -2336,17 +2336,6 @@ struct OperationConverter {
   /// remaining artifacts and complete the conversion.
   LogicalResult finalize(ConversionPatternRewriter &rewriter);
 
-  /// Legalize the types of converted block arguments.
-  LogicalResult
-  legalizeConvertedArgumentTypes(ConversionPatternRewriter &rewriter,
- ConversionPatternRewriterImpl &rewriterImpl);
-
-  /// Legalize the types of converted op results.
-  LogicalResult legalizeConvertedOpResultTypes(
-  ConversionPatternRewriter &rewriter,
-  ConversionPatternRewriterImpl &rewriterImpl,
-  DenseMap> &inverseMapping);
-
   /// Dialect conversion configuration.
   ConversionConfig config;
 
@@ -2510,19 +2499,6 @@ LogicalResult OperationConverter::convertOperations(ArrayRef ops) {
   return success();
 }
 
-LogicalResult
-OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
-  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
-  if (failed(legalizeConvertedArgumentTypes(rewriter, rewriterImpl)))
-return failure();
-  DenseMap> inverseMapping =
-  rewriterImpl.mapping.getInverse();
-  if (failed(legalizeConvertedOpResultTypes(rewriter, rewriterImpl,
-inverseMapping)))
-return failure();
-  return success();
-}
-
 /// Finds a user of the given value, or of any other value that the given value
 /// replaced, that was not replaced in the conversion process.
 static Operation *findLiveUserOfReplaced(
@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-ConversionPatternRewriter &rewriter,
-ConversionPatternRewriterImpl &rewriterImpl,
-DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-auto *opReplacement =
-dyn_cast(rewriterImpl.rewrites[i].get());
-if (!opReplacement)
-  continue;
-Operation *op = opReplacement->getOperation();
-for (OpResult result : op->getResults()) {
-  // If the type of this op result changed and the result is still live,
-  // we need to materialize a conversion.
-  if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+return std::make_pair(opRewrite->getOperation()->getResults(),
+  opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+  blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);
+}
+
+LogicalResult
+OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
+  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
+  DenseMap> inverseMapping =
+  rewriterImpl.mapping.getInverse();
+
+  // Process requested value replacements.
+  for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) {
+ValueRange replacedValues;
+const TypeConverter *converter;
+std::tie(replacedValues, converter) =
+getReplacedValues(rewriterImpl.rewrites[i].get());
+for (Value originalValue : replacedValues) {
+  // If the type of

[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:

@arsenm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/108397

[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)

2024-09-12 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/108397

[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)

2024-09-12 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/108397

Backport 8f77d37f256809766fd83a09c6d144b785e9165a

Requested by: @nikic

From 3d14aafa9c51b816d9bc1792898de9df84cc2fd6 Mon Sep 17 00:00:00 2001
From: Princeton Ferro 
Date: Wed, 4 Sep 2024 07:18:53 -0700
Subject: [PATCH] [DAGCombiner] cache negative result from
 getMergeStoreCandidates() (#106949)

Cache negative search result from getStoreMergeCandidates() so that
mergeConsecutiveStores() does not iterate quadratically over a
potentially long sequence of unmergeable stores.
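
The idea of a negative cache can be illustrated with a small standalone model. This is a sketch under loud assumptions, not the DAGCombiner code: `MergeCandidateFinder` is a hypothetical class, chains are plain integer ids, and a store counts as "mergeable" when its value is even. It only shows the pattern of remembering failed searches and clearing that memory when the graph changes.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical model: chains are integer ids and a store is "mergeable" when
// its value is even. Neither reflects the real SelectionDAG data structures.
class MergeCandidateFinder {
public:
  // Returns the mergeable stores on the chain, or an empty list if none.
  std::vector<int> find(int chainId, const std::vector<int> &storesOnChain) {
    assert(chainId >= 0 && "chain ids are non-negative in this model");

    // Negative cache hit: this chain was already searched without success.
    if (chainsWithoutMergeableStores.count(chainId) != 0)
      return {};

    ++searchCount; // stand-in for the expensive walk over the chain
    std::vector<int> candidates;
    for (int store : storesOnChain)
      if (store % 2 == 0)
        candidates.push_back(store);

    // Remember the failure so the same chain is not walked again.
    if (candidates.empty())
      chainsWithoutMergeableStores.insert(chainId);
    return candidates;
  }

  // Call whenever the graph changes, since cached failures may be stale.
  void invalidate() { chainsWithoutMergeableStores.clear(); }

  int searches() const { return searchCount; }

private:
  std::unordered_set<int> chainsWithoutMergeableStores;
  int searchCount = 0;
};
```

Repeated queries for a chain with no candidates then cost one walk instead of one walk per store, which is where the quadratic behaviour would otherwise come from.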

(cherry picked from commit 8f77d37f256809766fd83a09c6d144b785e9165a)
---
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 83 ---
 1 file changed, 51 insertions(+), 32 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 71cdec91e5f67a..7b1f1dc40211d5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -191,6 +191,11 @@ namespace {
 // AA - Used for DAG load/store alias analysis.
 AliasAnalysis *AA;
 
+/// This caches all chains that have already been processed in
+/// DAGCombiner::getStoreMergeCandidates() and found to have no mergeable
+/// stores candidates.
+SmallPtrSet ChainsWithoutMergeableStores;
+
 /// When an instruction is simplified, add all users of the instruction to
 /// the work lists because they might get more simplified now.
 void AddUsersToWorklist(SDNode *N) {
@@ -776,11 +781,10 @@ namespace {
  bool UseTrunc);
 
 /// This is a helper function for mergeConsecutiveStores. Stores that
-/// potentially may be merged with St are placed in StoreNodes. RootNode is
-/// a chain predecessor to all store candidates.
-void getStoreMergeCandidates(StoreSDNode *St,
- SmallVectorImpl &StoreNodes,
- SDNode *&Root);
+/// potentially may be merged with St are placed in StoreNodes. On success,
+/// returns a chain predecessor to all store candidates.
+SDNode *getStoreMergeCandidates(StoreSDNode *St,
+SmallVectorImpl &StoreNodes);
 
 /// Helper function for mergeConsecutiveStores. Checks if candidate stores
 /// have indirect dependency through their operands. RootNode is the
@@ -1782,6 +1786,9 @@ void DAGCombiner::Run(CombineLevel AtLevel) {
 
 ++NodesCombined;
 
+// Invalidate cached info.
+ChainsWithoutMergeableStores.clear();
+
 // If we get back the same node we passed in, rather than a new node or
 // zero, we know that the node must have defined multiple values and
 // CombineTo was used.  Since CombineTo takes care of the worklist
@@ -20372,15 +20379,15 @@ bool DAGCombiner::mergeStoresOfConstantsOrVecElts(
   return true;
 }
 
-void DAGCombiner::getStoreMergeCandidates(
-StoreSDNode *St, SmallVectorImpl &StoreNodes,
-SDNode *&RootNode) {
+SDNode *
+DAGCombiner::getStoreMergeCandidates(StoreSDNode *St,
+ SmallVectorImpl &StoreNodes) {
   // This holds the base pointer, index, and the offset in bytes from the base
   // pointer. We must have a base and an offset. Do not handle stores to undef
   // base pointers.
   BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG);
   if (!BasePtr.getBase().getNode() || BasePtr.getBase().isUndef())
-return;
+return nullptr;
 
   SDValue Val = peekThroughBitcasts(St->getValue());
   StoreSource StoreSrc = getStoreSource(Val);
@@ -20396,14 +20403,14 @@ void DAGCombiner::getStoreMergeCandidates(
 LoadVT = Ld->getMemoryVT();
 // Load and store should be the same type.
 if (MemVT != LoadVT)
-  return;
+  return nullptr;
 // Loads must only have one use.
 if (!Ld->hasNUsesOfValue(1, 0))
-  return;
+  return nullptr;
 // The memory operands must not be volatile/indexed/atomic.
 // TODO: May be able to relax for unordered atomics (see D66309)
 if (!Ld->isSimple() || Ld->isIndexed())
-  return;
+  return nullptr;
   }
   auto CandidateMatch = [&](StoreSDNode *Other, BaseIndexOffset &Ptr,
 int64_t &Offset) -> bool {
@@ -20471,6 +20478,27 @@ void DAGCombiner::getStoreMergeCandidates(
 return (BasePtr.equalBaseIndex(Ptr, DAG, Offset));
   };
 
+  // We are looking for a root node which is an ancestor to all mergable
+  // stores. We search up through a load, to our root and then down
+  // through all children. For instance we will find Store{1,2,3} if
+  // St is Store1, Store2. or Store3 where the root is not a load
+  // which always true for nonvolatile ops. TODO: Expand
+  // the search to find all valid candidates through multiple layers of loads.
+  //
+  // Root
+  // |---|---|
+  // LoadLoadStore3
+  // |   |
+  // Store1   Store2
+  //
+  // FIXME: We should be a

[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-selectiondag

Author: None (llvmbot)


Changes

Backport 8f77d37f256809766fd83a09c6d144b785e9165a

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/108397.diff


1 Files Affected:

- (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+51-32) 


``diff
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 71cdec91e5f67a..7b1f1dc40211d5 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -191,6 +191,11 @@ namespace {
 // AA - Used for DAG load/store alias analysis.
 AliasAnalysis *AA;
 
+/// This caches all chains that have already been processed in
+/// DAGCombiner::getStoreMergeCandidates() and found to have no mergeable
+/// stores candidates.
+SmallPtrSet ChainsWithoutMergeableStores;
+
 /// When an instruction is simplified, add all users of the instruction to
 /// the work lists because they might get more simplified now.
 void AddUsersToWorklist(SDNode *N) {
@@ -776,11 +781,10 @@ namespace {
  bool UseTrunc);
 
 /// This is a helper function for mergeConsecutiveStores. Stores that
-/// potentially may be merged with St are placed in StoreNodes. RootNode is
-/// a chain predecessor to all store candidates.
-void getStoreMergeCandidates(StoreSDNode *St,
- SmallVectorImpl &StoreNodes,
- SDNode *&Root);
+/// potentially may be merged with St are placed in StoreNodes. On success,
+/// returns a chain predecessor to all store candidates.
+SDNode *getStoreMergeCandidates(StoreSDNode *St,
+SmallVectorImpl &StoreNodes);
 
 /// Helper function for mergeConsecutiveStores. Checks if candidate stores
 /// have indirect dependency through their operands. RootNode is the
@@ -1782,6 +1786,9 @@ void DAGCombiner::Run(CombineLevel AtLevel) {
 
 ++NodesCombined;
 
+// Invalidate cached info.
+ChainsWithoutMergeableStores.clear();
+
 // If we get back the same node we passed in, rather than a new node or
 // zero, we know that the node must have defined multiple values and
 // CombineTo was used.  Since CombineTo takes care of the worklist
@@ -20372,15 +20379,15 @@ bool DAGCombiner::mergeStoresOfConstantsOrVecElts(
   return true;
 }
 
-void DAGCombiner::getStoreMergeCandidates(
-StoreSDNode *St, SmallVectorImpl &StoreNodes,
-SDNode *&RootNode) {
+SDNode *
+DAGCombiner::getStoreMergeCandidates(StoreSDNode *St,
+ SmallVectorImpl &StoreNodes) {
   // This holds the base pointer, index, and the offset in bytes from the base
   // pointer. We must have a base and an offset. Do not handle stores to undef
   // base pointers.
   BaseIndexOffset BasePtr = BaseIndexOffset::match(St, DAG);
   if (!BasePtr.getBase().getNode() || BasePtr.getBase().isUndef())
-return;
+return nullptr;
 
   SDValue Val = peekThroughBitcasts(St->getValue());
   StoreSource StoreSrc = getStoreSource(Val);
@@ -20396,14 +20403,14 @@ void DAGCombiner::getStoreMergeCandidates(
 LoadVT = Ld->getMemoryVT();
 // Load and store should be the same type.
 if (MemVT != LoadVT)
-  return;
+  return nullptr;
 // Loads must only have one use.
 if (!Ld->hasNUsesOfValue(1, 0))
-  return;
+  return nullptr;
 // The memory operands must not be volatile/indexed/atomic.
 // TODO: May be able to relax for unordered atomics (see D66309)
 if (!Ld->isSimple() || Ld->isIndexed())
-  return;
+  return nullptr;
   }
   auto CandidateMatch = [&](StoreSDNode *Other, BaseIndexOffset &Ptr,
 int64_t &Offset) -> bool {
@@ -20471,6 +20478,27 @@ void DAGCombiner::getStoreMergeCandidates(
 return (BasePtr.equalBaseIndex(Ptr, DAG, Offset));
   };
 
+  // We are looking for a root node which is an ancestor to all mergable
+  // stores. We search up through a load, to our root and then down
+  // through all children. For instance we will find Store{1,2,3} if
+  // St is Store1, Store2. or Store3 where the root is not a load
+  // which always true for nonvolatile ops. TODO: Expand
+  // the search to find all valid candidates through multiple layers of loads.
+  //
+  // Root
+  // |---|---|
+  // LoadLoadStore3
+  // |   |
+  // Store1   Store2
+  //
+  // FIXME: We should be able to climb and
+  // descend TokenFactors to find candidates as well.
+
+  SDNode *RootNode = St->getChain().getNode();
+  // Bail out if we already analyzed this root node and found nothing.
+  if (ChainsWithoutMergeableStores.contains(RootNode))
+return nullptr;
+
   // Check if the pair of StoreNode and the RootNode already bail out many
   // times which is over the limit in dependence check.
   auto OverLimit

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Jakub Kuderski via llvm-branch-commits



@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-ConversionPatternRewriter &rewriter,
-ConversionPatternRewriterImpl &rewriterImpl,
-DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-auto *opReplacement =
-dyn_cast(rewriterImpl.rewrites[i].get());
-if (!opReplacement)
-  continue;
-Operation *op = opReplacement->getOperation();
-for (OpResult result : op->getResults()) {
-  // If the type of this op result changed and the result is still live,
-  // we need to materialize a conversion.
-  if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+return std::make_pair(opRewrite->getOperation()->getResults(),
+  opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+  blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);

kuhar wrote:

nit
```suggestion
return {opRewrite->getOperation()->getResults(),
  opRewrite->getConverter()};
  if (auto *blockRewrite = dyn_cast(rewrite))
return {blockRewrite->getOrigBlock()->getArguments(),
  blockRewrite->getConverter()};
  return {};
```
```suggestion
return std::make_pair(opRewrite->getOperation()->getResults(),
  opRewrite->getConverter());
  if (auto *blockRewrite = dyn_cast(rewrite))
return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
  blockRewrite->getConverter());
  return std::make_pair(ValueRange(), nullptr);
```
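
A self-contained sketch of the braced-return form suggested above, using stand-in types. `Values`, `Converter` and `getReplaced` are hypothetical; the real function returns MLIR types.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the value range and the type converter.
using Values = std::vector<int>;
struct Converter {
  std::string Name;
};

std::pair<Values, const Converter *> getReplaced(int Kind, const Converter *C) {
  if (Kind == 0)
    return {Values{1, 2}, C}; // copy-list-initializes the returned pair
  if (Kind == 1)
    return {Values{3}, C};
  return {}; // value-initializes the pair: empty vector, null pointer
}
```

`return {};` works only when both element types can be value-initialized. That holds for these stand-ins, and the original code already default-constructs `ValueRange()`.
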

https://github.com/llvm/llvm-project/pull/108381
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Jakub Kuderski via llvm-branch-commits


https://github.com/kuhar edited https://github.com/llvm/llvm-project/pull/108381

[llvm-branch-commits] [mlir] [mlir][Transforms] Dialect conversion: Unify materialization of value replacements (PR #108381)

2024-09-12 Thread Jakub Kuderski via llvm-branch-commits



@@ -2546,87 +2522,61 @@ static Operation *findLiveUserOfReplaced(
   return nullptr;
 }
 
-LogicalResult OperationConverter::legalizeConvertedOpResultTypes(
-ConversionPatternRewriter &rewriter,
-ConversionPatternRewriterImpl &rewriterImpl,
-DenseMap> &inverseMapping) {
-  // Process requested operation replacements.
-  for (unsigned i = 0; i < rewriterImpl.rewrites.size(); ++i) {
-auto *opReplacement =
-dyn_cast(rewriterImpl.rewrites[i].get());
-if (!opReplacement)
-  continue;
-Operation *op = opReplacement->getOperation();
-for (OpResult result : op->getResults()) {
-  // If the type of this op result changed and the result is still live,
-  // we need to materialize a conversion.
-  if (rewriterImpl.mapping.lookupOrNull(result, result.getType()))
+/// Helper function that returns the replaced values and the type converter if
+/// the given rewrite object is an "operation replacement" or a "block type
+/// conversion" (which corresponds to a "block replacement"). Otherwise, return
+/// an empty ValueRange and a null type converter pointer.
+static std::pair
+getReplacedValues(IRRewrite *rewrite) {
+  if (auto *opRewrite = dyn_cast(rewrite))
+return std::make_pair(opRewrite->getOperation()->getResults(),
+  opRewrite->getConverter());
+  if (auto *blockRewrite = dyn_cast(rewrite))
+return std::make_pair(blockRewrite->getOrigBlock()->getArguments(),
+  blockRewrite->getConverter());
+  return std::make_pair(ValueRange(), nullptr);
+}
+
+LogicalResult
+OperationConverter::finalize(ConversionPatternRewriter &rewriter) {
+  ConversionPatternRewriterImpl &rewriterImpl = rewriter.getImpl();
+  DenseMap> inverseMapping =
+  rewriterImpl.mapping.getInverse();
+
+  // Process requested value replacements.
+  for (unsigned i = 0, e = rewriterImpl.rewrites.size(); i < e; ++i) {

kuhar wrote:

Nit: use a range-based for loop? I don't see `i` being used outside of indexing
into `rewrites`.
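
For illustration, the two loop forms side by side on a stand-in container. `Rewrite`, `sumIndexed` and `sumRanged` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Rewrite { // hypothetical element type
  int Value;
};

// Index-based form, as in the patch: the bound is evaluated once.
int sumIndexed(const std::vector<std::unique_ptr<Rewrite>> &Rewrites) {
  int Sum = 0;
  for (unsigned I = 0, E = Rewrites.size(); I < E; ++I)
    Sum += Rewrites[I]->Value;
  return Sum;
}

// Range-based form: same traversal, no index variable.
int sumRanged(const std::vector<std::unique_ptr<Rewrite>> &Rewrites) {
  int Sum = 0;
  for (const std::unique_ptr<Rewrite> &R : Rewrites)
    Sum += R->Value;
  return Sum;
}
```

The two are interchangeable only if the loop body never appends to the vector. `push_back` may reallocate and invalidate the iterators a range-based loop relies on, whereas indexing stays valid.
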

https://github.com/llvm/llvm-project/pull/108381

[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-09-12 Thread Ilya Biryukov via llvm-branch-commits


ilya-biryukov wrote:

I got to a small reproducer that only uses STL, but it only produces an error 
in our environment and if I try it with this patch, the error goes away.
I am probably missing something subtle, will dig deeper tomorrow.

Sorry for another delay.

https://github.com/llvm/llvm-project/pull/83237

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread Tavian Barnes via llvm-branch-commits



@@ -128,6 +128,10 @@ static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
   break;
   }
 
+  //You can't have negative offset, you must be partially inside the last type
+  if (TDA->Struct.Members[Idx].Offset > OffsetA)
+Idx -=1;
+

tavianator wrote:

```suggestion
Idx -= 1;

```
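
The step-back in the hunk can be read as follows: if the scan stops at a member that starts after the accessed offset, the access lies inside the previous member. The sketch below assumes a forward scan over members sorted by offset; `Member` and `memberContaining` are hypothetical, and the loop shape is an assumption, not copied from TySan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Member { // hypothetical; members are sorted by ascending Offset
  uint64_t Offset;
};

// Returns the index of the member containing byte offset Off.
// Assumes Members is non-empty and Members[0].Offset == 0.
size_t memberContaining(const std::vector<Member> &Members, uint64_t Off) {
  size_t Idx = 0;
  // Advance to the first member starting at or after Off, or to the last one.
  for (; Idx + 1 < Members.size(); ++Idx)
    if (Members[Idx].Offset >= Off)
      break;
  // Overshot: Off falls inside the previous member, so step back.
  if (Members[Idx].Offset > Off && Idx > 0)
    Idx -= 1;
  return Idx;
}
```

The `Idx > 0` guard is an addition in this sketch. The hunk itself decrements unconditionally.
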

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread Tavian Barnes via llvm-branch-commits


https://github.com/tavianator commented:

This fixes my reduced testcase but not the unreduced one.  I'll try to make a 
new reduction.

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread Tavian Barnes via llvm-branch-commits


https://github.com/tavianator edited 
https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Luke Lau via llvm-branch-commits


lukel97 wrote:

The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r, no 
regressions or improvements on the other benchmarks as far as I can see. Seems 
to check out with the number of memcmps inlined reported for perlbench!

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)

2024-09-12 Thread Thurston Dang via llvm-branch-commits


https://github.com/thurstond approved this pull request.


https://github.com/llvm/llvm-project/pull/108348

[llvm-branch-commits] [sanitizer] Test for #108348 (PR #108349)

2024-09-12 Thread Thurston Dang via llvm-branch-commits


https://github.com/thurstond approved this pull request.


https://github.com/llvm/llvm-project/pull/108349

[llvm-branch-commits] [sanitizer] Test for #108348 (PR #108349)

2024-09-12 Thread Florian Mayer via llvm-branch-commits


fmayer wrote:

Maybe improve the message a bit so people don't have to look at another pull 
request to understand what this is about?

https://github.com/llvm/llvm-project/pull/108349

[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread via llvm-branch-commits


https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/108422

Backport ab96409180aaad5417030f06a386253722a99d71

Requested by: @tstellar

>From 5ec4f6033e5ad37f3a6f30ca48b74305770e5796 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 12 Sep 2024 09:50:57 -0700
Subject: [PATCH] workflows/release-binaries: Fix automatic upload (#107315)

(cherry picked from commit ab96409180aaad5417030f06a386253722a99d71)
---
 .github/workflows/release-binaries.yml | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml
index 509016e5b89c45..fcd371d49e6c91 100644
--- a/.github/workflows/release-binaries.yml
+++ b/.github/workflows/release-binaries.yml
@@ -450,11 +450,22 @@ jobs:
 name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation
 path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl
 
+- name: Checkout Release Scripts
+  uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+  with:
+sparse-checkout: |
+  llvm/utils/release/github-upload-release.py
+  llvm/utils/git/requirements.txt
+sparse-checkout-cone-mode: false
+
+- name: Install Python Requirements
+  run: |
+pip install --require-hashes -r ./llvm/utils/git/requirements.txt
+
 - name: Upload Release
   shell: bash
   run: |
-sudo apt install python3-github
-./llvm-project/llvm/utils/release/github-upload-release.py \
+./llvm/utils/release/github-upload-release.py \
 --token ${{ github.token }} \
 --release ${{ needs.prepare.outputs.release-version }} \
 upload \


[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/108422

[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-github-workflow

Author: None (llvmbot)


Changes

Backport ab96409180aaad5417030f06a386253722a99d71

Requested by: @tstellar

---
Full diff: https://github.com/llvm/llvm-project/pull/108422.diff


1 Files Affected:

- (modified) .github/workflows/release-binaries.yml (+13-2) 


```diff
diff --git a/.github/workflows/release-binaries.yml b/.github/workflows/release-binaries.yml
index 509016e5b89c45..fcd371d49e6c91 100644
--- a/.github/workflows/release-binaries.yml
+++ b/.github/workflows/release-binaries.yml
@@ -450,11 +450,22 @@ jobs:
 name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation
 path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl
 
+- name: Checkout Release Scripts
+  uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+  with:
+sparse-checkout: |
+  llvm/utils/release/github-upload-release.py
+  llvm/utils/git/requirements.txt
+sparse-checkout-cone-mode: false
+
+- name: Install Python Requirements
+  run: |
+pip install --require-hashes -r ./llvm/utils/git/requirements.txt
+
 - name: Upload Release
   shell: bash
   run: |
-sudo apt install python3-github
-./llvm-project/llvm/utils/release/github-upload-release.py \
+./llvm/utils/release/github-upload-release.py \
 --token ${{ github.token }} \
 --release ${{ needs.prepare.outputs.release-version }} \
 upload \

```




https://github.com/llvm/llvm-project/pull/108422

[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


https://github.com/vitalybuka edited 
https://github.com/llvm/llvm-project/pull/108349

[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


vitalybuka wrote:

> Maybe improve the message a bit so people don't have to look at another pull 
> request to understand what this is about?

done

https://github.com/llvm/llvm-project/pull/108349

[llvm-branch-commits] [sanitizer] Allow to override GetDTLSRange (PR #108348)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


https://github.com/vitalybuka updated 
https://github.com/llvm/llvm-project/pull/108348



[llvm-branch-commits] [sanitizer] Test for __sanitizer_get_dtls_size (PR #108349)

2024-09-12 Thread Vitaly Buka via llvm-branch-commits


https://github.com/vitalybuka updated 
https://github.com/llvm/llvm-project/pull/108349



[llvm-branch-commits] [llvm] [BOLT] Match blocks with pseudo probes (PR #99891)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/99891

>From 36197b175681d07b4704e576fb008cec3cc1e05e Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Wed, 28 Aug 2024 21:10:25 +0200
Subject: [PATCH 1/2] Reworked block probe matching

Use new probe ifaces
Get all function probes at once
Drop ProfileUsePseudoProbes
Unify matchWithBlockPseudoProbes
Distinguish exact and loose probe match
---
 bolt/include/bolt/Core/BinaryContext.h|  20 +-
 bolt/lib/Passes/BinaryPasses.cpp  |  40 ++-
 bolt/lib/Profile/StaleProfileMatching.cpp | 404 ++
 bolt/lib/Rewrite/PseudoProbeRewriter.cpp  |   8 +-
 4 files changed, 237 insertions(+), 235 deletions(-)

diff --git a/bolt/include/bolt/Core/BinaryContext.h b/bolt/include/bolt/Core/BinaryContext.h
index 3e20cb607e657b..3f7b2ac0bc6cf9 100644
--- a/bolt/include/bolt/Core/BinaryContext.h
+++ b/bolt/include/bolt/Core/BinaryContext.h
@@ -724,14 +724,26 @@ class BinaryContext {
 uint32_t NumStaleBlocks{0};
 ///   the number of exactly matched basic blocks
 uint32_t NumExactMatchedBlocks{0};
-///   the number of pseudo probe matched basic blocks
-uint32_t NumPseudoProbeMatchedBlocks{0};
+///   the number of loosely matched basic blocks
+uint32_t NumLooseMatchedBlocks{0};
+///   the number of exactly pseudo probe matched basic blocks
+uint32_t NumPseudoProbeExactMatchedBlocks{0};
+///   the number of loosely pseudo probe matched basic blocks
+uint32_t NumPseudoProbeLooseMatchedBlocks{0};
+///   the number of call matched basic blocks
+uint32_t NumCallMatchedBlocks{0};
 ///   the total count of samples in the profile
 uint64_t StaleSampleCount{0};
 ///   the count of exactly matched samples
 uint64_t ExactMatchedSampleCount{0};
-///   the count of pseudo probe matched samples
-uint64_t PseudoProbeMatchedSampleCount{0};
+///   the count of exactly matched samples
+uint64_t LooseMatchedSampleCount{0};
+///   the count of exactly pseudo probe matched samples
+uint64_t PseudoProbeExactMatchedSampleCount{0};
+///   the count of loosely pseudo probe matched samples
+uint64_t PseudoProbeLooseMatchedSampleCount{0};
+///   the count of call matched samples
+uint64_t CallMatchedSampleCount{0};
 ///   the number of stale functions that have matching number of blocks in
 ///   the profile
 uint64_t NumStaleFuncsWithEqualBlockCount{0};
diff --git a/bolt/lib/Passes/BinaryPasses.cpp b/bolt/lib/Passes/BinaryPasses.cpp
index b786f07a6a6651..8edbd58c3ed3de 100644
--- a/bolt/lib/Passes/BinaryPasses.cpp
+++ b/bolt/lib/Passes/BinaryPasses.cpp
@@ -1524,15 +1524,43 @@ Error PrintProgramStats::runOnFunctions(BinaryContext &BC) {
 100.0 * BC.Stats.ExactMatchedSampleCount / BC.Stats.StaleSampleCount,
 BC.Stats.ExactMatchedSampleCount, BC.Stats.StaleSampleCount);
 BC.outs() << format(
-"BOLT-INFO: inference found a pseudo probe match for %.2f%% of basic "
+"BOLT-INFO: inference found an exact pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeExactMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeExactMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeExactMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeExactMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a loose pseudo probe match for %.2f%% of "
+"basic blocks (%zu out of %zu stale) responsible for %.2f%% samples"
+" (%zu out of %zu stale)\n",
+100.0 * BC.Stats.NumPseudoProbeLooseMatchedBlocks /
+BC.Stats.NumStaleBlocks,
+BC.Stats.NumPseudoProbeLooseMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.PseudoProbeLooseMatchedSampleCount /
+BC.Stats.StaleSampleCount,
+BC.Stats.PseudoProbeLooseMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC.outs() << format(
+"BOLT-INFO: inference found a call match for %.2f%% of basic "
 "blocks"
 " (%zu out of %zu stale) responsible for %.2f%% samples"
 " (%zu out of %zu stale)\n",
-100.0 * BC.Stats.NumPseudoProbeMatchedBlocks / BC.Stats.NumStaleBlocks,
-BC.Stats.NumPseudoProbeMatchedBlocks, BC.Stats.NumStaleBlocks,
-100.0 * BC.Stats.PseudoProbeMatchedSampleCount /
-BC.Stats.StaleSampleCount,
-BC.Stats.PseudoProbeMatchedSampleCount, BC.Stats.StaleSampleCount);
+100.0 * BC.Stats.NumCallMatchedBlocks / BC.Stats.NumStaleBlocks,
+BC.Stats.NumCallMatchedBlocks, BC.Stats.NumStaleBlocks,
+100.0 * BC.Stats.CallMatchedSampleCount / BC.Stats.StaleSampleCount,
+BC.Stats.CallMatchedSampleCount, BC.Stats.StaleSampleCount);
+BC

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Lei Wang via llvm-branch-commits



@@ -58,8 +59,158 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::collectInlineTree(
+const MCPseudoProbeDecoder &Decoder,
+const MCDecodedPseudoProbeInlineTree &Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  std::vector InlineTree(
+  {InlineTreeNode{&Root, Root.Guid, getHash(Root), 0, 0}});
+  uint32_t ParentId = 0;
+  while (ParentId != InlineTree.size()) {
+const MCDecodedPseudoProbeInlineTree *Cur = InlineTree[ParentId].InlineTree;
+for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren())
+  InlineTree.emplace_back(
+  InlineTreeNode{&Child, Child.Guid, getHash(Child), ParentId,
+ std::get<1>(Child.getInlineSite())});
+++ParentId;
+  }
+
+  return InlineTree;
+}
+
+std::tuple
+YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) {
+  yaml::bolt::PseudoProbeDesc Desc;
+  InlineTreeDesc InlineTree;
+
+  for (const MCDecodedPseudoProbeInlineTree &TopLev :
+   Decoder.getDummyInlineRoot().getChildren())
+InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
+
+  for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())
+++InlineTree.HashIdxMap[FuncDesc.FuncHash];
+
+  InlineTree.GUIDIdxMap.reserve(Decoder.getGUID2FuncDescMap().size());
+  for (const auto &Node : Decoder.getInlineTreeVec())
+++InlineTree.GUIDIdxMap[Node.Guid];
+
+  std::vector> GUIDFreqVec;
+  GUIDFreqVec.reserve(InlineTree.GUIDIdxMap.size());
+  for (const auto [GUID, Cnt] : InlineTree.GUIDIdxMap)
+GUIDFreqVec.emplace_back(Cnt, GUID);
+  llvm::sort(GUIDFreqVec);
+
+  std::vector> HashFreqVec;
+  HashFreqVec.reserve(InlineTree.HashIdxMap.size());
+  for (const auto [Hash, Cnt] : InlineTree.HashIdxMap)
+HashFreqVec.emplace_back(Cnt, Hash);
+  llvm::sort(HashFreqVec);
+
+  uint32_t Index = 0;
+  Desc.Hash.reserve(HashFreqVec.size());
+  for (uint64_t Hash : llvm::make_second_range(llvm::reverse(HashFreqVec))) {
+Desc.Hash.emplace_back(Hash);
+InlineTree.HashIdxMap[Hash] = Index++;
+  }
+
+  Index = 0;
+  Desc.GUID.reserve(GUIDFreqVec.size());
+  for (uint64_t GUID : llvm::make_second_range(llvm::reverse(GUIDFreqVec))) {
+Desc.GUID.emplace_back(GUID);
+InlineTree.GUIDIdxMap[GUID] = Index++;
+uint64_t Hash = Decoder.getFuncDescForGUID(GUID)->FuncHash;
+Desc.GUIDHashIdx.emplace_back(InlineTree.HashIdxMap[Hash]);
+  }
+
+  return {Desc, InlineTree};
+}
+
+std::vector
+YAMLProfileWriter::convertNodeProbes(NodeIdToProbes &NodeProbes) {
+  struct BlockProbeInfoHasher {
+size_t operator()(const yaml::bolt::PseudoProbeInfo &BPI) const {
+  auto HashCombine = [](auto &Range) {
+return llvm::hash_combine_range(Range.begin(), Range.end());
+  };
+  return llvm::hash_combine(HashCombine(BPI.BlockProbes),
+HashCombine(BPI.CallProbes),
+HashCombine(BPI.IndCallProbes));
+}
+  };
+
+  // Check identical BlockProbeInfo structs and merge them
+  std::unordered_map,
+ BlockProbeInfoHasher>
+  BPIToNodes;
+  for (auto &[NodeId, Probes] : NodeProbes) {
+yaml::bolt::PseudoProbeInfo BPI;
+BPI.BlockProbes = std::vector(Probes[0].begin(), Probes[0].end());
+BPI.IndCallProbes = std::vector(Probes[1].begin(), Probes[1].end());
+BPI.CallProbes = std::vector(Probes[2].begin(), Probes[2].end());
+BPIToNodes[BPI].push_back(NodeId);
+  }
+
+  auto handleMask = [](const auto &Ids, auto &Vec, auto &Mask) {
+for (auto Id : Ids)
+  if (Id > 64)
+Vec.emplace_back(Id);
+  else
+Mask |= 1ull << (Id - 1);
+  };
+
+  // Add to YAML with merged nodes/block mask optimizations
+  std::vector YamlProbes;
+  YamlProbes.reserve(BPIToNodes.size());
+  for (const auto &[BPI, Nodes] : BPIToNodes) {
+auto &YamlBPI = YamlProbes.emplace_back(yaml::bolt::PseudoProbeInfo());
+YamlBPI.CallProbes = BPI.CallProbes;
+YamlBPI.IndCallProbes = BPI.IndCallProbes;
+if (Nodes.size() == 1)
+  YamlBPI.InlineTreeIndex = Nodes.front();
+else
+  YamlBPI.InlineTreeNodes = Nodes;
+handleMask(BPI.BlockProbes, YamlBPI.BlockProbes, YamlBPI.BlockMask);
+  }
+  return YamlProbes;
+}
+
+std::tuple,
+   YAMLProfileWriter::InlineTreeMapTy>
+YAMLProfileWriter::convertBFInlineTree(const MCPseudoProbeDecoder &Decoder,
+   const InlineTreeDesc &InlineTree,
+   uint64_t GUID) {
+  DenseMap InlineTreeNodeId;
+  std::vector YamlInlineTree;
+  auto It = InlineTree.TopLevelGUIDToInlineTree.find(GUID);
+  if (It == InlineTree.TopLevelGUIDToInlineTree.end())
+return {YamlInlineTree, InlineTreeNodeId};
+  const MCDecodedPseudoProbeInlineTree *Root = It->second;
+  assert(Root && "Malformed TopLevelGUIDToInlineTree");
+

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Lei Wang via llvm-branch-commits



@@ -14,29 +14,31 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:   probes: [ { blx: 9 } ]

wlei-llvm wrote:

There is no call probe case here. IIRC, noinline-cs-pseudoprobe.test should 
contain some call probes, so we can use that to create the test.

I think there are still some cases not covered, but I guess covering them would 
require creating a large binary, which we don't want to upload to the repo.
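
For readers following the hunk above: the new `blx: 9` check appears to encode the same two block probes (ids 1 and 4) that the removed `pseudo_probes` line listed, assuming `blx` is the block mask built by the `handleMask` lambda in this PR (bit `Id - 1` is set for each probe id up to 64, larger ids go to a separate list). A minimal C++ sketch of that encoding follows. `EncodedProbes`, `encodeBlockProbes`, and `decodeBlockMask` are hypothetical names used for illustration only and are not part of BOLT.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the two outputs of the mask encoding.
struct EncodedProbes {
  uint64_t Mask = 0;              // bit (Id - 1) set for each id in 1..64
  std::vector<uint64_t> Overflow; // ids above 64 are kept as plain numbers
};

// Mirrors the shape of the handleMask lambda from the patch.
// Assumption: probe ids are 1-based, so id 0 never occurs.
EncodedProbes encodeBlockProbes(const std::vector<uint64_t> &Ids) {
  EncodedProbes Result;
  for (uint64_t Id : Ids) {
    assert(Id != 0 && "probe ids are assumed to be 1-based");
    if (Id > 64)
      Result.Overflow.push_back(Id);
    else
      Result.Mask |= 1ull << (Id - 1);
  }
  return Result;
}

// Recovers the probe ids (in increasing order) stored in a mask.
std::vector<uint64_t> decodeBlockMask(uint64_t Mask) {
  std::vector<uint64_t> Ids;
  for (unsigned Bit = 0; Bit < 64; ++Bit)
    if (Mask & (1ull << Bit))
      Ids.push_back(Bit + 1);
  return Ids;
}
```

Under these assumptions, `encodeBlockProbes({1, 4}).Mask` is `1 | 8`, that is 9, and `decodeBlockMask(9)` gives back ids 1 and 4.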

https://github.com/llvm/llvm-project/pull/107137
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Lei Wang via llvm-branch-commits



@@ -2421,11 +2433,14 @@ std::error_code DataAggregator::writeBATYAML(BinaryContext &BC,
 const uint32_t InputOffset = BAT->translate(
 FuncAddr, OutputAddress - FuncAddr, /*IsBranchSrc=*/true);
 const unsigned BlockIndex = getBlock(InputOffset).second;
-YamlBF.Blocks[BlockIndex].PseudoProbes.emplace_back(
-yaml::bolt::PseudoProbeInfo{Probe.getGuid(), Probe.getIndex(),
-Probe.getType()});
+BlockProbes[BlockIndex].emplace_back(Probe);
   }
 }
+
+for (auto &[Block, Probes] : BlockProbes) {
+  YamlBF.Blocks[Block].PseudoProbes =
+  YAMLProfileWriter::writeBlockProbes(Probes, InlineTreeNodeId);

wlei-llvm wrote:

Thanks for the context.

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits



@@ -58,8 +59,158 @@ const BinaryFunction *YAMLProfileWriter::setCSIDestination(
   return nullptr;
 }
 
+std::vector
+YAMLProfileWriter::collectInlineTree(
+const MCPseudoProbeDecoder &Decoder,
+const MCDecodedPseudoProbeInlineTree &Root) {
+  auto getHash = [&](const MCDecodedPseudoProbeInlineTree &Node) {
+return Decoder.getFuncDescForGUID(Node.Guid)->FuncHash;
+  };
+  std::vector InlineTree(
+  {InlineTreeNode{&Root, Root.Guid, getHash(Root), 0, 0}});
+  uint32_t ParentId = 0;
+  while (ParentId != InlineTree.size()) {
+const MCDecodedPseudoProbeInlineTree *Cur = InlineTree[ParentId].InlineTree;
+for (const MCDecodedPseudoProbeInlineTree &Child : Cur->getChildren())
+  InlineTree.emplace_back(
+  InlineTreeNode{&Child, Child.Guid, getHash(Child), ParentId,
+ std::get<1>(Child.getInlineSite())});
+++ParentId;
+  }
+
+  return InlineTree;
+}
+
+std::tuple
+YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) {
+  yaml::bolt::PseudoProbeDesc Desc;
+  InlineTreeDesc InlineTree;
+
+  for (const MCDecodedPseudoProbeInlineTree &TopLev :
+   Decoder.getDummyInlineRoot().getChildren())
+InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
+
+  for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())
+++InlineTree.HashIdxMap[FuncDesc.FuncHash];
+
+  InlineTree.GUIDIdxMap.reserve(Decoder.getGUID2FuncDescMap().size());
+  for (const auto &Node : Decoder.getInlineTreeVec())
+++InlineTree.GUIDIdxMap[Node.Guid];
+
+  std::vector> GUIDFreqVec;
+  GUIDFreqVec.reserve(InlineTree.GUIDIdxMap.size());
+  for (const auto [GUID, Cnt] : InlineTree.GUIDIdxMap)
+GUIDFreqVec.emplace_back(Cnt, GUID);
+  llvm::sort(GUIDFreqVec);
+
+  std::vector> HashFreqVec;
+  HashFreqVec.reserve(InlineTree.HashIdxMap.size());
+  for (const auto [Hash, Cnt] : InlineTree.HashIdxMap)
+HashFreqVec.emplace_back(Cnt, Hash);
+  llvm::sort(HashFreqVec);
+
+  uint32_t Index = 0;
+  Desc.Hash.reserve(HashFreqVec.size());
+  for (uint64_t Hash : llvm::make_second_range(llvm::reverse(HashFreqVec))) {
+Desc.Hash.emplace_back(Hash);
+InlineTree.HashIdxMap[Hash] = Index++;
+  }
+
+  Index = 0;
+  Desc.GUID.reserve(GUIDFreqVec.size());
+  for (uint64_t GUID : llvm::make_second_range(llvm::reverse(GUIDFreqVec))) {
+Desc.GUID.emplace_back(GUID);
+InlineTree.GUIDIdxMap[GUID] = Index++;
+uint64_t Hash = Decoder.getFuncDescForGUID(GUID)->FuncHash;
+Desc.GUIDHashIdx.emplace_back(InlineTree.HashIdxMap[Hash]);
+  }
+
+  return {Desc, InlineTree};
+}
+
+std::vector
+YAMLProfileWriter::convertNodeProbes(NodeIdToProbes &NodeProbes) {
+  struct BlockProbeInfoHasher {
+size_t operator()(const yaml::bolt::PseudoProbeInfo &BPI) const {
+  auto HashCombine = [](auto &Range) {
+return llvm::hash_combine_range(Range.begin(), Range.end());
+  };
+  return llvm::hash_combine(HashCombine(BPI.BlockProbes),
+HashCombine(BPI.CallProbes),
+HashCombine(BPI.IndCallProbes));
+}
+  };
+
+  // Check identical BlockProbeInfo structs and merge them
+  std::unordered_map,
+ BlockProbeInfoHasher>
+  BPIToNodes;
+  for (auto &[NodeId, Probes] : NodeProbes) {
+yaml::bolt::PseudoProbeInfo BPI;
+BPI.BlockProbes = std::vector(Probes[0].begin(), Probes[0].end());
+BPI.IndCallProbes = std::vector(Probes[1].begin(), Probes[1].end());
+BPI.CallProbes = std::vector(Probes[2].begin(), Probes[2].end());
+BPIToNodes[BPI].push_back(NodeId);
+  }
+
+  auto handleMask = [](const auto &Ids, auto &Vec, auto &Mask) {
+for (auto Id : Ids)
+  if (Id > 64)
+Vec.emplace_back(Id);
+  else
+Mask |= 1ull << (Id - 1);
+  };
+
+  // Add to YAML with merged nodes/block mask optimizations
+  std::vector YamlProbes;
+  YamlProbes.reserve(BPIToNodes.size());
+  for (const auto &[BPI, Nodes] : BPIToNodes) {
+auto &YamlBPI = YamlProbes.emplace_back(yaml::bolt::PseudoProbeInfo());
+YamlBPI.CallProbes = BPI.CallProbes;
+YamlBPI.IndCallProbes = BPI.IndCallProbes;
+if (Nodes.size() == 1)
+  YamlBPI.InlineTreeIndex = Nodes.front();
+else
+  YamlBPI.InlineTreeNodes = Nodes;
+handleMask(BPI.BlockProbes, YamlBPI.BlockProbes, YamlBPI.BlockMask);
+  }
+  return YamlProbes;
+}
+
+std::tuple,
+   YAMLProfileWriter::InlineTreeMapTy>
+YAMLProfileWriter::convertBFInlineTree(const MCPseudoProbeDecoder &Decoder,
+   const InlineTreeDesc &InlineTree,
+   uint64_t GUID) {
+  DenseMap InlineTreeNodeId;
+  std::vector YamlInlineTree;
+  auto It = InlineTree.TopLevelGUIDToInlineTree.find(GUID);
+  if (It == InlineTree.TopLevelGUIDToInlineTree.end())
+return {YamlInlineTree, InlineTreeNodeId};
+  const MCDecodedPseudoProbeInlineTree *Root = It->second;
+  assert(Root && "Malformed TopLevelGUIDToInlineTree");
+

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread Tavian Barnes via llvm-branch-commits


tavianator wrote:

Here's the new testcase.  Not sure if this bug is related or not.  It has to do 
with `memcpy()`; if you replace the call with the commented-out line above it, 
it works.

```c
struct node {
	struct node *next;
};

struct list {
	struct node *head, **tail;
};

int main(void) {
	struct list *list = __builtin_malloc(sizeof(*list));
	list->head = 0;
	list->tail = &list->head;

	struct node *node = __builtin_malloc(sizeof(*node));
	node->next = 0;

	*list->tail = node;
	list->tail = &node->next;

	while (list->head) {
		struct node *node = list->head;
		// list->head = node->next;
		__builtin_memcpy(&list->head, &node->next, sizeof(list->head));
		node->next = 0;
	}

	return 0;
}
```

```console
tavianator@tachyon $ ~/code/llvm/llvm-project/build/bin/clang -Wall -g -fsanitize=type foo.c -o foo
tavianator@tachyon $ ./foo
==5885==ERROR: TypeSanitizer: type-aliasing-violation on address 0x55af02a8c2a0 (pc 0x55aef600fb36 bp 0x7ffcbf810cf0 sp 0x7ffcbf810c90 tid 5885)
READ of size 8 at 0x55af02a8c2a0 with type any pointer (in list at offset 0) accesses an existing object of type any pointer (in node at offset 0)
    #0 0x55aef600fb35 in main /home/tavianator/code/bfs/foo.c:20:15

```

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits


https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/107137

>From 50c021b09950cf7d6a8f25b1ac0dec246f2325f5 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 3 Sep 2024 11:38:04 -0700
Subject: [PATCH 1/6] update pseudoprobe-decoding-inline.test

Created using spr 1.3.4
---
 .../test/X86/pseudoprobe-decoding-inline.test | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/bolt/test/X86/pseudoprobe-decoding-inline.test b/bolt/test/X86/pseudoprobe-decoding-inline.test
index 1fdd00c7ef6c4b..629dd84ab8e1dc 100644
--- a/bolt/test/X86/pseudoprobe-decoding-inline.test
+++ b/bolt/test/X86/pseudoprobe-decoding-inline.test
@@ -14,29 +14,38 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT:   - { id: 1, type: 0
+# CHECK-YAML-NEXT:   - { id: 4, type: 0
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 0 }
 #
 # CHECK-YAML: name: foo
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0x5CF8C24CDB18BDAC
-# CHECK-YAML: pseudo_probe_desc_hash: 0x200205A19C5B4
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 0 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 1, callsite: 8 }
 #
 # CHECK-YAML: name: main
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xDB956436E78DD5FA, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 1, type: 0 }, { guid: 0x5CF8C24CDB18BDAC, id: 2, type: 0 } ]
-# CHECK-YAML: guid: 0xDB956436E78DD5FA
-# CHECK-YAML: pseudo_probe_desc_hash: 0x1
+# CHECK-YAML:  pseudo_probes:
+# CHECK-YAML-NEXT: - { id: 1, type: 0 }
+# CHECK-YAML-NEXT: - { id: 1, type: 0, inline_tree_id: 1 }
+# CHECK-YAML-NEXT: - { id: 2, type: 0, inline_tree_id: 1 }
+# CHECK-YAML:  inline_tree:
+# CHECK-YAML-NEXT:   - { guid: 0xDB956436E78DD5FA, hash: 0x1, id: 0 }
+# CHECK-YAML-NEXT:   - { guid: 0x5CF8C24CDB18BDAC, hash: 0x200205A19C5B4, id: 1, callsite: 2 }
+# CHECK-YAML-NEXT:   - { guid: 0xE413754A191DB537, hash: 0x10E852DA94, id: 2, parent: 1, callsite: 8 }
 #
 ## Check that without --profile-write-pseudo-probes option, no pseudo probes are
 ## generated
 # RUN: perf2bolt %S/../../../llvm/test/tools/llvm-profgen/Inputs/inline-cs-pseudoprobe.perfbin -p %t.preagg --pa -w %t.yaml -o %t.fdata
 # RUN: FileCheck --input-file %t.yaml %s --check-prefix CHECK-NO-OPT
 # CHECK-NO-OPT-NOT: pseudo_probes
-# CHECK-NO-OPT-NOT: guid
-# CHECK-NO-OPT-NOT: pseudo_probe_desc_hash
+# CHECK-NO-OPT-NOT: inline_tree
 
 CHECK: Report of decoding input pseudo probe binaries
 

>From 6ec4cf6bf05551d02cbf17e9edbe8d6931588ff1 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Mon, 9 Sep 2024 21:37:28 -0700
Subject: [PATCH 2/6] clang-format

Created using spr 1.3.4
---
 bolt/lib/Profile/YAMLProfileWriter.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bolt/lib/Profile/YAMLProfileWriter.cpp b/bolt/lib/Profile/YAMLProfileWriter.cpp
index 70e5e09e2920e5..f2609de18ce63c 100644
--- a/bolt/lib/Profile/YAMLProfileWriter.cpp
+++ b/bolt/lib/Profile/YAMLProfileWriter.cpp
@@ -90,7 +90,7 @@ YAMLProfileWriter::convertPseudoProbeDesc(const MCPseudoProbeDecoder &Decoder) {
   InlineTreeDesc InlineTree;
 
   for (const MCDecodedPseudoProbeInlineTree &TopLev :
-  Decoder.getDummyInlineRoot().getChildren())
+   Decoder.getDummyInlineRoot().getChildren())
 InlineTree.TopLevelGUIDToInlineTree[TopLev.Guid] = &TopLev;
 
   for (const auto &FuncDesc : Decoder.getGUID2FuncDescMap())

>From 852eb07f345dd1d9e77a6faead8bf0f73ff64ba7 Mon Sep 17 00:00:00 2001
From: Amir Ayupov 
Date: Tue, 10 Sep 2024 12:26:11 -0700
Subject: [PATCH 3/6] Make pseudo_probe_desc optional

Created using spr 1.3.4
---
 bolt/include/bolt/Profile/ProfileYAMLMapping.h | 9 -
 bolt/test/X86/pseudoprobe-decoding-inline.test | 5 +++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/bolt/include/bolt/Profile/ProfileYAMLMapping.h b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
index 588e2f59d67e01..9cc33264d70718 100644
--- a/bolt/include/bolt/Profile/ProfileYAMLMapping.h
+++ b/bolt/include/bolt/Profile/ProfileYAMLMapping.h
@@ -275,6 +275,12 @@ struct PseudoProbeDesc {
   std::vector GUID;
   std::vector Hash;
   std::vector GUIDHash; // Index of hash for that GUID in Hash
+
+  bool operator==(const PseudoProbeDesc &Ot

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits



@@ -14,29 +14,31 @@
 # RUN: FileCheck --input-file %t.yaml2 %s --check-prefix CHECK-YAML
 # CHECK-YAML: name: bar
 # CHECK-YAML: - bid: 0
-# CHECK-YAML:   pseudo_probes: [ { guid: 0xE413754A191DB537, id: 1, type: 0 }, { guid: 0xE413754A191DB537, id: 4, type: 0 } ]
-# CHECK-YAML: guid: 0xE413754A191DB537
-# CHECK-YAML: pseudo_probe_desc_hash: 0x10E852DA94
+# CHECK-YAML:   probes: [ { blx: 9 } ]

aaupov wrote:

Added bolt/test/X86/pseudoprobe-decoding-noinline.test

If there are any other binaries/tests in llvm tree with pseudo probes, I can 
check them as well.

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [compiler-rt] [TySan] Fix struct access with different bases (PR #108385)

2024-09-12 Thread Tavian Barnes via llvm-branch-commits


tavianator wrote:

I guess the bug there is that the `memcpy()` interceptor literally copies the 
dynamic type from `node->next` to `list->head`.  When `list->head` is then 
accessed, TySan thinks the memory has type `struct node::next`, which doesn't 
match.
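
The guess above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyShadow`, `recordStore`, `copyShadow`, and `accessMatches` are hypothetical names, the type descriptions are plain strings, and none of this reflects how TySan's shadow memory is actually laid out. It just shows how copying a recorded type together with the bytes can make a later, otherwise valid, access look mismatched.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy shadow memory: maps an address to a description of the type last
// recorded there. Purely illustrative; real TySan shadow memory differs.
using ToyShadow = std::map<const void *, std::string>;

// Record that a typed store happened at Addr.
void recordStore(ToyShadow &Shadow, const void *Addr,
                 const std::string &Type) {
  Shadow[Addr] = Type;
}

// Model of an interceptor that copies the recorded type along with the bytes.
void copyShadow(ToyShadow &Shadow, const void *Dst, const void *Src) {
  assert(Dst && Src && "toy model expects valid addresses");
  auto It = Shadow.find(Src);
  if (It != Shadow.end())
    Shadow[Dst] = It->second; // Dst now carries Src's type description
  else
    Shadow.erase(Dst); // nothing known about Src, so forget Dst as well
}

// A typed access passes if nothing is recorded or the recorded type matches.
bool accessMatches(const ToyShadow &Shadow, const void *Addr,
                   const std::string &Type) {
  auto It = Shadow.find(Addr);
  return It == Shadow.end() || It->second == Type;
}
```

In this model, recording "pointer in list" for `list->head` and "pointer in node" for `node->next`, then calling `copyShadow` from the latter to the former, makes a subsequent "pointer in list" access to `list->head` fail the check, which is the shape of the report in the earlier message.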

https://github.com/llvm/llvm-project/pull/108385

[llvm-branch-commits] [CIR] Add .clang-tidy files to agree with our style convention (PR #108444)

2024-09-12 Thread Nathan Lanza via llvm-branch-commits


https://github.com/lanza created 
https://github.com/llvm/llvm-project/pull/108444

https://llvm.github.io/clangir/GettingStarted/coding-guideline.html




[llvm-branch-commits] [CIR] Add .clang-tidy files to agree with our style convention (PR #108444)

2024-09-12 Thread Nathan Lanza via llvm-branch-commits


https://github.com/lanza closed https://github.com/llvm/llvm-project/pull/108444

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Rafael Auler via llvm-branch-commits


https://github.com/rafaelauler approved this pull request.

Not an expert but looks good. Why is operator== in struct InlineTreeInfo always 
returning false? Is this intentional? 

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Rafael Auler via llvm-branch-commits


rafaelauler wrote:

Sorry, didn't see Lei was already reviewing this. Go ahead with his expert 
opinion, please.

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits

aaupov wrote:

> Not an expert but looks good. Why is operator== in struct InlineTreeInfo 
> always returning false? Is this intentional?

It's a quirk of YAML: `BinaryFunctionProfile` has `std::vector<InlineTreeNode> 
InlineTree` as an optional field. Optional fields are compared against the 
default value using `operator==`, which for a vector transitively requires 
`operator==` for `InlineTreeNode`. However, the default value for `InlineTree` 
is an empty vector, so no `InlineTreeNode` comparison is actually necessary. 
Hence we just say that `InlineTreeNode::operator==` is false.
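
A small standalone sketch of why the stub is harmless, assuming the optional-field check boils down to `Value == Default`. The `InlineTreeNode` here is a simplified stand-in with a comparison counter added for the demonstration, and `shouldEmitOptional` is a hypothetical helper, not an LLVM YAML I/O function.

```cpp
#include <vector>

// Counts element comparisons so the claim can be checked.
static int ElementComparisons = 0;

// Simplified stand-in for the node type discussed above.
struct InlineTreeNode {
  unsigned long long Guid = 0;
  // Stub: only needed so std::vector<InlineTreeNode>::operator== compiles.
  bool operator==(const InlineTreeNode &) const {
    ++ElementComparisons;
    return false;
  }
};

// Hypothetical stand-in for the optional-field check: a field is written
// only when it differs from its default value.
template <typename T>
bool shouldEmitOptional(const T &Value, const T &Default) {
  return !(Value == Default);
}
```

Comparing against an empty default vector, an empty value has equal size and no elements to compare, so the field is skipped. A non-empty value differs in size, so the field is emitted. In both cases the element `operator==` is never invoked, so its return value does not matter.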

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Lei Wang via llvm-branch-commits


https://github.com/wlei-llvm approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/107137

[llvm-branch-commits] [llvm] [BOLT] Add pseudo probe inline tree to YAML profile (PR #107137)

2024-09-12 Thread Amir Ayupov via llvm-branch-commits


aaupov wrote:

Thanks for the review, @wlei-llvm, @rafaelauler, @WenleiHe!

https://github.com/llvm/llvm-project/pull/107137


[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on 
> the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the 
> other benchmarks as far as I can see. Seems to check out with the number of 
> memcmps inlined reported for perlbench!

Thanks a lot! The result is in line with my expectations.
Is it OK to merge? The next step will be tuning for vectors.

https://github.com/llvm/llvm-project/pull/107548
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits



@@ -2113,3 +2113,17 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedVectorMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())

topperc wrote:

I wonder if this might be better

```
if (ST->is64Bit())
  Options.LoadSize = {8, 4, 2, 1};
else
  Options.LoadSize = {4, 2, 1};
```
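
For illustration only, the suggested shape can be sketched outside LLVM as below. `MemCmpOptionsSketch` and `pickLoadSizes` are hypothetical names introduced here; the real field in the patch is `Options.LoadSizes` on `TTI::MemCmpExpansionOptions`, and the sketch only assumes that the widths are listed widest first.

```cpp
#include <vector>

// Hypothetical stand-in for the LoadSizes field of
// TTI::MemCmpExpansionOptions; this is not the real LLVM type.
struct MemCmpOptionsSketch {
  std::vector<unsigned> LoadSizes; // candidate load widths in bytes, widest first
};

// Choose the scalar load widths in one place: 8-byte loads are only
// offered when the target is 64-bit.
inline MemCmpOptionsSketch pickLoadSizes(bool Is64Bit) {
  MemCmpOptionsSketch Options;
  if (Is64Bit)
    Options.LoadSizes = {8, 4, 2, 1};
  else
    Options.LoadSizes = {4, 2, 1};
  return Options;
}
```

Compared with `push_back(8)` followed by `append_range`, this form shows both complete lists side by side, which is presumably the readability point of the suggestion.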

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits



@@ -112,42 +104,46 @@ entry:
 define i32 @bcmp_size_2(ptr %s1, ptr %s2) nounwind optsize {
 ; CHECK-ALIGNED-RV32-LABEL: bcmp_size_2:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 2
-; CHECK-ALIGNED-RV32-NEXT:call bcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)

topperc wrote:

Would it be cheaper to XOR all the bytes individually and then OR the XOR 
results together?
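
A minimal sketch of that idea in plain C++, assuming only an equality result is needed (as with `bcmp`). The function name `bcmpBytewise` is hypothetical and is not part of the patch; whether this form is actually cheaper on a given core is the open question here, not something the sketch demonstrates.

```cpp
#include <cstddef>

// Equality-only comparison using byte loads: XOR each pair of bytes, OR the
// per-byte differences together, and test the accumulated value once.
// Returns 1 if any of the first N bytes differ, 0 otherwise.
inline int bcmpBytewise(const unsigned char *A, const unsigned char *B,
                        std::size_t N) {
  unsigned Diff = 0;
  for (std::size_t I = 0; I < N; ++I)
    Diff |= static_cast<unsigned>(A[I] ^ B[I]);
  return Diff != 0;
}
```

This avoids the shift-and-or sequence that first assembles each side into a wider integer before comparing.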

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits



@@ -1144,42 +2872,116 @@ entry:
 define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind {
 ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 4
-; CHECK-ALIGNED-RV32-NEXT:call memcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8
+; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8
+; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6
+; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV32-NEXT:ret
 ;
 ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV64:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill
-; CHECK-ALIGNED-RV64-NEXT:li a2, 4
-; CHECK-ALIGNED-RV64-NEXT:call memcmp
-; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1)
+; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1)
+; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8
+; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0
+; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8
+; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1
+; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6
+; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV64-NEXT:ret
 ;
 ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-UNALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-UNALIGNED-RV32-NEXT:li a2, 4
-; CHECK-UNALIGNED-RV32-NEXT:call memcmp
-; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0)

topperc wrote:

What is this code doing? It seems way more complicated than a 4 byte memcmp 
should be.

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits


https://github.com/topperc edited 
https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits



@@ -1144,42 +2872,116 @@ entry:
 define i32 @memcmp_size_4(ptr %s1, ptr %s2) nounwind {
 ; CHECK-ALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-ALIGNED-RV32-NEXT:li a2, 4
-; CHECK-ALIGNED-RV32-NEXT:call memcmp
-; CHECK-ALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-ALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a0, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a7, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a1, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a0, a0, 8
+; CHECK-ALIGNED-RV32-NEXT:or a0, a0, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV32-NEXT:slli a1, a1, 8
+; CHECK-ALIGNED-RV32-NEXT:or a1, a1, a7
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a2, a5, a6
+; CHECK-ALIGNED-RV32-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV32-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV32-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV32-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV32-NEXT:ret
 ;
 ; CHECK-ALIGNED-RV64-LABEL: memcmp_size_4:
 ; CHECK-ALIGNED-RV64:   # %bb.0: # %entry
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, -16
-; CHECK-ALIGNED-RV64-NEXT:sd ra, 8(sp) # 8-byte Folded Spill
-; CHECK-ALIGNED-RV64-NEXT:li a2, 4
-; CHECK-ALIGNED-RV64-NEXT:call memcmp
-; CHECK-ALIGNED-RV64-NEXT:ld ra, 8(sp) # 8-byte Folded Reload
-; CHECK-ALIGNED-RV64-NEXT:addi sp, sp, 16
+; CHECK-ALIGNED-RV64-NEXT:lbu a2, 0(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a3, 1(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV64-NEXT:lb a0, 3(a0)
+; CHECK-ALIGNED-RV64-NEXT:lbu a5, 0(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a6, 1(a1)
+; CHECK-ALIGNED-RV64-NEXT:lbu a7, 2(a1)
+; CHECK-ALIGNED-RV64-NEXT:lb a1, 3(a1)
+; CHECK-ALIGNED-RV64-NEXT:andi a0, a0, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a4, a4, 8
+; CHECK-ALIGNED-RV64-NEXT:or a0, a4, a0
+; CHECK-ALIGNED-RV64-NEXT:slli a3, a3, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a2, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV64-NEXT:or a0, a2, a0
+; CHECK-ALIGNED-RV64-NEXT:andi a1, a1, 255
+; CHECK-ALIGNED-RV64-NEXT:slli a7, a7, 8
+; CHECK-ALIGNED-RV64-NEXT:or a1, a7, a1
+; CHECK-ALIGNED-RV64-NEXT:slli a6, a6, 16
+; CHECK-ALIGNED-RV64-NEXT:slliw a2, a5, 24
+; CHECK-ALIGNED-RV64-NEXT:or a2, a2, a6
+; CHECK-ALIGNED-RV64-NEXT:or a1, a2, a1
+; CHECK-ALIGNED-RV64-NEXT:sltu a2, a1, a0
+; CHECK-ALIGNED-RV64-NEXT:sltu a0, a0, a1
+; CHECK-ALIGNED-RV64-NEXT:sub a0, a2, a0
 ; CHECK-ALIGNED-RV64-NEXT:ret
 ;
 ; CHECK-UNALIGNED-RV32-LABEL: memcmp_size_4:
 ; CHECK-UNALIGNED-RV32:   # %bb.0: # %entry
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, -16
-; CHECK-UNALIGNED-RV32-NEXT:sw ra, 12(sp) # 4-byte Folded Spill
-; CHECK-UNALIGNED-RV32-NEXT:li a2, 4
-; CHECK-UNALIGNED-RV32-NEXT:call memcmp
-; CHECK-UNALIGNED-RV32-NEXT:lw ra, 12(sp) # 4-byte Folded Reload
-; CHECK-UNALIGNED-RV32-NEXT:addi sp, sp, 16
+; CHECK-UNALIGNED-RV32-NEXT:lw a0, 0(a0)

topperc wrote:

I guess this is an expanded byteswap?
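
As a rough C++ model of what the aligned sequences quoted above compute (the names `loadBE32` and `memcmp4Sketch` are hypothetical, not from the patch): each side is assembled into a big-endian 32-bit value so that unsigned integer order matches `memcmp`'s byte order, and the final `sltu`/`sltu`/`sub` triple forms a three-way result.

```cpp
#include <cstdint>

// Assemble four bytes into a big-endian 32-bit value, so that unsigned
// integer comparison agrees with lexicographic byte comparison.
inline std::uint32_t loadBE32(const unsigned char *P) {
  return (std::uint32_t{P[0]} << 24) | (std::uint32_t{P[1]} << 16) |
         (std::uint32_t{P[2]} << 8) | std::uint32_t{P[3]};
}

// Three-way comparison of two 4-byte buffers, computed as (A > B) - (A < B).
// The result is -1, 0, or 1, mirroring the sltu/sltu/sub sequence.
inline int memcmp4Sketch(const unsigned char *S1, const unsigned char *S2) {
  const std::uint32_t A = loadBE32(S1);
  const std::uint32_t B = loadBE32(S2);
  return static_cast<int>(A > B) - static_cast<int>(A < B);
}
```

On a little-endian target with unaligned access, a single word load followed by a byte swap yields the same big-endian value, which would be consistent with the byteswap reading of the unaligned output.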

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Craig Topper via llvm-branch-commits


topperc wrote:

> The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on 
> the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the 
> other benchmarks as far as I can see. Seems to check out with the number of 
> memcmps inlined reported for perlbench!

Does spacemit-x60 support unaligned scalar memory and was your test with or 
without that enabled?

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r on 
> > the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on the 
> > other benchmarks as far as I can see. Seems to check out with the number of 
> > memcmps inlined reported for perlbench!
> 
> Does spacemit-x60 support unaligned scalar memory and was your test with or 
> without that enabled?

It supports unaligned scalar accesses but not unaligned vector accesses. It 
seems we don't add these features to `-mcpu=spacemit-x60`, so I think @lukel97 
ran SPEC without unaligned scalar support.

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Luke Lau via llvm-branch-commits


lukel97 wrote:

> > > The run just finished, I'm seeing a 0.75% improvement on 500.perlbench_r 
> > > on the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions or improvements on 
> > > the other benchmarks as far as I can see. Seems to check out with the 
> > > number of memcmps inlined reported for perlbench!
> 
> > 
> 
> > Does spacemit-x60 support unaligned scalar memory and was your test with or 
> > without that enabled?
> 
> 
> 
> It supports unaligned scalar but not unaligned vector. And it seems we don't 
> add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran the SPEC 
> without unaligned scalar.

Yeah, -mno-strict-align gave a bus error. I ultimately built it without 
unaligned scalar since I wasn't sure if unaligned scalar was performant or not. 

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


wangpc-pp wrote:

> > > > The run just finished, I'm seeing a 0.75% improvement on 
> > > > 500.perlbench_r on the BPI F3 (-O3 -mcpu=spacemit-x60), no regressions 
> > > > or improvements on the other benchmarks as far as I can see. Seems to 
> > > > check out with the number of memcmps inlined reported for perlbench!
> > 
> > > 
> > 
> > > Does spacemit-x60 support unaligned scalar memory and was your test with 
> > > or without that enabled?
> > 
> > 
> > 
> > It supports unaligned scalar but not unaligned vector. And it seems we 
> > don't add these features to `-mcpu=spacemit-x60`. So I think @lukel97 ran 
> > the SPEC without unaligned scalar.
> 
> Yeah, -mno-strict-align gave a bus error. I ultimately built it without 
> unaligned scalar since I wasn't sure if unaligned scalar was performant or 
> not. 

IIRC, we have separated this into two options (scalar/vector) now, so maybe we 
can specify the scalar one.

https://github.com/llvm/llvm-project/pull/107548

[llvm-branch-commits] [llvm] [RISCV] Add initial support of memcmp expansion (PR #107548)

2024-09-12 Thread Pengcheng Wang via llvm-branch-commits


https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/107548

>From f21cfcfc90330ee3856746b6315a81a00313b0e0 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Fri, 6 Sep 2024 17:20:51 +0800
Subject: [PATCH 1/5] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-beta.1
---
 .../Target/RISCV/RISCVTargetTransformInfo.cpp |  15 +
 .../Target/RISCV/RISCVTargetTransformInfo.h   |   3 +
 llvm/test/CodeGen/RISCV/memcmp.ll | 932 ++
 3 files changed, 950 insertions(+)
 create mode 100644 llvm/test/CodeGen/RISCV/memcmp.ll

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index e809e15eacf696..ad532aadc83266 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -2113,3 +2113,18 @@ bool RISCVTTIImpl::shouldConsiderAddressTypePromotion(
   }
   return Considerable;
 }
+
+RISCVTTIImpl::TTI::MemCmpExpansionOptions
+RISCVTTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
+  TTI::MemCmpExpansionOptions Options;
+  // FIXME: Vector haven't been tested.
+  Options.AllowOverlappingLoads =
+  (ST->enableUnalignedScalarMem() || ST->enableUnalignedScalarMem());
+  Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
+  Options.NumLoadsPerBlock = Options.MaxNumLoads;
+  if (ST->is64Bit())
+Options.LoadSizes.push_back(8);
+  llvm::append_range(Options.LoadSizes, ArrayRef({4, 2, 1}));
+  Options.AllowedTailExpansions = {3, 5, 6};
+  return Options;
+}
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h 
b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
index 763b89bfec0a66..ee9bed09df97f3 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
@@ -404,6 +404,9 @@ class RISCVTTIImpl : public BasicTTIImplBase<RISCVTTIImpl> {
   shouldConsiderAddressTypePromotion(const Instruction &I,
  bool &AllowPromotionWithoutCommonHeader);
   std::optional<unsigned> getMinPageSize() const { return 4096; }
+
+  TTI::MemCmpExpansionOptions enableMemCmpExpansion(bool OptSize,
+bool IsZeroCmp) const;
 };
 
 } // end namespace llvm
diff --git a/llvm/test/CodeGen/RISCV/memcmp.ll 
b/llvm/test/CodeGen/RISCV/memcmp.ll
new file mode 100644
index 00..652cd02e2c750a
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/memcmp.ll
@@ -0,0 +1,932 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 5
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 -O2 | FileCheck %s 
--check-prefix=CHECK-ALIGNED-RV64
+; RUN: sed 's/iXLen/i32/g' %s | llc -mtriple=riscv32 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV32
+; RUN: sed 's/iXLen/i64/g' %s | llc -mtriple=riscv64 
-mattr=+unaligned-scalar-mem -O2 \
+; RUN:   | FileCheck %s --check-prefix=CHECK-UNALIGNED-RV64
+
+declare i32 @bcmp(i8*, i8*, iXLen) nounwind readonly
+declare i32 @memcmp(i8*, i8*, iXLen) nounwind readonly
+
+define i1 @bcmp_size_15(i8* %s1, i8* %s2) {
+; CHECK-ALIGNED-RV32-LABEL: bcmp_size_15:
+; CHECK-ALIGNED-RV32:   # %bb.0: # %entry
+; CHECK-ALIGNED-RV32-NEXT:lbu a2, 1(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 0(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 2(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 3(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a2, a2, 8
+; CHECK-ALIGNED-RV32-NEXT:or a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:slli a4, a4, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a5, a4
+; CHECK-ALIGNED-RV32-NEXT:or a2, a4, a2
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 1(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 0(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 2(a1)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 3(a1)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:xor a2, a2, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a3, 5(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 4(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a5, 6(a0)
+; CHECK-ALIGNED-RV32-NEXT:lbu a6, 7(a0)
+; CHECK-ALIGNED-RV32-NEXT:slli a3, a3, 8
+; CHECK-ALIGNED-RV32-NEXT:or a3, a3, a4
+; CHECK-ALIGNED-RV32-NEXT:slli a5, a5, 16
+; CHECK-ALIGNED-RV32-NEXT:slli a6, a6, 24
+; CHECK-ALIGNED-RV32-NEXT:or a4, a6, a5
+; CHECK-ALIGNED-RV32-NEXT:or a3, a4, a3
+; CHECK-ALIGNED-RV32-NEXT:lbu a4, 5(a1)
+; CHECK-ALIGNED-RV32-NEXT

[llvm-branch-commits] [llvm] release/19.x: [DAGCombiner] cache negative result from getMergeStoreCandidates() (#106949) (PR #108397)

2024-09-12 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/108397

[llvm-branch-commits] [clang] 8290ce0 - [Clang] Fix handling of placeholder variables name in init captures (#107055)

2024-09-12 Thread Corentin Jabot via llvm-branch-commits


Author: cor3ntin
Date: 2024-09-12T12:41:44+02:00
New Revision: 8290ce0998788b6a575ed7b4988b093f48c25b3d

URL: 
https://github.com/llvm/llvm-project/commit/8290ce0998788b6a575ed7b4988b093f48c25b3d
DIFF: 
https://github.com/llvm/llvm-project/commit/8290ce0998788b6a575ed7b4988b093f48c25b3d.diff

LOG: [Clang] Fix handling of placeholder variables name in init captures 
(#107055)

We were incorrectly not deduplicating results when looking up `_` which,
for a lambda init capture, would result in an ambiguous lookup.

The same bug caused some diagnostic notes to be emitted twice.

Fixes #107024

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
clang/lib/Sema/SemaLambda.cpp
clang/lib/Sema/SemaLookup.cpp
clang/test/SemaCXX/cxx2c-placeholder-vars.cpp

Removed: 




diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 53d819c6c44574..8c7a6ba70acd28 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1122,6 +1122,7 @@ Bug Fixes to C++ Support
 - Fixed a crash-on-invalid bug involving extraneous template parameter with concept substitution. (#GH73885)
 - Fixed assertion failure by skipping the analysis of an invalid field declaration. (#GH99868)
 - Fix an issue with dependent source location expressions (#GH106428), (#GH81155), (#GH80210), (#GH85373)
+- Fix handling of ``_`` as the name of a lambda's init capture variable. (#GH107024)
 
 
 Bug Fixes to AST Handling

diff --git a/clang/lib/Sema/SemaLambda.cpp b/clang/lib/Sema/SemaLambda.cpp
index 601077e9f3334d..809b94bb7412b9 100644
--- a/clang/lib/Sema/SemaLambda.cpp
+++ b/clang/lib/Sema/SemaLambda.cpp
@@ -1318,7 +1318,6 @@ void Sema::ActOnLambdaExpressionAfterIntroducer(LambdaIntroducer &Intro,
 
 if (C->Init.isUsable()) {
   addInitCapture(LSI, cast<VarDecl>(Var), C->Kind == LCK_ByRef);
-  PushOnScopeChains(Var, CurScope, false);
 } else {
   TryCaptureKind Kind = C->Kind == LCK_ByRef ? TryCapture_ExplicitByRef
  : TryCapture_ExplicitByVal;

diff --git a/clang/lib/Sema/SemaLookup.cpp b/clang/lib/Sema/SemaLookup.cpp
index 7a6a64529f52ec..d3d4bf27ae7283 100644
--- a/clang/lib/Sema/SemaLookup.cpp
+++ b/clang/lib/Sema/SemaLookup.cpp
@@ -570,7 +570,7 @@ void LookupResult::resolveKind() {
 
 // For non-type declarations, check for a prior lookup result naming this
 // canonical declaration.
-if (!D->isPlaceholderVar(getSema().getLangOpts()) && !ExistingI) {
+if (!ExistingI) {
   auto UniqueResult = Unique.insert(std::make_pair(D, I));
   if (!UniqueResult.second) {
 // We've seen this entity before.

diff --git a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
index 5cf66b48784e91..29ca3b5ef3df72 100644
--- a/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
+++ b/clang/test/SemaCXX/cxx2c-placeholder-vars.cpp
@@ -50,14 +50,16 @@ void f() {
 
 void lambda() {
 (void)[_ = 0, _ = 1] { // expected-warning {{placeholder variables are incompatible with C++ standards before C++2c}} \
-   // expected-note 4{{placeholder declared here}}
+   // expected-note 2{{placeholder declared here}}
 (void)_++; // expected-error {{ambiguous reference to placeholder '_', which is defined multiple times}}
 };
 
 {
 int _ = 12;
-(void)[_ = 0]{}; // no warning (different scope)
+(void)[_ = 0]{ return _;}; // no warning (different scope)
 }
+
+auto GH107024 = [_ = 42]() { return _; }();
 }
 
 namespace global_var {




[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru closed https://github.com/llvm/llvm-project/pull/107214

[llvm-branch-commits] [clang] [Clang] Fix handling of placeholder variables name in init captures (#107055) (PR #107214)

2024-09-12 Thread via llvm-branch-commits


github-actions[bot] wrote:

@cor3ntin (or anyone else): if you would like to add a note about this fix to 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/107214

[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru updated https://github.com/llvm/llvm-project/pull/108422

>From 32a8b56bbf0a3c7678d44ba690427915446a9a72 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Thu, 12 Sep 2024 09:50:57 -0700
Subject: [PATCH] workflows/release-binaries: Fix automatic upload (#107315)

(cherry picked from commit ab96409180aaad5417030f06a386253722a99d71)
---
 .github/workflows/release-binaries.yml | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/release-binaries.yml 
b/.github/workflows/release-binaries.yml
index 509016e5b89c45..fcd371d49e6c91 100644
--- a/.github/workflows/release-binaries.yml
+++ b/.github/workflows/release-binaries.yml
@@ -450,11 +450,22 @@ jobs:
         name: ${{ needs.prepare.outputs.release-binary-filename }}-attestation
         path: ${{ needs.prepare.outputs.release-binary-filename }}.jsonl
 
+    - name: Checkout Release Scripts
+      uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+      with:
+        sparse-checkout: |
+          llvm/utils/release/github-upload-release.py
+          llvm/utils/git/requirements.txt
+        sparse-checkout-cone-mode: false
+
+    - name: Install Python Requirements
+      run: |
+        pip install --require-hashes -r ./llvm/utils/git/requirements.txt
+
     - name: Upload Release
       shell: bash
       run: |
-        sudo apt install python3-github
-        ./llvm-project/llvm/utils/release/github-upload-release.py \
+        ./llvm/utils/release/github-upload-release.py \
         --token ${{ github.token }} \
         --release ${{ needs.prepare.outputs.release-version }} \
         upload \


[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru closed https://github.com/llvm/llvm-project/pull/108422

[llvm-branch-commits] [llvm] release/19.x: workflows/release-binaries: Fix automatic upload (#107315) (PR #108422)

2024-09-12 Thread via llvm-branch-commits


github-actions[bot] wrote:

@tstellar (or anyone else): if you would like to add a note about this fix in 
the release notes (completely optional), please reply to this comment with a 
one- or two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/108422

[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104747

>From e475814473c5990a1409e24d4ecd56ce01546fd0 Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Thu, 15 Aug 2024 07:21:10 -0700
Subject: [PATCH 1/2] [SLP][NFC]Add a test with incorrect minbitwidth analysis
 for reduced operands

(cherry picked from commit 65ac12d3c9877ecf5b97552364e7eead887d94eb)
---
 .../X86/operand-is-reduced-val.ll | 46 +++
 1 file changed, 46 insertions(+)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll

diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
new file mode 100644
index 00..5fb93e27539d8e
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
@@ -0,0 +1,46 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s
+
+define i64 @src(i32 %a) {
+; CHECK-LABEL: define i64 @src(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64
+; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0
+; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:[[TMP3:%.*]] = add <4 x i32> [[TMP2]], 
+; CHECK-NEXT:[[TMP4:%.*]] = sext <4 x i32> [[TMP3]] to <4 x i64>
+; CHECK-NEXT:[[TMP5:%.*]] = and <4 x i32> [[TMP3]], 
+; CHECK-NEXT:[[TMP6:%.*]] = zext <4 x i32> [[TMP5]] to <4 x i64>
+; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]])
+; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]])
+; CHECK-NEXT:[[TMP19:%.*]] = add i64 [[TMP18]], [[TMP16]]
+; CHECK-NEXT:[[OP_RDX1:%.*]] = add i64 [[TMP19]], 4294967297
+; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[OP_RDX1]], [[TMP17]]
+; CHECK-NEXT:ret i64 [[TMP21]]
+;
+entry:
+  %0 = sext i32 %a to i64
+  %1 = add nsw i64 %0, 4294967297
+  %2 = sext i32 %a to i64
+  %3 = add nsw i64 %2, 4294967297
+  %4 = add i64 %3, %1
+  %5 = and i64 %3, 1
+  %6 = add i64 %4, %5
+  %7 = sext i32 %a to i64
+  %8 = add nsw i64 %7, 4294967297
+  %9 = add i64 %8, %6
+  %10 = and i64 %8, 1
+  %11 = add i64 %9, %10
+  %12 = sext i32 %a to i64
+  %13 = add nsw i64 %12, 4294967297
+  %14 = add i64 %13, %11
+  %15 = and i64 %13, 1
+  %16 = add i64 %14, %15
+  %17 = sext i32 %a to i64
+  %18 = add nsw i64 %17, 4294967297
+  %19 = add i64 %18, %16
+  %20 = and i64 %18, 1
+  %21 = add i64 %19, %20
+  ret i64 %21
+}

>From a6a1f2ba8cc54e674e0a9f9790c9f226b9cd6a2b Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Thu, 15 Aug 2024 07:57:37 -0700
Subject: [PATCH 2/2] [SLP]Fix PR104422: Wrong value truncation

The minbitwidth restrictions can be skipped only for immediate reduced
values; for other nodes, we still need to check whether external users
allow bitwidth reduction.

Fixes https://github.com/llvm/llvm-project/issues/104422

(cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2)
---
 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp |  3 ++-
 .../SLPVectorizer/X86/operand-is-reduced-val.ll | 17 ++---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 2f3d6b27378aee..ab2b96cdc42db8 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote(
   if (any_of(E.Scalars, [&](Value *V) {
 return !all_of(V->users(), [=](User *U) {
   return getTreeEntry(U) ||
- (UserIgnoreList && UserIgnoreList->contains(U)) ||
+ (E.Idx == 0 && UserIgnoreList &&
+  UserIgnoreList->contains(U)) ||
  (!isa(U) && U->getType()->isSized() &&
   !U->getType()->isScalableTy() &&
   DL->getTypeSizeInBits(U->getType()) <= BitWidth);
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
index 5fb93e27539d8e..5fcac3fbf3bafe 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
@@ -8,15 +8,18 @@ define i64 @src(i32 %a) {
 ; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64
 ; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0
 ; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer
-; CHECK-NEXT:[[TMP3:%.*]] = add <4 x i32> [[TMP2]], 
-; CHECK-NEXT:[[TMP4:%.*]] = sext <4 x i32> [[TMP3]] to <4 x i

[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru updated https://github.com/llvm/llvm-project/pull/104747

>From 373180b440d04dc3cc0f6111b06684d18779d7c8 Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Thu, 15 Aug 2024 07:21:10 -0700
Subject: [PATCH] [SLP]Fix PR104422: Wrong value truncation

The minbitwidth restrictions can be skipped only for immediate reduced
values; for other nodes, we still need to check whether external users
allow bitwidth reduction.

Fixes https://github.com/llvm/llvm-project/issues/104422

(cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2)
---
 .../Transforms/Vectorize/SLPVectorizer.cpp|  3 +-
 .../X86/operand-is-reduced-val.ll | 49 +++
 2 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 2f3d6b27378aee..ab2b96cdc42db8 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote(
   if (any_of(E.Scalars, [&](Value *V) {
 return !all_of(V->users(), [=](User *U) {
   return getTreeEntry(U) ||
- (UserIgnoreList && UserIgnoreList->contains(U)) ||
+ (E.Idx == 0 && UserIgnoreList &&
+  UserIgnoreList->contains(U)) ||
  (!isa(U) && U->getType()->isSized() &&
   !U->getType()->isScalableTy() &&
   DL->getTypeSizeInBits(U->getType()) <= BitWidth);
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
new file mode 100644
index 00..5fcac3fbf3bafe
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s
+
+define i64 @src(i32 %a) {
+; CHECK-LABEL: define i64 @src(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64
+; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0
+; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:[[TMP3:%.*]] = sext <4 x i32> [[TMP2]] to <4 x i64>
+; CHECK-NEXT:[[TMP4:%.*]] = add nsw <4 x i64> [[TMP3]], 
+; CHECK-NEXT:[[TMP6:%.*]] = and <4 x i64> [[TMP4]], 
+; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]])
+; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]])
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <2 x i64> poison, i64 [[TMP16]], i32 0
+; CHECK-NEXT:[[TMP9:%.*]] = insertelement <2 x i64> [[TMP8]], i64 [[TMP18]], i32 1
+; CHECK-NEXT:[[TMP10:%.*]] = insertelement <2 x i64> , i64 [[TMP17]], i32 0
+; CHECK-NEXT:[[TMP11:%.*]] = add <2 x i64> [[TMP9]], [[TMP10]]
+; CHECK-NEXT:[[TMP12:%.*]] = extractelement <2 x i64> [[TMP11]], i32 0
+; CHECK-NEXT:[[TMP13:%.*]] = extractelement <2 x i64> [[TMP11]], i32 1
+; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[TMP12]], [[TMP13]]
+; CHECK-NEXT:ret i64 [[TMP21]]
+;
+entry:
+  %0 = sext i32 %a to i64
+  %1 = add nsw i64 %0, 4294967297
+  %2 = sext i32 %a to i64
+  %3 = add nsw i64 %2, 4294967297
+  %4 = add i64 %3, %1
+  %5 = and i64 %3, 1
+  %6 = add i64 %4, %5
+  %7 = sext i32 %a to i64
+  %8 = add nsw i64 %7, 4294967297
+  %9 = add i64 %8, %6
+  %10 = and i64 %8, 1
+  %11 = add i64 %9, %10
+  %12 = sext i32 %a to i64
+  %13 = add nsw i64 %12, 4294967297
+  %14 = add i64 %13, %11
+  %15 = and i64 %13, 1
+  %16 = add i64 %14, %15
+  %17 = sext i32 %a to i64
+  %18 = add nsw i64 %17, 4294967297
+  %19 = add i64 %18, %16
+  %20 = and i64 %18, 1
+  %21 = add i64 %19, %20
+  ret i64 %21
+}


[llvm-branch-commits] [llvm] 373180b - [SLP]Fix PR104422: Wrong value truncation

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


Author: Alexey Bataev
Date: 2024-09-13T07:58:38+02:00
New Revision: 373180b440d04dc3cc0f6111b06684d18779d7c8

URL: 
https://github.com/llvm/llvm-project/commit/373180b440d04dc3cc0f6111b06684d18779d7c8
DIFF: 
https://github.com/llvm/llvm-project/commit/373180b440d04dc3cc0f6111b06684d18779d7c8.diff

LOG: [SLP]Fix PR104422: Wrong value truncation

The minbitwidth restrictions can be skipped only for immediate reduced
values; for other nodes, we still need to check whether external users
allow bitwidth reduction.

Fixes https://github.com/llvm/llvm-project/issues/104422

(cherry picked from commit 56140a8258a3498cfcd9f0f05c182457d43cbfd2)

Added: 
llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll

Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 2f3d6b27378aee..ab2b96cdc42db8 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -15211,7 +15211,8 @@ bool BoUpSLP::collectValuesToDemote(
   if (any_of(E.Scalars, [&](Value *V) {
 return !all_of(V->users(), [=](User *U) {
   return getTreeEntry(U) ||
- (UserIgnoreList && UserIgnoreList->contains(U)) ||
+ (E.Idx == 0 && UserIgnoreList &&
+  UserIgnoreList->contains(U)) ||
  (!isa(U) && U->getType()->isSized() &&
   !U->getType()->isScalableTy() &&
   DL->getTypeSizeInBits(U->getType()) <= BitWidth);

diff  --git a/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
new file mode 100644
index 00..5fcac3fbf3bafe
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/operand-is-reduced-val.ll
@@ -0,0 +1,49 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-unknown-linux < %s -slp-threshold=-10 | FileCheck %s
+
+define i64 @src(i32 %a) {
+; CHECK-LABEL: define i64 @src(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:[[TMP17:%.*]] = sext i32 [[A]] to i64
+; CHECK-NEXT:[[TMP1:%.*]] = insertelement <4 x i32> poison, i32 [[A]], i32 0
+; CHECK-NEXT:[[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> zeroinitializer
+; CHECK-NEXT:[[TMP3:%.*]] = sext <4 x i32> [[TMP2]] to <4 x i64>
+; CHECK-NEXT:[[TMP4:%.*]] = add nsw <4 x i64> [[TMP3]], 
+; CHECK-NEXT:[[TMP6:%.*]] = and <4 x i64> [[TMP4]], 
+; CHECK-NEXT:[[TMP18:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP6]])
+; CHECK-NEXT:[[TMP16:%.*]] = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> [[TMP4]])
+; CHECK-NEXT:[[TMP8:%.*]] = insertelement <2 x i64> poison, i64 [[TMP16]], i32 0
+; CHECK-NEXT:[[TMP9:%.*]] = insertelement <2 x i64> [[TMP8]], i64 [[TMP18]], i32 1
+; CHECK-NEXT:[[TMP10:%.*]] = insertelement <2 x i64> , i64 [[TMP17]], i32 0
+; CHECK-NEXT:[[TMP11:%.*]] = add <2 x i64> [[TMP9]], [[TMP10]]
+; CHECK-NEXT:[[TMP12:%.*]] = extractelement <2 x i64> [[TMP11]], i32 0
+; CHECK-NEXT:[[TMP13:%.*]] = extractelement <2 x i64> [[TMP11]], i32 1
+; CHECK-NEXT:[[TMP21:%.*]] = add i64 [[TMP12]], [[TMP13]]
+; CHECK-NEXT:ret i64 [[TMP21]]
+;
+entry:
+  %0 = sext i32 %a to i64
+  %1 = add nsw i64 %0, 4294967297
+  %2 = sext i32 %a to i64
+  %3 = add nsw i64 %2, 4294967297
+  %4 = add i64 %3, %1
+  %5 = and i64 %3, 1
+  %6 = add i64 %4, %5
+  %7 = sext i32 %a to i64
+  %8 = add nsw i64 %7, 4294967297
+  %9 = add i64 %8, %6
+  %10 = and i64 %8, 1
+  %11 = add i64 %9, %10
+  %12 = sext i32 %a to i64
+  %13 = add nsw i64 %12, 4294967297
+  %14 = add i64 %13, %11
+  %15 = and i64 %13, 1
+  %16 = add i64 %14, %15
+  %17 = sext i32 %a to i64
+  %18 = add nsw i64 %17, 4294967297
+  %19 = add i64 %18, %16
+  %20 = and i64 %18, 1
+  %21 = add i64 %19, %20
+  ret i64 %21
+}
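
The log message says bitwidth reduction is only safe when every external user tolerates it. A minimal C++ sketch of why that matters for this test, not taken from the patch: the constant 4294967297 is 2^32 + 1, so an add performed in 32 bits keeps only its low bits. The function names below are hypothetical, and `add_then_widen` is an assumed illustration of a narrowed add, since the vector constants in the CHECK lines are not preserved in this archive.

```cpp
#include <cstdint>

// Constant used by the test: 2^32 + 1, which needs more than 32 bits.
constexpr std::int64_t K = 4294967297LL;

// One term as the scalar IR computes it: sign-extend first, then add in i64.
constexpr std::int64_t widen_then_add(std::int32_t a) {
  return static_cast<std::int64_t>(a) + K;
}

// Hypothetical narrowed form (an assumption for illustration): add in 32 bits
// with wraparound, then sign-extend. Only the low 32 bits of K survive.
constexpr std::int64_t add_then_widen(std::int32_t a) {
  return static_cast<std::int64_t>(static_cast<std::int32_t>(
      static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(K)));
}

// Scalar reference for @src: five copies of the term plus four copies of its
// lowest bit, matching the chain of add/and instructions in the test body.
constexpr std::int64_t src(std::int32_t a) {
  return 5 * widen_then_add(a) + 4 * (widen_then_add(a) & 1);
}
```

For `a = 1`, `widen_then_add` yields 4294967298 while `add_then_widen` yields 2, so any user that consumes the full 64-bit value sees a different result once the add is demoted.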




[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru closed https://github.com/llvm/llvm-project/pull/104747

[llvm-branch-commits] [llvm] release/19.x: [SLP]Fix PR104422: Wrong value truncation (PR #104747)

2024-09-12 Thread via llvm-branch-commits


github-actions[bot] wrote:

@nikic (or anyone else): if you would like to add a note about this fix in the 
release notes (completely optional), please reply to this comment with a one- or 
two-sentence description of the fix. When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/104747

[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


https://github.com/tru closed https://github.com/llvm/llvm-project/pull/106008
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


tru wrote:

This will have to wait for LLVM 20. I know it's not optimal for Zig, but 
getting it in this late, with it being ABI-breaking, is tricky.

https://github.com/llvm/llvm-project/pull/106008

[llvm-branch-commits] [clang] release/19.x: [Clang][Concepts] Fix the constraint equivalence checking involving parameter packs (#102131) (PR #106043)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


tru wrote:

Can this be reviewed, @cor3ntin @mizvekov?

https://github.com/llvm/llvm-project/pull/106043

[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


tru wrote:


Hi, since we are wrapping up LLVM 19.1.0, we are very strict with the fixes we 
pick at this point. Can you please respond to the following questions to help 
me understand whether this has to be included in the final release?

Is this PR a fix for a regression or a critical issue?

What is the risk of accepting this into the release branch?

What is the risk of NOT accepting this into the release branch?



https://github.com/llvm/llvm-project/pull/106993

[llvm-branch-commits] [llvm] release/19.x: [LoongArch] Eliminate the redundant sign extension of division (#107971) (PR #107990)

2024-09-12 Thread Tobias Hieta via llvm-branch-commits


tru wrote:

Just re-run the cherry-pick comment on the updated SHA.

https://github.com/llvm/llvm-project/pull/107990
