[llvm-branch-commits] [mlir] 7af3f6e - Revert "[mlir][SCF] Allow using a custom operation to generate loops with `ml…"

2025-09-18 Thread via llvm-branch-commits

Author: MaheshRavishankar
Date: 2025-09-18T09:29:29-07:00
New Revision: 7af3f6e0317e84900e6683ac0ea3dc60b805904e

URL: 
https://github.com/llvm/llvm-project/commit/7af3f6e0317e84900e6683ac0ea3dc60b805904e
DIFF: 
https://github.com/llvm/llvm-project/commit/7af3f6e0317e84900e6683ac0ea3dc60b805904e.diff

LOG: Revert "[mlir][SCF] Allow using a custom operation to generate loops with 
`ml…"

This reverts commit b8649098a7fcf598406d8d8b7d68891d1444e9c8.

Added: 


Modified: 
mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
mlir/lib/Dialect/SCF/Transforms/TileUsingInterface.cpp
mlir/test/lib/Interfaces/TilingInterface/TestTilingInterfaceTransformOps.cpp
mlir/test/lib/Interfaces/TilingInterface/TestTilingInterfaceTransformOps.td

Removed: 
mlir/test/Interfaces/TilingInterface/tile-using-custom-op.mlir



diff --git a/mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h 
b/mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
index 6b05ade37881c..3205da6e448fc 100644
--- a/mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
+++ b/mlir/include/mlir/Dialect/SCF/Transforms/TileUsingInterface.h
@@ -33,14 +33,6 @@ using SCFTileSizeComputationFunction =
 
 /// Options to use to control tiling.
 struct SCFTilingOptions {
-  /// Specify which loop construct to use for tile and fuse.
-  enum class LoopType { ForOp, ForallOp, CustomOp };
-  LoopType loopType = LoopType::ForOp;
-  SCFTilingOptions &setLoopType(LoopType type) {
-loopType = type;
-return *this;
-  }
-
   /// Computation function that returns the tile sizes to use for each loop.
   /// Returning a tile size of zero implies no tiling for that loop. If the
   /// size of the returned vector is smaller than the number of loops, the 
inner
@@ -58,17 +50,6 @@ struct SCFTilingOptions {
   /// proper interaction with folding.
   SCFTilingOptions &setTileSizes(ArrayRef tileSizes);
 
-  /// The interchange vector to reorder the tiled loops.
-  SmallVector interchangeVector = {};
-  SCFTilingOptions &setInterchange(ArrayRef interchange) {
-interchangeVector = llvm::to_vector(interchange);
-return *this;
-  }
-
-  //-//
-  // Options related to tiling using `scf.forall`.
-  //-//
-
   /// Computation function that returns the number of threads to use for
   /// each loop. Returning a num threads of zero implies no tiling for that
   /// loop. If the size of the returned vector is smaller than the number of
@@ -89,6 +70,21 @@ struct SCFTilingOptions {
   /// function that computes num threads at the point they are needed.
   SCFTilingOptions &setNumThreads(ArrayRef numThreads);
 
+  /// The interchange vector to reorder the tiled loops.
+  SmallVector interchangeVector = {};
+  SCFTilingOptions &setInterchange(ArrayRef interchange) {
+interchangeVector = llvm::to_vector(interchange);
+return *this;
+  }
+
+  /// Specify which loop construct to use for tile and fuse.
+  enum class LoopType { ForOp, ForallOp };
+  LoopType loopType = LoopType::ForOp;
+  SCFTilingOptions &setLoopType(LoopType type) {
+loopType = type;
+return *this;
+  }
+
   /// Specify mapping of loops to devices. This is only respected when the loop
   /// constructs support such a mapping (like `scf.forall`). Will be ignored
   /// when using loop constructs that dont support such a mapping (like
@@ -121,98 +117,6 @@ struct SCFTilingOptions {
 reductionDims.insert(dims.begin(), dims.end());
 return *this;
   }
-
-  //-//
-  // Options related to tiling using custom loop.
-  //-//
-
-  // For generating the inter-tile loops using a custom loop, two callback
-  // functions are needed
-  // 1. That generates the "loop header", i.e. the loop that iterates over the
-  // different tiles.
-  // 2. That generates the loop terminator
-  //
-  // For `scf.forall` case the call back to generate loop header would generate
-  //
-  // ```mlir
-  // scf.forall (...) = ... {
-  //   ..
-  // }
-  // ```
-  //
-  // and the call back to generate the loop terminator would generate the
-  // `scf.in_parallel` region
-  //
-  // ```mlir
-  // scf.forall (...) = ... {
-  //   scf.in_parallel {
-  //  tensor.parallel_insert_slice ...
-  //   }
-  // }
-  // ```
-  //
-
-  // Information that is to be returned by the callback to generate the loop
-  // header needed for the rest of the tiled codegeneration.
-  // - `loops`: The generated loops
-  // - `tileOffset`: The values that represent the offset of the iteration 
space
-  // tile
-  // - `tileSizes` : The values that represent the size of the iteration space
-  // tile.
-  // - `destinationTensor

[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #150937)

2025-09-18 Thread Valery Pykhtin via llvm-branch-commits

https://github.com/vpykhtin updated 
https://github.com/llvm/llvm-project/pull/150937

>From ae3589e2c93351349cd1bbb5586c2dfcb075ea68 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin 
Date: Thu, 10 Apr 2025 11:58:13 +
Subject: [PATCH] amdgpu_use_ssaupdaterbulk_in_structurizecfg

---
 llvm/lib/Transforms/Scalar/StructurizeCFG.cpp | 25 +++
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp 
b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 2ee91a9b40026..0f3978f56045e 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/SSAUpdater.h"
+#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
 #include 
 #include 
 
@@ -321,7 +322,7 @@ class StructurizeCFG {
 
   void collectInfos();
 
-  void insertConditions(bool Loops);
+  void insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter);
 
   void simplifyConditions();
 
@@ -671,10 +672,9 @@ void StructurizeCFG::collectInfos() {
 }
 
 /// Insert the missing branch conditions
-void StructurizeCFG::insertConditions(bool Loops) {
+void StructurizeCFG::insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter) 
{
   BranchVector &Conds = Loops ? LoopConds : Conditions;
   Value *Default = Loops ? BoolTrue : BoolFalse;
-  SSAUpdater PhiInserter;
 
   for (BranchInst *Term : Conds) {
 assert(Term->isConditional());
@@ -683,8 +683,9 @@ void StructurizeCFG::insertConditions(bool Loops) {
 BasicBlock *SuccTrue = Term->getSuccessor(0);
 BasicBlock *SuccFalse = Term->getSuccessor(1);
 
-PhiInserter.Initialize(Boolean, "");
-PhiInserter.AddAvailableValue(Loops ? SuccFalse : Parent, Default);
+unsigned Variable = PhiInserter.AddVariable("", Boolean);
+PhiInserter.AddAvailableValue(Variable, Loops ? SuccFalse : Parent,
+  Default);
 
 BBPredicates &Preds = Loops ? LoopPreds[SuccFalse] : Predicates[SuccTrue];
 
@@ -697,7 +698,7 @@ void StructurizeCFG::insertConditions(bool Loops) {
 ParentInfo = PI;
 break;
   }
-  PhiInserter.AddAvailableValue(BB, PI.Pred);
+  PhiInserter.AddAvailableValue(Variable, BB, PI.Pred);
   Dominator.addAndRememberBlock(BB);
 }
 
@@ -706,9 +707,9 @@ void StructurizeCFG::insertConditions(bool Loops) {
   CondBranchWeights::setMetadata(*Term, ParentInfo.Weights);
 } else {
   if (!Dominator.resultIsRememberedBlock())
-PhiInserter.AddAvailableValue(Dominator.result(), Default);
+PhiInserter.AddAvailableValue(Variable, Dominator.result(), Default);
 
-  Term->setCondition(PhiInserter.GetValueInMiddleOfBlock(Parent));
+  PhiInserter.AddUse(Variable, &Term->getOperandUse(0));
 }
   }
 }
@@ -1414,8 +1415,12 @@ bool StructurizeCFG::run(Region *R, DominatorTree *DT,
   orderNodes();
   collectInfos();
   createFlow();
-  insertConditions(false);
-  insertConditions(true);
+
+  SSAUpdaterBulk PhiInserter;
+  insertConditions(false, PhiInserter);
+  insertConditions(true, PhiInserter);
+  PhiInserter.RewriteAndOptimizeAllUses(*DT);
+
   setPhiValues();
   simplifyHoistedPhis();
   simplifyConditions();

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Offload] `olGetMemInfo` (PR #157651)

2025-09-18 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton converted_to_draft 
https://github.com/llvm/llvm-project/pull/157651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [DirectX] Validating Root flags are denying shader stage (PR #153287)

2025-09-18 Thread via llvm-branch-commits

https://github.com/joaosaffran updated 
https://github.com/llvm/llvm-project/pull/153287

>From b1e34ff07fffe96fec438b87027bd2c450b6b36f Mon Sep 17 00:00:00 2001
From: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Date: Tue, 12 Aug 2025 13:07:42 -0700
Subject: [PATCH 01/24] adding validation and tests

---
 .../DXILPostOptimizationValidation.cpp| 95 ++-
 .../rootsignature-validation-deny-shader.ll   | 16 
 ...re-validation-fail-deny-multiple-shader.ll | 17 
 ...ture-validation-fail-deny-single-shader.ll | 17 
 4 files changed, 122 insertions(+), 23 deletions(-)
 create mode 100644 
llvm/test/CodeGen/DirectX/rootsignature-validation-deny-shader.ll
 create mode 100644 
llvm/test/CodeGen/DirectX/rootsignature-validation-fail-deny-multiple-shader.ll
 create mode 100644 
llvm/test/CodeGen/DirectX/rootsignature-validation-fail-deny-single-shader.ll

diff --git a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp 
b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
index 3721b5f539b8c..251f4a0daf43a 100644
--- a/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
+++ b/llvm/lib/Target/DirectX/DXILPostOptimizationValidation.cpp
@@ -21,6 +21,7 @@
 #include "llvm/InitializePasses.h"
 #include "llvm/MC/DXContainerRootSignature.h"
 #include "llvm/Support/DXILABI.h"
+#include "llvm/TargetParser/Triple.h"
 #include 
 
 #define DEBUG_TYPE "dxil-post-optimization-validation"
@@ -169,15 +170,16 @@ reportDescriptorTableMixingTypes(Module &M, uint32_t 
Location,
   M.getContext().diagnose(DiagnosticInfoGeneric(Message));
 }
 
-static void reportOverlowingRange(Module &M, const 
dxbc::RTS0::v2::DescriptorRange &Range) {
+static void
+reportOverlowingRange(Module &M, const dxbc::RTS0::v2::DescriptorRange &Range) 
{
   SmallString<128> Message;
   raw_svector_ostream OS(Message);
-  OS << "Cannot append range with implicit lower " 
-  << "bound after an unbounded range "
-  << 
getResourceClassName(toResourceClass(static_cast(Range.RangeType)))
-  << "(register=" << Range.BaseShaderRegister << ", space=" << 
-  Range.RegisterSpace
-  << ") exceeds maximum allowed value.";
+  OS << "Cannot append range with implicit lower "
+ << "bound after an unbounded range "
+ << getResourceClassName(toResourceClass(
+static_cast(Range.RangeType)))
+ << "(register=" << Range.BaseShaderRegister
+ << ", space=" << Range.RegisterSpace << ") exceeds maximum allowed 
value.";
   M.getContext().diagnose(DiagnosticInfoGeneric(Message));
 }
 
@@ -262,12 +264,57 @@ getRootDescriptorsBindingInfo(const 
mcdxbc::RootSignatureDesc &RSD,
   return RDs;
 }
 
+static void reportIfDeniedShaderStageAccess(Module &M, dxbc::RootFlags Flags,
+dxbc::RootFlags Mask) {
+  if ((Flags & Mask) == Mask) {
+SmallString<128> Message;
+raw_svector_ostream OS(Message);
+OS << "Shader has root bindings but root signature uses a DENY flag to "
+  "disallow root binding access to the shader stage.";
+M.getContext().diagnose(DiagnosticInfoGeneric(Message));
+  }
+}
+
+static void validateRootFlags(Module &M, const mcdxbc::RootSignatureDesc &RSD,
+  const dxil::ModuleMetadataInfo &MMI) {
+  dxbc::RootFlags Flags = dxbc::RootFlags(RSD.Flags);
 
+  switch (MMI.ShaderProfile) {
+  case Triple::Pixel:
+reportIfDeniedShaderStageAccess(M, Flags,
+
dxbc::RootFlags::DenyPixelShaderRootAccess);
+break;
+  case Triple::Vertex:
+reportIfDeniedShaderStageAccess(
+M, Flags, dxbc::RootFlags::DenyVertexShaderRootAccess);
+break;
+  case Triple::Geometry:
+reportIfDeniedShaderStageAccess(
+M, Flags, dxbc::RootFlags::DenyGeometryShaderRootAccess);
+break;
+  case Triple::Hull:
+reportIfDeniedShaderStageAccess(M, Flags,
+dxbc::RootFlags::DenyHullShaderRootAccess);
+break;
+  case Triple::Domain:
+reportIfDeniedShaderStageAccess(
+M, Flags, dxbc::RootFlags::DenyDomainShaderRootAccess);
+break;
+  case Triple::Mesh:
+reportIfDeniedShaderStageAccess(M, Flags,
+dxbc::RootFlags::DenyMeshShaderRootAccess);
+break;
+  case Triple::Amplification:
+reportIfDeniedShaderStageAccess(
+M, Flags, dxbc::RootFlags::DenyAmplificationShaderRootAccess);
+break;
+  default:
+break;
+  }
+}
 
 static void validateDescriptorTables(Module &M,
- const mcdxbc::RootSignatureDesc &RSD,
- dxil::ModuleMetadataInfo &MMI,
- DXILResourceMap &DRM) {
+ const mcdxbc::RootSignatureDesc &RSD) {
   for (const mcdxbc::RootParameterInfo &ParamInfo : RSD.ParametersContainer) {
 if (static_cast(ParamInfo.Header.ParameterType) !=
 dxbc::RootParameterType::DescriptorTable)
@@ -2

[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Mehdi Amini via llvm-branch-commits


@@ -47,74 +47,61 @@ static func::FuncOp getOrDeclare(fir::FirOpBuilder 
&builder, Location loc,
   return func;
 }
 
-static bool isZero(Value v) {
-  if (auto cst = v.getDefiningOp())
-if (auto attr = dyn_cast(cst.getValue()))
-  return attr.getValue().isZero();
-  return false;
-}
-
 void ConvertComplexPowPass::runOnOperation() {
   ModuleOp mod = getOperation();
   fir::FirOpBuilder builder(mod, fir::getKindMapping(mod));
 
-  mod.walk([&](complex::PowOp op) {
+  mod.walk([&](complex::PowiOp op) {
 builder.setInsertionPoint(op);
 Location loc = op.getLoc();
 auto complexTy = cast(op.getType());
 auto elemTy = complexTy.getElementType();
-
 Value base = op.getLhs();
-Value rhs = op.getRhs();
-
-Value intExp;
-if (auto create = rhs.getDefiningOp()) {
-  if (isZero(create.getImaginary())) {
-if (auto conv = create.getReal().getDefiningOp()) {
-  if (auto intTy = dyn_cast(conv.getValue().getType()))
-intExp = conv.getValue();
-}
-  }
-}
-
+Value intExp = op.getRhs();
 func::FuncOp callee;
-SmallVector args;
-if (intExp) {
-  unsigned realBits = cast(elemTy).getWidth();
-  unsigned intBits = cast(intExp.getType()).getWidth();
-  auto funcTy = builder.getFunctionType(
-  {complexTy, builder.getIntegerType(intBits)}, {complexTy});
-  if (realBits == 32 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowi), funcTy);
-  else if (realBits == 32 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowk), funcTy);
-  else if (realBits == 64 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowi), funcTy);
-  else if (realBits == 64 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowk), funcTy);
-  else if (realBits == 128 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowi), funcTy);
-  else if (realBits == 128 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowk), funcTy);
-  else
-return;
-  args = {base, intExp};
-} else {
-  unsigned realBits = cast(elemTy).getWidth();
-  auto funcTy =
-  builder.getFunctionType({complexTy, complexTy}, {complexTy});
-  if (realBits == 32)
-callee = getOrDeclare(builder, loc, "cpowf", funcTy);
-  else if (realBits == 64)
-callee = getOrDeclare(builder, loc, "cpow", funcTy);
-  else if (realBits == 128)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(CPowF128), funcTy);
-  else
-return;
-  args = {base, rhs};
-}
+unsigned realBits = cast(elemTy).getWidth();
+unsigned intBits = cast(intExp.getType()).getWidth();
+auto funcTy = builder.getFunctionType(
+{complexTy, builder.getIntegerType(intBits)}, {complexTy});
+if (realBits == 32 && intBits == 32)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowi), funcTy);
+else if (realBits == 32 && intBits == 64)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowk), funcTy);
+else if (realBits == 64 && intBits == 32)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowi), funcTy);
+else if (realBits == 64 && intBits == 64)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowk), funcTy);
+else if (realBits == 128 && intBits == 32)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowi), funcTy);
+else if (realBits == 128 && intBits == 64)
+  callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowk), funcTy);
+else
+  return;
+auto call = fir::CallOp::create(builder, loc, callee, {base, intExp});
+if (auto fmf = op.getFastmathAttr())
+  call.setFastmathAttr(fmf);
+op.replaceAllUsesWith(call.getResult(0));
+op.erase();
+  });
 
-auto call = fir::CallOp::create(builder, loc, callee, args);
+  mod.walk([&](complex::PowOp op) {

joker-eph wrote:

We should not walk multiple times if we can do it in a single traversal, can 
you replace this with a walk on Operation* and dispatch inside the walk?

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [AllocToken, Clang] Implement TypeHashPointerSplit mode (PR #156840)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156840

>From 14c75441e84aa32e4f5876598b9a2c59d4ecbe65 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 21:32:21 +0200
Subject: [PATCH 1/2] fixup! fix for incomplete types

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGExpr.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 288b41bc42203..455de644daf00 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1289,6 +1289,7 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   // Check if QualType contains a pointer. Implements a simple DFS to
   // recursively check if a type contains a pointer type.
   llvm::SmallPtrSet VisitedRD;
+  bool IncompleteType = false;
   auto TypeContainsPtr = [&](auto &&self, QualType T) -> bool {
 QualType CanonicalType = T.getCanonicalType();
 if (CanonicalType->isPointerType())
@@ -1312,6 +1313,10 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   return self(self, AT->getElementType());
 // The type is a struct, class, or union.
 if (const RecordDecl *RD = CanonicalType->getAsRecordDecl()) {
+  if (!RD->isCompleteDefinition()) {
+IncompleteType = true;
+return false;
+  }
   if (!VisitedRD.insert(RD).second)
 return false; // already visited
   // Check all fields.
@@ -1333,6 +1338,8 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
 return false;
   };
   const bool ContainsPtr = TypeContainsPtr(TypeContainsPtr, AllocType);
+  if (!ContainsPtr && IncompleteType)
+return nullptr;
   auto *ContainsPtrC = Builder.getInt1(ContainsPtr);
   auto *ContainsPtrMD = MDB.createConstant(ContainsPtrC);
 

>From 7f706618ddc40375d4085bc2ebe03f02ec78823a Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 21:58:01 +0200
Subject: [PATCH 2/2] fixup!

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGExpr.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 455de644daf00..e7a0e7696e204 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1339,7 +1339,7 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   };
   const bool ContainsPtr = TypeContainsPtr(TypeContainsPtr, AllocType);
   if (!ContainsPtr && IncompleteType)
-return nullptr;
+return;
   auto *ContainsPtrC = Builder.getInt1(ContainsPtr);
   auto *ContainsPtrMD = MDB.createConstant(ContainsPtrC);
 

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Remarks] Restructure bitstream remarks to be fully standalone (PR #156715)

2025-09-18 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/156715

>From d33b31f01aeeb9005581b0a2a1f21c898463aa02 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Thu, 18 Sep 2025 12:34:55 +0100
Subject: [PATCH 1/3] Replace bitstream blobs by yaml

Created using spr 1.3.7-wip
---
 llvm/lib/Remarks/BitstreamRemarkParser.cpp|   5 +-
 .../dsymutil/ARM/remarks-linking-bundle.test  |  13 +-
 .../basic1.macho.remarks.arm64.opt.bitstream  | Bin 824 -> 0 bytes
 .../basic1.macho.remarks.arm64.opt.yaml   |  47 +
 ...c1.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic2.macho.remarks.arm64.opt.bitstream  | Bin 1696 -> 0 bytes
 .../basic2.macho.remarks.arm64.opt.yaml   | 194 ++
 ...c2.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic3.macho.remarks.arm64.opt.bitstream  | Bin 1500 -> 0 bytes
 .../basic3.macho.remarks.arm64.opt.yaml   | 181 
 ...c3.macho.remarks.empty.arm64.opt.bitstream |   0
 .../fat.macho.remarks.x86_64.opt.bitstream| Bin 820 -> 0 bytes
 .../remarks/fat.macho.remarks.x86_64.opt.yaml |  53 +
 .../fat.macho.remarks.x86_64h.opt.bitstream   | Bin 820 -> 0 bytes
 .../fat.macho.remarks.x86_64h.opt.yaml|  53 +
 .../X86/remarks-linking-fat-bundle.test   |   8 +-
 16 files changed, 543 insertions(+), 11 deletions(-)
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.yaml

diff --git a/llvm/lib/Remarks/BitstreamRemarkParser.cpp 
b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
index 63b16bd2df0ec..2b27a0f661d88 100644
--- a/llvm/lib/Remarks/BitstreamRemarkParser.cpp
+++ b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
@@ -411,9 +411,8 @@ Error BitstreamRemarkParser::processExternalFilePath() {
 return E;
 
   if (ContainerType != BitstreamRemarkContainerType::RemarksFile)
-return error(
-"Error while parsing external file's BLOCK_META: wrong container "
-"type.");
+return ParserHelper->MetaHelper.error(
+"Wrong container type in external file.");
 
   return Error::success();
 }
diff --git a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test 
b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
index 09a60d7d044c6..e1b04455b0d9d 100644
--- a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
+++ b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
@@ -1,22 +1,25 @@
 RUN: rm -rf %t
-RUN: mkdir -p %t
+RUN: mkdir -p %t/private/tmp/remarks
 RUN: cat %p/../Inputs/remarks/basic.macho.remarks.arm64> 
%t/basic.macho.remarks.arm64
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 
-RUN: dsymutil -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.remarks.arm64
+RUN: dsymutil -oso-prepend-path=%p/../Inputs -remarks-prepend-path=%t 
%t/basic.macho.remarks.arm64
 
 Check that the remark file in the bundle exists and is sane:
 RUN: llvm-bcanalyzer -dump 
%t/basic.macho.remarks.arm64.dSYM/Contents/Resources/Remarks/basic.macho.remarks.arm64
 | FileCheck %s
 
-RUN: dsymutil --linker parallel -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.r

[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpDirectiveSpecification in THREADPRIVATE (PR #159632)

2025-09-18 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz created 
https://github.com/llvm/llvm-project/pull/159632

Since ODS doesn't store a list of OmpObjects (i.e. not as OmpObjectList), some 
semantics-checking functions needed to be updated to operate on a single object 
at a time.

>From 7bb9fb5b3b9a2dfcd1d00f01c86fe26c5d14c30f Mon Sep 17 00:00:00 2001
From: Krzysztof Parzyszek 
Date: Thu, 18 Sep 2025 08:49:38 -0500
Subject: [PATCH] [flang][OpenMP] Use OmpDirectiveSpecification in
 THREADPRIVATE

Since ODS doesn't store a list of OmpObjects (i.e. not as OmpObjectList),
some semantics-checking functions needed to be updated to operate on a
single object at a time.
---
 flang/include/flang/Parser/openmp-utils.h|  4 +-
 flang/include/flang/Parser/parse-tree.h  |  3 +-
 flang/include/flang/Semantics/openmp-utils.h |  3 +-
 flang/lib/Parser/openmp-parsers.cpp  |  7 +-
 flang/lib/Parser/unparse.cpp |  7 +-
 flang/lib/Semantics/check-omp-structure.cpp  | 89 +++-
 flang/lib/Semantics/check-omp-structure.h|  3 +
 flang/lib/Semantics/openmp-utils.cpp | 22 +++--
 flang/lib/Semantics/resolve-directives.cpp   | 11 ++-
 9 files changed, 86 insertions(+), 63 deletions(-)

diff --git a/flang/include/flang/Parser/openmp-utils.h 
b/flang/include/flang/Parser/openmp-utils.h
index 032fb8996fe48..1372945427955 100644
--- a/flang/include/flang/Parser/openmp-utils.h
+++ b/flang/include/flang/Parser/openmp-utils.h
@@ -49,7 +49,6 @@ MAKE_CONSTR_ID(OpenMPDeclareSimdConstruct, 
D::OMPD_declare_simd);
 MAKE_CONSTR_ID(OpenMPDeclareTargetConstruct, D::OMPD_declare_target);
 MAKE_CONSTR_ID(OpenMPExecutableAllocate, D::OMPD_allocate);
 MAKE_CONSTR_ID(OpenMPRequiresConstruct, D::OMPD_requires);
-MAKE_CONSTR_ID(OpenMPThreadprivate, D::OMPD_threadprivate);
 
 #undef MAKE_CONSTR_ID
 
@@ -111,8 +110,7 @@ struct DirectiveNameScope {
   std::is_same_v ||
   std::is_same_v ||
   std::is_same_v ||
-  std::is_same_v ||
-  std::is_same_v) {
+  std::is_same_v) {
 return MakeName(std::get(x.t).source, ConstructId::id);
   } else {
 return GetFromTuple(
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 09a45476420df..8cb6d2e744876 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -5001,9 +5001,8 @@ struct OpenMPRequiresConstruct {
 
 // 2.15.2 threadprivate -> THREADPRIVATE (variable-name-list)
 struct OpenMPThreadprivate {
-  TUPLE_CLASS_BOILERPLATE(OpenMPThreadprivate);
+  WRAPPER_CLASS_BOILERPLATE(OpenMPThreadprivate, OmpDirectiveSpecification);
   CharBlock source;
-  std::tuple t;
 };
 
 // 2.11.3 allocate -> ALLOCATE (variable-name-list) [clause]
diff --git a/flang/include/flang/Semantics/openmp-utils.h 
b/flang/include/flang/Semantics/openmp-utils.h
index 68318d6093a1e..65441728c5549 100644
--- a/flang/include/flang/Semantics/openmp-utils.h
+++ b/flang/include/flang/Semantics/openmp-utils.h
@@ -58,9 +58,10 @@ const parser::DataRef *GetDataRefFromObj(const 
parser::OmpObject &object);
 const parser::ArrayElement *GetArrayElementFromObj(
 const parser::OmpObject &object);
 const Symbol *GetObjectSymbol(const parser::OmpObject &object);
-const Symbol *GetArgumentSymbol(const parser::OmpArgument &argument);
 std::optional GetObjectSource(
 const parser::OmpObject &object);
+const Symbol *GetArgumentSymbol(const parser::OmpArgument &argument);
+const parser::OmpObject *GetArgumentObject(const parser::OmpArgument 
&argument);
 
 bool IsCommonBlock(const Symbol &sym);
 bool IsExtendedListItem(const Symbol &sym);
diff --git a/flang/lib/Parser/openmp-parsers.cpp 
b/flang/lib/Parser/openmp-parsers.cpp
index 66526ba00b5ed..60ce71cf983f6 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -1791,8 +1791,11 @@ TYPE_PARSER(sourced(construct(
 verbatim("REQUIRES"_tok), Parser{})))
 
 // 2.15.2 Threadprivate directive
-TYPE_PARSER(sourced(construct(
-verbatim("THREADPRIVATE"_tok), parenthesized(Parser{}
+TYPE_PARSER(sourced( //
+construct(
+predicated(OmpDirectiveNameParser{},
+IsDirective(llvm::omp::Directive::OMPD_threadprivate)) >=
+Parser{})))
 
 // 2.11.3 Declarative Allocate directive
 TYPE_PARSER(
diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp
index 189a34ee1dc56..db46525ac57b1 100644
--- a/flang/lib/Parser/unparse.cpp
+++ b/flang/lib/Parser/unparse.cpp
@@ -2611,12 +2611,11 @@ class UnparseVisitor {
   }
   void Unparse(const OpenMPThreadprivate &x) {
 BeginOpenMP();
-Word("!$OMP THREADPRIVATE (");
-Walk(std::get(x.t));
-Put(")\n");
+Word("!$OMP ");
+Walk(x.v);
+Put("\n");
 EndOpenMP();
   }
-
   bool Pre(const OmpMessageClause &x) {
 Walk(x.v);
 return false;
diff --git a/flang/lib/Semantics/check-omp-structure.cpp 
b/flang/lib/Semantics/check-omp-structure.cpp
index 1ee5385fb38a1..507957df

[llvm-branch-commits] [flang] [flang][OpenMP] Use OmpDirectiveSpecification in THREADPRIVATE (PR #159632)

2025-09-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-semantics

Author: Krzysztof Parzyszek (kparzysz)


Changes

Since ODS doesn't store a list of OmpObjects (i.e. not as OmpObjectList), some 
semantics-checking functions needed to be updated to operate on a single object 
at a time.

---
Full diff: https://github.com/llvm/llvm-project/pull/159632.diff


9 Files Affected:

- (modified) flang/include/flang/Parser/openmp-utils.h (+1-3) 
- (modified) flang/include/flang/Parser/parse-tree.h (+1-2) 
- (modified) flang/include/flang/Semantics/openmp-utils.h (+2-1) 
- (modified) flang/lib/Parser/openmp-parsers.cpp (+5-2) 
- (modified) flang/lib/Parser/unparse.cpp (+3-4) 
- (modified) flang/lib/Semantics/check-omp-structure.cpp (+48-41) 
- (modified) flang/lib/Semantics/check-omp-structure.h (+3) 
- (modified) flang/lib/Semantics/openmp-utils.cpp (+15-7) 
- (modified) flang/lib/Semantics/resolve-directives.cpp (+8-3) 


``diff
diff --git a/flang/include/flang/Parser/openmp-utils.h 
b/flang/include/flang/Parser/openmp-utils.h
index 032fb8996fe48..1372945427955 100644
--- a/flang/include/flang/Parser/openmp-utils.h
+++ b/flang/include/flang/Parser/openmp-utils.h
@@ -49,7 +49,6 @@ MAKE_CONSTR_ID(OpenMPDeclareSimdConstruct, 
D::OMPD_declare_simd);
 MAKE_CONSTR_ID(OpenMPDeclareTargetConstruct, D::OMPD_declare_target);
 MAKE_CONSTR_ID(OpenMPExecutableAllocate, D::OMPD_allocate);
 MAKE_CONSTR_ID(OpenMPRequiresConstruct, D::OMPD_requires);
-MAKE_CONSTR_ID(OpenMPThreadprivate, D::OMPD_threadprivate);
 
 #undef MAKE_CONSTR_ID
 
@@ -111,8 +110,7 @@ struct DirectiveNameScope {
   std::is_same_v ||
   std::is_same_v ||
   std::is_same_v ||
-  std::is_same_v ||
-  std::is_same_v) {
+  std::is_same_v) {
 return MakeName(std::get(x.t).source, ConstructId::id);
   } else {
 return GetFromTuple(
diff --git a/flang/include/flang/Parser/parse-tree.h 
b/flang/include/flang/Parser/parse-tree.h
index 09a45476420df..8cb6d2e744876 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -5001,9 +5001,8 @@ struct OpenMPRequiresConstruct {
 
 // 2.15.2 threadprivate -> THREADPRIVATE (variable-name-list)
 struct OpenMPThreadprivate {
-  TUPLE_CLASS_BOILERPLATE(OpenMPThreadprivate);
+  WRAPPER_CLASS_BOILERPLATE(OpenMPThreadprivate, OmpDirectiveSpecification);
   CharBlock source;
-  std::tuple t;
 };
 
 // 2.11.3 allocate -> ALLOCATE (variable-name-list) [clause]
diff --git a/flang/include/flang/Semantics/openmp-utils.h 
b/flang/include/flang/Semantics/openmp-utils.h
index 68318d6093a1e..65441728c5549 100644
--- a/flang/include/flang/Semantics/openmp-utils.h
+++ b/flang/include/flang/Semantics/openmp-utils.h
@@ -58,9 +58,10 @@ const parser::DataRef *GetDataRefFromObj(const 
parser::OmpObject &object);
 const parser::ArrayElement *GetArrayElementFromObj(
 const parser::OmpObject &object);
 const Symbol *GetObjectSymbol(const parser::OmpObject &object);
-const Symbol *GetArgumentSymbol(const parser::OmpArgument &argument);
 std::optional GetObjectSource(
 const parser::OmpObject &object);
+const Symbol *GetArgumentSymbol(const parser::OmpArgument &argument);
+const parser::OmpObject *GetArgumentObject(const parser::OmpArgument 
&argument);
 
 bool IsCommonBlock(const Symbol &sym);
 bool IsExtendedListItem(const Symbol &sym);
diff --git a/flang/lib/Parser/openmp-parsers.cpp 
b/flang/lib/Parser/openmp-parsers.cpp
index 66526ba00b5ed..60ce71cf983f6 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -1791,8 +1791,11 @@ TYPE_PARSER(sourced(construct(
 verbatim("REQUIRES"_tok), Parser{})))
 
 // 2.15.2 Threadprivate directive
-TYPE_PARSER(sourced(construct(
-verbatim("THREADPRIVATE"_tok), parenthesized(Parser{}
+TYPE_PARSER(sourced( //
+construct(
+predicated(OmpDirectiveNameParser{},
+IsDirective(llvm::omp::Directive::OMPD_threadprivate)) >=
+Parser{})))
 
 // 2.11.3 Declarative Allocate directive
 TYPE_PARSER(
diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp
index 189a34ee1dc56..db46525ac57b1 100644
--- a/flang/lib/Parser/unparse.cpp
+++ b/flang/lib/Parser/unparse.cpp
@@ -2611,12 +2611,11 @@ class UnparseVisitor {
   }
   void Unparse(const OpenMPThreadprivate &x) {
 BeginOpenMP();
-Word("!$OMP THREADPRIVATE (");
-Walk(std::get(x.t));
-Put(")\n");
+Word("!$OMP ");
+Walk(x.v);
+Put("\n");
 EndOpenMP();
   }
-
   bool Pre(const OmpMessageClause &x) {
 Walk(x.v);
 return false;
diff --git a/flang/lib/Semantics/check-omp-structure.cpp 
b/flang/lib/Semantics/check-omp-structure.cpp
index 1ee5385fb38a1..507957dfecb3d 100644
--- a/flang/lib/Semantics/check-omp-structure.cpp
+++ b/flang/lib/Semantics/check-omp-structure.cpp
@@ -669,11 +669,6 @@ template  struct 
DirectiveSpellingVisitor {
 checker_(x.v.DirName().source, Directive::OMPD_groupprivate);
 return false;
   }
-  bool 

[llvm-branch-commits] [llvm] CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (PR #158224)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits


@@ -1070,8 +1070,8 @@ void InstrInfoEmitter::run(raw_ostream &OS) {
   OS << "namespace llvm {\n";
   OS << "struct " << ClassName << " : public TargetInstrInfo {\n"
  << "  explicit " << ClassName
- << "(const TargetSubtargetInfo &STI, unsigned CFSetupOpcode = ~0u, "
-"unsigned CFDestroyOpcode = ~0u, "
+ << "(const TargetSubtargetInfo &STI, const TargetRegisterInfo &TRI, "

arsenm wrote:

The other option I considered was having unique_ptr in the 
generic base class 

https://github.com/llvm/llvm-project/pull/158224
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU] gfx1251 VOP2 dpp support (PR #159641)

2025-09-18 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/159641?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#159641** https://app.graphite.dev/github/pr/llvm/llvm-project/159641?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/159641?utm_source=stack-comment-view-in-graphite";
 target="_blank">(View in Graphite)
* **#159637** https://app.graphite.dev/github/pr/llvm/llvm-project/159637?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`




This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn 
more about https://stacking.dev/?utm_source=stack-comment";>stacking.


https://github.com/llvm/llvm-project/pull/159641
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin created 
https://github.com/llvm/llvm-project/pull/159645

None

>From 92728fa5d41bd5f6ef63837bcb3ea8e85b7a8764 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Mon, 15 Sep 2025 17:49:18 +
Subject: [PATCH] [profcheck][SimplifyCFG] Propagate !prof from `switch` to
 `select`

---
 llvm/lib/Transforms/Utils/SimplifyCFG.cpp | 86 ---
 .../SimplifyCFG/switch-to-select-two-case.ll  | 72 +---
 2 files changed, 117 insertions(+), 41 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp 
b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index a1f759dd1df83..276ca89d715f1 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -84,6 +84,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -6318,9 +6319,12 @@ static bool initializeUniqueCases(SwitchInst *SI, 
PHINode *&PHI,
 // Helper function that checks if it is possible to transform a switch with 
only
 // two cases (or two cases + default) that produces a result into a select.
 // TODO: Handle switches with more than 2 cases that map to the same result.
+// The branch weights correspond to the provided Condition (i.e. if Condition 
is
+// modified from the original SwitchInst, the caller must adjust the weights)
 static Value *foldSwitchToSelect(const SwitchCaseResultVectorTy &ResultVector,
  Constant *DefaultResult, Value *Condition,
- IRBuilder<> &Builder, const DataLayout &DL) {
+ IRBuilder<> &Builder, const DataLayout &DL,
+ ArrayRef BranchWeights) {
   // If we are selecting between only two cases transform into a simple
   // select or a two-way select if default is possible.
   // Example:
@@ -6329,6 +6333,10 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
   //   case 20: return 2;   >  %2 = icmp eq i32 %a, 20
   //   default: return 4;  %3 = select i1 %2, i32 2, i32 %1
   // }
+
+  const bool HasBranchWeights =
+  !BranchWeights.empty() && !ProfcheckDisableMetadataFixes;
+
   if (ResultVector.size() == 2 && ResultVector[0].second.size() == 1 &&
   ResultVector[1].second.size() == 1) {
 ConstantInt *FirstCase = ResultVector[0].second[0];
@@ -6337,13 +6345,37 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
 if (DefaultResult) {
   Value *ValueCompare =
   Builder.CreateICmpEQ(Condition, SecondCase, "switch.selectcmp");
-  SelectValue = Builder.CreateSelect(ValueCompare, ResultVector[1].first,
- DefaultResult, "switch.select");
+  SelectInst *SelectValueInst = cast(Builder.CreateSelect(
+  ValueCompare, ResultVector[1].first, DefaultResult, 
"switch.select"));
+  SelectValue = SelectValueInst;
+  if (HasBranchWeights) {
+// We start with 3 probabilities, where the numerator is the
+// corresponding BranchWeights[i], and the denominator is the sum over
+// BranchWeights. We want the probability and negative probability of
+// Condition == SecondCase.
+assert(BranchWeights.size() == 3);
+setBranchWeights(SelectValueInst, BranchWeights[2],
+ BranchWeights[0] + BranchWeights[1],
+ /*IsExpected=*/false);
+}
 }
 Value *ValueCompare =
 Builder.CreateICmpEQ(Condition, FirstCase, "switch.selectcmp");
-return Builder.CreateSelect(ValueCompare, ResultVector[0].first,
-SelectValue, "switch.select");
+SelectInst *Ret = cast(Builder.CreateSelect(
+ValueCompare, ResultVector[0].first, SelectValue, "switch.select"));
+if (HasBranchWeights) {
+  // We may have had a DefaultResult. Base the position of the first and
+  // second's branch weights accordingly. Also the probability that 
Condition
+  // != FirstCase needs to take that into account.
+  assert(BranchWeights.size() >= 2);
+  size_t FirstCasePos = (Condition != nullptr);
+  size_t SecondCasePos = FirstCasePos + 1;
+  uint32_t DefaultCase = (Condition != nullptr) ? BranchWeights[0] : 0;
+  setBranchWeights(Ret, BranchWeights[FirstCasePos],
+   DefaultCase + BranchWeights[SecondCasePos],
+   /*IsExpected=*/false);
+}
+return Ret;
   }
 
   // Handle the degenerate case where two cases have the same result value.
@@ -6379,8 +6411,16 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
   Value *And = Builder.CreateAnd(Condition, AndMask);
   Value *Cmp = Builder.CreateICmpEQ(
   And, Constant::getIntegerValue(And->getType(), AndMask));
-  return Builder.CreateSelect(Cmp, ResultVector[0].first,
-  DefaultResult);
+

[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin edited 
https://github.com/llvm/llvm-project/pull/159645
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits


@@ -1,5 +1,5 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -passes=simplifycfg 
-simplifycfg-require-and-preserve-domtree=1 -S | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --check-globals
+; RUN: opt < %s -passes=prof-inject,simplifycfg -profcheck-weights-for-test 
-simplifycfg-require-and-preserve-domtree=1 -S | FileCheck %s

mtrofin wrote:

Note: this test is perfect in that it covers all the cases in the change 
(verified with some appropriately-placed `dbgs()`). To avoid cumbersomely 
adding `!prof` everywhere, we're using the feature introduced in the previous 
patch.

https://github.com/llvm/llvm-project/pull/159645
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


``bash
git-clang-format --diff origin/main HEAD --extensions cpp -- 
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
``

:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:





View the diff from clang-format here.


``diff
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp 
b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 276ca89d7..f775991b5 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -6357,7 +6357,7 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
 setBranchWeights(SelectValueInst, BranchWeights[2],
  BranchWeights[0] + BranchWeights[1],
  /*IsExpected=*/false);
-}
+  }
 }
 Value *ValueCompare =
 Builder.CreateICmpEQ(Condition, FirstCase, "switch.selectcmp");
@@ -6411,8 +6411,8 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
   Value *And = Builder.CreateAnd(Condition, AndMask);
   Value *Cmp = Builder.CreateICmpEQ(
   And, Constant::getIntegerValue(And->getType(), AndMask));
-  SelectInst *Ret = cast(Builder.CreateSelect(Cmp, 
ResultVector[0].first,
-  DefaultResult));
+  SelectInst *Ret = cast(
+  Builder.CreateSelect(Cmp, ResultVector[0].first, DefaultResult));
   if (HasBranchWeights) {
 // We know there's a Default case. We base the resulting branch
 // weights off its probability.

``




https://github.com/llvm/llvm-project/pull/159645
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin ready_for_review 
https://github.com/llvm/llvm-project/pull/159645
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/159645

>From 6d3342f397d39e366a06eb6bcabddec0b3d5a963 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Mon, 15 Sep 2025 17:49:18 +
Subject: [PATCH] [profcheck][SimplifyCFG] Propagate !prof from `switch` to
 `select`

---
 llvm/lib/Transforms/Utils/SimplifyCFG.cpp | 86 ---
 .../SimplifyCFG/switch-to-select-two-case.ll  | 72 +---
 2 files changed, 117 insertions(+), 41 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp 
b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index a1f759dd1df83..f775991b5ba41 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -84,6 +84,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -6318,9 +6319,12 @@ static bool initializeUniqueCases(SwitchInst *SI, 
PHINode *&PHI,
 // Helper function that checks if it is possible to transform a switch with 
only
 // two cases (or two cases + default) that produces a result into a select.
 // TODO: Handle switches with more than 2 cases that map to the same result.
+// The branch weights correspond to the provided Condition (i.e. if Condition 
is
+// modified from the original SwitchInst, the caller must adjust the weights)
 static Value *foldSwitchToSelect(const SwitchCaseResultVectorTy &ResultVector,
  Constant *DefaultResult, Value *Condition,
- IRBuilder<> &Builder, const DataLayout &DL) {
+ IRBuilder<> &Builder, const DataLayout &DL,
+ ArrayRef BranchWeights) {
   // If we are selecting between only two cases transform into a simple
   // select or a two-way select if default is possible.
   // Example:
@@ -6329,6 +6333,10 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
   //   case 20: return 2;   >  %2 = icmp eq i32 %a, 20
   //   default: return 4;  %3 = select i1 %2, i32 2, i32 %1
   // }
+
+  const bool HasBranchWeights =
+  !BranchWeights.empty() && !ProfcheckDisableMetadataFixes;
+
   if (ResultVector.size() == 2 && ResultVector[0].second.size() == 1 &&
   ResultVector[1].second.size() == 1) {
 ConstantInt *FirstCase = ResultVector[0].second[0];
@@ -6337,13 +6345,37 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
 if (DefaultResult) {
   Value *ValueCompare =
   Builder.CreateICmpEQ(Condition, SecondCase, "switch.selectcmp");
-  SelectValue = Builder.CreateSelect(ValueCompare, ResultVector[1].first,
- DefaultResult, "switch.select");
+  SelectInst *SelectValueInst = cast(Builder.CreateSelect(
+  ValueCompare, ResultVector[1].first, DefaultResult, 
"switch.select"));
+  SelectValue = SelectValueInst;
+  if (HasBranchWeights) {
+// We start with 3 probabilities, where the numerator is the
+// corresponding BranchWeights[i], and the denominator is the sum over
+// BranchWeights. We want the probability and negative probability of
+// Condition == SecondCase.
+assert(BranchWeights.size() == 3);
+setBranchWeights(SelectValueInst, BranchWeights[2],
+ BranchWeights[0] + BranchWeights[1],
+ /*IsExpected=*/false);
+  }
 }
 Value *ValueCompare =
 Builder.CreateICmpEQ(Condition, FirstCase, "switch.selectcmp");
-return Builder.CreateSelect(ValueCompare, ResultVector[0].first,
-SelectValue, "switch.select");
+SelectInst *Ret = cast(Builder.CreateSelect(
+ValueCompare, ResultVector[0].first, SelectValue, "switch.select"));
+if (HasBranchWeights) {
+  // We may have had a DefaultResult. Base the position of the first and
+  // second's branch weights accordingly. Also the probability that 
Condition
+  // != FirstCase needs to take that into account.
+  assert(BranchWeights.size() >= 2);
+  size_t FirstCasePos = (Condition != nullptr);
+  size_t SecondCasePos = FirstCasePos + 1;
+  uint32_t DefaultCase = (Condition != nullptr) ? BranchWeights[0] : 0;
+  setBranchWeights(Ret, BranchWeights[FirstCasePos],
+   DefaultCase + BranchWeights[SecondCasePos],
+   /*IsExpected=*/false);
+}
+return Ret;
   }
 
   // Handle the degenerate case where two cases have the same result value.
@@ -6379,8 +6411,16 @@ static Value *foldSwitchToSelect(const 
SwitchCaseResultVectorTy &ResultVector,
   Value *And = Builder.CreateAnd(Condition, AndMask);
   Value *Cmp = Builder.CreateICmpEQ(
   And, Constant::getIntegerValue(And->getType(), AndMask));
-  return Builder.CreateSelect(Cmp, ResultVector[0].first,
-  DefaultResult);
+  Select

[llvm-branch-commits] [llvm] [AMDGPU] gfx1251 VOP2 dpp support (PR #159641)

2025-09-18 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec ready_for_review 
https://github.com/llvm/llvm-project/pull/159641
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [HLSL] NonUniformResourceIndex implementation (PR #159655)

2025-09-18 Thread Helena Kotas via llvm-branch-commits

https://github.com/hekota updated 
https://github.com/llvm/llvm-project/pull/159655

>From 108bf356e743d36b4eb5d0217720cf47ab85f33f Mon Sep 17 00:00:00 2001
From: Helena Kotas 
Date: Thu, 18 Sep 2025 14:31:38 -0700
Subject: [PATCH 1/2] [HLSL] NonUniformResourceIndex implementation

Adds HLSL function NonUniformResourceIndex to hlsl_intrinsics.h. The function 
calls
a builtin `__builtin_hlsl_resource_nonuniformindex` which gets translated to
LLVM intrinsic `llvm.{dx|spv}.resource_nonuniformindex.

Depends on #159608

Closes #157923
---
 clang/include/clang/Basic/Builtins.td |  6 +++
 clang/lib/CodeGen/CGHLSLBuiltins.cpp  |  7 
 clang/lib/CodeGen/CGHLSLRuntime.h |  2 +
 clang/lib/Headers/hlsl/hlsl_intrinsics.h  | 25 
 .../resources/NonUniformResourceIndex.hlsl| 38 +++
 5 files changed, 78 insertions(+)
 create mode 100644 
clang/test/CodeGenHLSL/resources/NonUniformResourceIndex.hlsl

diff --git a/clang/include/clang/Basic/Builtins.td 
b/clang/include/clang/Basic/Builtins.td
index 27639f06529cb..96676bd810631 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4933,6 +4933,12 @@ def HLSLResourceHandleFromImplicitBinding : 
LangBuiltin<"HLSL_LANG"> {
   let Prototype = "void(...)";
 }
 
+def HLSLResourceNonUniformIndex : LangBuiltin<"HLSL_LANG"> {
+  let Spellings = ["__builtin_hlsl_resource_nonuniformindex"];
+  let Attributes = [NoThrow];
+  let Prototype = "uint32_t(uint32_t)";
+}
+
 def HLSLAll : LangBuiltin<"HLSL_LANG"> {
   let Spellings = ["__builtin_hlsl_all"];
   let Attributes = [NoThrow, Const];
diff --git a/clang/lib/CodeGen/CGHLSLBuiltins.cpp 
b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
index 7b5b924b1fe82..9f87afa5a8a3d 100644
--- a/clang/lib/CodeGen/CGHLSLBuiltins.cpp
+++ b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
@@ -352,6 +352,13 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned 
BuiltinID,
 SmallVector Args{OrderID, SpaceOp, RangeOp, IndexOp, Name};
 return Builder.CreateIntrinsic(HandleTy, IntrinsicID, Args);
   }
+  case Builtin::BI__builtin_hlsl_resource_nonuniformindex: {
+Value *IndexOp = EmitScalarExpr(E->getArg(0));
+llvm::Type *RetTy = ConvertType(E->getType());
+return Builder.CreateIntrinsic(
+RetTy, CGM.getHLSLRuntime().getNonUniformResourceIndexIntrinsic(),
+ArrayRef{IndexOp});
+  }
   case Builtin::BI__builtin_hlsl_all: {
 Value *Op0 = EmitScalarExpr(E->getArg(0));
 return Builder.CreateIntrinsic(
diff --git a/clang/lib/CodeGen/CGHLSLRuntime.h 
b/clang/lib/CodeGen/CGHLSLRuntime.h
index 370f3d5c5d30d..f4b410664d60c 100644
--- a/clang/lib/CodeGen/CGHLSLRuntime.h
+++ b/clang/lib/CodeGen/CGHLSLRuntime.h
@@ -129,6 +129,8 @@ class CGHLSLRuntime {
resource_handlefrombinding)
   GENERATE_HLSL_INTRINSIC_FUNCTION(CreateHandleFromImplicitBinding,
resource_handlefromimplicitbinding)
+  GENERATE_HLSL_INTRINSIC_FUNCTION(NonUniformResourceIndex,
+   resource_nonuniformindex)
   GENERATE_HLSL_INTRINSIC_FUNCTION(BufferUpdateCounter, resource_updatecounter)
   GENERATE_HLSL_INTRINSIC_FUNCTION(GroupMemoryBarrierWithGroupSync,
group_memory_barrier_with_group_sync)
diff --git a/clang/lib/Headers/hlsl/hlsl_intrinsics.h 
b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
index d9d87c827e6a4..0eab2ff56c519 100644
--- a/clang/lib/Headers/hlsl/hlsl_intrinsics.h
+++ b/clang/lib/Headers/hlsl/hlsl_intrinsics.h
@@ -422,6 +422,31 @@ constexpr int4 D3DCOLORtoUBYTE4(float4 V) {
   return __detail::d3d_color_to_ubyte4_impl(V);
 }
 
+//===--===//
+// NonUniformResourceIndex builtin
+//===--===//
+
+/// \fn uint NonUniformResourceIndex(uint I)
+/// \brief A compiler hint to indicate that a resource index varies across
+/// threads.
+/// within a wave (i.e., it is non-uniform).
+/// \param I [in] Resource array index
+///
+/// The return value is the \p Index parameter.
+///
+/// When indexing into an array of shader resources (e.g., textures, buffers),
+/// some GPU hardware and drivers require the compiler to know whether the 
index
+/// is uniform (same for all threads) or non-uniform (varies per thread).
+///
+/// Using NonUniformResourceIndex explicitly marks an index as non-uniform,
+/// disabling certain assumptions or optimizations that could lead to incorrect
+/// behavior when dynamically accessing resource arrays with non-uniform
+/// indices.
+
+constexpr uint32_t NonUniformResourceIndex(uint32_t Index) {
+  return __builtin_hlsl_resource_nonuniformindex(Index);
+}
+
 
//===--===//
 // reflect builtin
 
//===--===//
diff --git a/clang/test/CodeGenHLSL/

[llvm-branch-commits] [llvm] [IR2Vec] Refactor vocabulary to use section-based storage (PR #158376)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

https://github.com/mtrofin approved this pull request.


https://github.com/llvm/llvm-project/pull/158376
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] X86: Switch to RegClassByHwMode (PR #158274)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158274

>From 7d3e2fa03f76098b2f4f90a2c4407e18d59423c5 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 9 Sep 2025 11:15:47 +0900
Subject: [PATCH] X86: Switch to RegClassByHwMode

Replace the target uses of PointerLikeRegClass with RegClassByHwMode
---
 .../X86/MCTargetDesc/X86MCTargetDesc.cpp  |  3 ++
 llvm/lib/Target/X86/X86.td|  2 ++
 llvm/lib/Target/X86/X86InstrInfo.td   |  8 ++---
 llvm/lib/Target/X86/X86InstrOperands.td   | 30 +++-
 llvm/lib/Target/X86/X86InstrPredicates.td | 14 
 llvm/lib/Target/X86/X86RegisterInfo.cpp   | 35 +--
 llvm/lib/Target/X86/X86Subtarget.h|  4 +--
 llvm/utils/TableGen/X86FoldTablesEmitter.cpp  |  4 +--
 8 files changed, 57 insertions(+), 43 deletions(-)

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp 
b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index bb1e716c33ed5..1d5ef8b0996dc 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -55,6 +55,9 @@ std::string X86_MC::ParseX86Triple(const Triple &TT) {
   else
 FS = "-64bit-mode,-32bit-mode,+16bit-mode";
 
+  if (TT.isX32())
+FS += ",+x32";
+
   return FS;
 }
 
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 7c9e821c02fda..3af8b3e060a16 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -25,6 +25,8 @@ def Is32Bit : SubtargetFeature<"32bit-mode", "Is32Bit", 
"true",
"32-bit mode (80386)">;
 def Is16Bit : SubtargetFeature<"16bit-mode", "Is16Bit", "true",
"16-bit mode (i8086)">;
+def IsX32 : SubtargetFeature<"x32", "IsX32", "true",
+ "64-bit with ILP32 programming model (e.g. x32 
ABI)">;
 
 
//===--===//
 // X86 Subtarget ISA features
diff --git a/llvm/lib/Target/X86/X86InstrInfo.td 
b/llvm/lib/Target/X86/X86InstrInfo.td
index 7f6c5614847e3..0c4abc2c400f6 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.td
+++ b/llvm/lib/Target/X86/X86InstrInfo.td
@@ -18,14 +18,14 @@ include "X86InstrFragments.td"
 include "X86InstrFragmentsSIMD.td"
 
 
//===--===//
-// X86 Operand Definitions.
+// X86 Predicate Definitions.
 //
-include "X86InstrOperands.td"
+include "X86InstrPredicates.td"
 
 
//===--===//
-// X86 Predicate Definitions.
+// X86 Operand Definitions.
 //
-include "X86InstrPredicates.td"
+include "X86InstrOperands.td"
 
 
//===--===//
 // X86 Instruction Format Definitions.
diff --git a/llvm/lib/Target/X86/X86InstrOperands.td 
b/llvm/lib/Target/X86/X86InstrOperands.td
index 80843f6bb80e6..5207ecad127a2 100644
--- a/llvm/lib/Target/X86/X86InstrOperands.td
+++ b/llvm/lib/Target/X86/X86InstrOperands.td
@@ -6,9 +6,15 @@
 //
 
//===--===//
 
+def x86_ptr_rc : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32, GR64, LOW32_ADDR_ACCESS]>;
+
 // A version of ptr_rc which excludes SP, ESP, and RSP. This is used for
 // the index operand of an address, to conform to x86 encoding restrictions.
-def ptr_rc_nosp : PointerLikeRegClass<1>;
+def ptr_rc_nosp : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOSP, GR64_NOSP, GR32_NOSP]>;
 
 // *mem - Operand definitions for the funky X86 addressing mode operands.
 //
@@ -53,7 +59,7 @@ class X86MemOperand : Operand {
   let PrintMethod = printMethod;
-  let MIOperandInfo = (ops ptr_rc, i8imm, ptr_rc_nosp, i32imm, SEGMENT_REG);
+  let MIOperandInfo = (ops x86_ptr_rc, i8imm, ptr_rc_nosp, i32imm, 
SEGMENT_REG);
   let ParserMatchClass = parserMatchClass;
   let OperandType = "OPERAND_MEMORY";
   int Size = size;
@@ -63,7 +69,7 @@ class X86MemOperand
 : X86MemOperand {
-  let MIOperandInfo = (ops ptr_rc, i8imm, RC, i32imm, SEGMENT_REG);
+  let MIOperandInfo = (ops x86_ptr_rc, i8imm, RC, i32imm, SEGMENT_REG);
 }
 
 def anymem : X86MemOperand<"printMemReference">;
@@ -113,8 +119,14 @@ def sdmem : X86MemOperand<"printqwordmem", 
X86Mem64AsmOperand>;
 
 // A version of i8mem for use on x86-64 and x32 that uses a NOREX GPR instead
 // of a plain GPR, so that it doesn't potentially require a REX prefix.
-def ptr_rc_norex : PointerLikeRegClass<2>;
-def ptr_rc_norex_nosp : PointerLikeRegClass<3>;
+def ptr_rc_norex : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOREX, GR64_NOREX, GR32_NOREX]>;
+
+def ptr_rc_norex_nosp : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOREX_NOSP, GR64_NOREX_NOSP, GR32_NOREX_NOSP]>;
+
 
 def i8mem_NOREX : X86MemOperand<"printbytemem", X86Mem8AsmOperand, 8> {
   let MIOpe

[llvm-branch-commits] [llvm] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass (PR #158271)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158271

>From e7ef891fb2c4e21bec4d23af954ad9204f3eb48f Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 8 Sep 2025 14:04:59 +0900
Subject: [PATCH] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass

---
 .../Sparc/Disassembler/SparcDisassembler.cpp  |  8 ---
 llvm/lib/Target/Sparc/SparcInstrInfo.td   | 21 +--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp 
b/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
index c3d60f3689e1f..e585e5af42d32 100644
--- a/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
+++ b/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
@@ -159,14 +159,6 @@ static DecodeStatus DecodeI64RegsRegisterClass(MCInst 
&Inst, unsigned RegNo,
   return DecodeIntRegsRegisterClass(Inst, RegNo, Address, Decoder);
 }
 
-// This is used for the type "ptr_rc", which is either IntRegs or I64Regs
-// depending on SparcRegisterInfo::getPointerRegClass.
-static DecodeStatus DecodePointerLikeRegClass0(MCInst &Inst, unsigned RegNo,
-   uint64_t Address,
-   const MCDisassembler *Decoder) {
-  return DecodeIntRegsRegisterClass(Inst, RegNo, Address, Decoder);
-}
-
 static DecodeStatus DecodeFPRegsRegisterClass(MCInst &Inst, unsigned RegNo,
   uint64_t Address,
   const MCDisassembler *Decoder) {
diff --git a/llvm/lib/Target/Sparc/SparcInstrInfo.td 
b/llvm/lib/Target/Sparc/SparcInstrInfo.td
index 53972d6c105a4..97e7fd7769edb 100644
--- a/llvm/lib/Target/Sparc/SparcInstrInfo.td
+++ b/llvm/lib/Target/Sparc/SparcInstrInfo.td
@@ -95,10 +95,27 @@ def HasFSMULD : Predicate<"!Subtarget->hasNoFSMULD()">;
 // will pick deprecated instructions.
 def UseDeprecatedInsts : Predicate<"Subtarget->useV8DeprecatedInsts()">;
 
+//===--===//
+// HwModes Pattern Stuff
+//===--===//
+
+defvar SPARC32 = DefaultMode;
+def SPARC64 : HwMode<[Is64Bit]>;
+
 
//===--===//
 // Instruction Pattern Stuff
 
//===--===//
 
+def sparc_ptr_rc : RegClassByHwMode<
+  [SPARC32, SPARC64],
+  [IntRegs, I64Regs]>;
+
+// Both cases can use the same decoder method, so avoid the dispatch
+// by hwmode by setting an explicit DecoderMethod
+def ptr_op : RegisterOperand {
+  let DecoderMethod = "DecodeIntRegsRegisterClass";
+}
+
 // FIXME these should have AsmOperandClass.
 def uimm3 : PatLeaf<(imm), [{ return isUInt<3>(N->getZExtValue()); }]>;
 
@@ -178,12 +195,12 @@ def simm13Op : Operand {
 
 def MEMrr : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, ptr_rc);
+  let MIOperandInfo = (ops ptr_op, ptr_op);
   let ParserMatchClass = SparcMEMrrAsmOperand;
 }
 def MEMri : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, simm13Op);
+  let MIOperandInfo = (ops ptr_op, simm13Op);
   let ParserMatchClass = SparcMEMriAsmOperand;
 }
 

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] X86: Switch to RegClassByHwMode (PR #158274)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158274

>From 7d3e2fa03f76098b2f4f90a2c4407e18d59423c5 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Tue, 9 Sep 2025 11:15:47 +0900
Subject: [PATCH] X86: Switch to RegClassByHwMode

Replace the target uses of PointerLikeRegClass with RegClassByHwMode
---
 .../X86/MCTargetDesc/X86MCTargetDesc.cpp  |  3 ++
 llvm/lib/Target/X86/X86.td|  2 ++
 llvm/lib/Target/X86/X86InstrInfo.td   |  8 ++---
 llvm/lib/Target/X86/X86InstrOperands.td   | 30 +++-
 llvm/lib/Target/X86/X86InstrPredicates.td | 14 
 llvm/lib/Target/X86/X86RegisterInfo.cpp   | 35 +--
 llvm/lib/Target/X86/X86Subtarget.h|  4 +--
 llvm/utils/TableGen/X86FoldTablesEmitter.cpp  |  4 +--
 8 files changed, 57 insertions(+), 43 deletions(-)

diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp 
b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index bb1e716c33ed5..1d5ef8b0996dc 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -55,6 +55,9 @@ std::string X86_MC::ParseX86Triple(const Triple &TT) {
   else
 FS = "-64bit-mode,-32bit-mode,+16bit-mode";
 
+  if (TT.isX32())
+FS += ",+x32";
+
   return FS;
 }
 
diff --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 7c9e821c02fda..3af8b3e060a16 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -25,6 +25,8 @@ def Is32Bit : SubtargetFeature<"32bit-mode", "Is32Bit", 
"true",
"32-bit mode (80386)">;
 def Is16Bit : SubtargetFeature<"16bit-mode", "Is16Bit", "true",
"16-bit mode (i8086)">;
+def IsX32 : SubtargetFeature<"x32", "IsX32", "true",
+ "64-bit with ILP32 programming model (e.g. x32 
ABI)">;
 
 
//===--===//
 // X86 Subtarget ISA features
diff --git a/llvm/lib/Target/X86/X86InstrInfo.td 
b/llvm/lib/Target/X86/X86InstrInfo.td
index 7f6c5614847e3..0c4abc2c400f6 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.td
+++ b/llvm/lib/Target/X86/X86InstrInfo.td
@@ -18,14 +18,14 @@ include "X86InstrFragments.td"
 include "X86InstrFragmentsSIMD.td"
 
 
//===--===//
-// X86 Operand Definitions.
+// X86 Predicate Definitions.
 //
-include "X86InstrOperands.td"
+include "X86InstrPredicates.td"
 
 
//===--===//
-// X86 Predicate Definitions.
+// X86 Operand Definitions.
 //
-include "X86InstrPredicates.td"
+include "X86InstrOperands.td"
 
 
//===--===//
 // X86 Instruction Format Definitions.
diff --git a/llvm/lib/Target/X86/X86InstrOperands.td 
b/llvm/lib/Target/X86/X86InstrOperands.td
index 80843f6bb80e6..5207ecad127a2 100644
--- a/llvm/lib/Target/X86/X86InstrOperands.td
+++ b/llvm/lib/Target/X86/X86InstrOperands.td
@@ -6,9 +6,15 @@
 //
 
//===--===//
 
+def x86_ptr_rc : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32, GR64, LOW32_ADDR_ACCESS]>;
+
 // A version of ptr_rc which excludes SP, ESP, and RSP. This is used for
 // the index operand of an address, to conform to x86 encoding restrictions.
-def ptr_rc_nosp : PointerLikeRegClass<1>;
+def ptr_rc_nosp : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOSP, GR64_NOSP, GR32_NOSP]>;
 
 // *mem - Operand definitions for the funky X86 addressing mode operands.
 //
@@ -53,7 +59,7 @@ class X86MemOperand : Operand {
   let PrintMethod = printMethod;
-  let MIOperandInfo = (ops ptr_rc, i8imm, ptr_rc_nosp, i32imm, SEGMENT_REG);
+  let MIOperandInfo = (ops x86_ptr_rc, i8imm, ptr_rc_nosp, i32imm, 
SEGMENT_REG);
   let ParserMatchClass = parserMatchClass;
   let OperandType = "OPERAND_MEMORY";
   int Size = size;
@@ -63,7 +69,7 @@ class X86MemOperand
 : X86MemOperand {
-  let MIOperandInfo = (ops ptr_rc, i8imm, RC, i32imm, SEGMENT_REG);
+  let MIOperandInfo = (ops x86_ptr_rc, i8imm, RC, i32imm, SEGMENT_REG);
 }
 
 def anymem : X86MemOperand<"printMemReference">;
@@ -113,8 +119,14 @@ def sdmem : X86MemOperand<"printqwordmem", 
X86Mem64AsmOperand>;
 
 // A version of i8mem for use on x86-64 and x32 that uses a NOREX GPR instead
 // of a plain GPR, so that it doesn't potentially require a REX prefix.
-def ptr_rc_norex : PointerLikeRegClass<2>;
-def ptr_rc_norex_nosp : PointerLikeRegClass<3>;
+def ptr_rc_norex : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOREX, GR64_NOREX, GR32_NOREX]>;
+
+def ptr_rc_norex_nosp : RegClassByHwMode<
+  [X86_32, X86_64, X86_64_X32],
+  [GR32_NOREX_NOSP, GR64_NOREX_NOSP, GR32_NOREX_NOSP]>;
+
 
 def i8mem_NOREX : X86MemOperand<"printbytemem", X86Mem8AsmOperand, 8> {
   let MIOpe

[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158273

>From 5b8f38bb56b46b9e63fe2031f9b43e4bbba333fb Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sat, 6 Sep 2025 21:14:45 +0900
Subject: [PATCH 1/3] Mips: Switch to RegClassByHwMode

---
 .../Target/Mips/AsmParser/MipsAsmParser.cpp   |  9 +--
 .../Mips/Disassembler/MipsDisassembler.cpp| 24 +++
 llvm/lib/Target/Mips/MicroMipsInstrInfo.td| 12 +++---
 llvm/lib/Target/Mips/Mips.td  | 15 
 llvm/lib/Target/Mips/MipsInstrInfo.td | 20 +++-
 llvm/lib/Target/Mips/MipsRegisterInfo.cpp | 16 ++---
 llvm/lib/Target/Mips/MipsRegisterInfo.td  | 16 +
 7 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp 
b/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
index 8a5cb517c94c5..ba70c9e6cb9e8 100644
--- a/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
+++ b/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
@@ -3706,7 +3706,9 @@ void MipsAsmParser::expandMem16Inst(MCInst &Inst, SMLoc 
IDLoc, MCStreamer &Out,
   MCRegister TmpReg = DstReg;
 
   const MCInstrDesc &Desc = MII.get(OpCode);
-  int16_t DstRegClass = Desc.operands()[StartOp].RegClass;
+  int16_t DstRegClass =
+  MII.getOpRegClassID(Desc.operands()[StartOp],
+  STI->getHwMode(MCSubtargetInfo::HwMode_RegInfo));
   unsigned DstRegClassID =
   getContext().getRegisterInfo()->getRegClass(DstRegClass).getID();
   bool IsGPR = (DstRegClassID == Mips::GPR32RegClassID) ||
@@ -3834,7 +3836,10 @@ void MipsAsmParser::expandMem9Inst(MCInst &Inst, SMLoc 
IDLoc, MCStreamer &Out,
   MCRegister TmpReg = DstReg;
 
   const MCInstrDesc &Desc = MII.get(OpCode);
-  int16_t DstRegClass = Desc.operands()[StartOp].RegClass;
+  int16_t DstRegClass =
+  MII.getOpRegClassID(Desc.operands()[StartOp],
+  STI->getHwMode(MCSubtargetInfo::HwMode_RegInfo));
+
   unsigned DstRegClassID =
   getContext().getRegisterInfo()->getRegClass(DstRegClass).getID();
   bool IsGPR = (DstRegClassID == Mips::GPR32RegClassID) ||
diff --git a/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp 
b/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
index c22b8f61b12dc..705695c74803f 100644
--- a/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
+++ b/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
@@ -916,6 +916,30 @@ DecodeGPRMM16MovePRegisterClass(MCInst &Inst, unsigned 
RegNo, uint64_t Address,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus DecodeGP32RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeGP64RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeSP32RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeSP64RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
 static DecodeStatus DecodeGPR32RegisterClass(MCInst &Inst, unsigned RegNo,
  uint64_t Address,
  const MCDisassembler *Decoder) {
diff --git a/llvm/lib/Target/Mips/MicroMipsInstrInfo.td 
b/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
index b3fd8f422f429..b44bf1391b73e 100644
--- a/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
+++ b/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
@@ -57,12 +57,6 @@ def MicroMipsMemGPRMM16AsmOperand : AsmOperandClass {
   let PredicateMethod = "isMemWithGRPMM16Base";
 }
 
-// Define the classes of pointers used by microMIPS.
-// The numbers must match those in MipsRegisterInfo::MipsPtrClass.
-def ptr_gpr16mm_rc : PointerLikeRegClass<1>;
-def ptr_sp_rc : PointerLikeRegClass<2>;
-def ptr_gp_rc : PointerLikeRegClass<3>;
-
 class mem_mm_4_generic : Operand {
   let PrintMethod = "printMemOperand";
   let MIOperandInfo = (ops ptr_gpr16mm_rc, simm4);
@@ -114,7 +108,7 @@ def mem_mm_gp_simm7_lsl2 : Operand {
 
 def mem_mm_9 : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, simm9);
+  let MIOperandInfo = (ops mips_ptr_rc, simm9);
   let EncoderMethod = "getMemEncodingMMImm9";
   let ParserMatchClass = MipsMemSimmAsmOperand<9>;
   let OperandType = "OPERAND_MEMORY";
@@ -130,7 +124,7 @@ def mem_m

[llvm-branch-commits] [llvm] PPC: Replace PointerLikeRegClass with RegClassByHwMode (PR #158777)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov approved this pull request.

LGTM
Is there a reason to not implement the renaming suggestion? (Like it would 
require renaming methods in C++ files or so or make the naming inconsistent.)

https://github.com/llvm/llvm-project/pull/158777
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PPC: Replace PointerLikeRegClass with RegClassByHwMode (PR #158777)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov edited 
https://github.com/llvm/llvm-project/pull/158777
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PPC: Replace PointerLikeRegClass with RegClassByHwMode (PR #158777)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

s-barannikov wrote:

They seem all lowercase to me :man_shrugging: 
```
$ grep -E ": (Register)?Operand" llvm/lib/Target/PowerPC/*.td | cut -d ':' -f 
2,3

def s16imm64 : Operand {
def u16imm64 : Operand {
def s17imm64 : Operand {
def tocentry : Operand {
def tlsreg : Operand {
def tlsgd : Operand {}
def tlscall : Operand {
def tocentry32 : Operand {
def imm32SExt16  : Operand, ImmLeaf, ImmLeaf, ImmLeaf {
def g8rc : RegisterOperand {
def g8prc : RegisterOperand {
def gprc_nor0 : RegisterOperand {
def g8rc_nox0 : RegisterOperand {
def f8rc : RegisterOperand {
def f4rc : RegisterOperand {
def fpairrc : RegisterOperand {
def vrrc : RegisterOperand {
def vfrc : RegisterOperand {
def crbitrc : RegisterOperand {
def crrc : RegisterOperand {
def sperc : RegisterOperand {
def spe4rc : RegisterOperand {
def u1imm   : Operand {
def u2imm   : Operand {
def atimm   : Operand {
def u3imm   : Operand {
def u4imm   : Operand {
def s5imm   : Operand {
def u5imm   : Operand {
def u6imm   : Operand {
def u7imm   : Operand {
def u8imm   : Operand {
def u10imm  : Operand {
def u12imm  : Operand {
def s16imm  : Operand {
def u16imm  : Operand {
def s17imm  : Operand {
def s34imm : Operand {
def s34imm_pcrel : Operand {
def immZero : Operand {
def directbrtarget : Operand {
def absdirectbrtarget : Operand {
def condbrtarget : Operand {
def abscondbrtarget : Operand {
def calltarget : Operand {
def abscalltarget : Operand {
def crbitm: Operand {
def ptr_rc_nor0 : Operand, PointerLikeRegClass<1> {
def dispRI34 : Operand {
def dispRI34_pcrel : Operand {
def memri34 : Operand { // memri, imm is a 34-bit value.
def memri34_pcrel : Operand { // memri, imm is a 34-bit value.
def ptr_rc_idx : Operand, PointerLikeRegClass<0> {
def dispRI : Operand {
def dispRIX : Operand {
def dispRIHash : Operand {
def dispRIX16 : Operand {
def dispSPE8 : Operand {
def dispSPE4 : Operand {
def dispSPE2 : Operand {
def memri : Operand {
def memrr : Operand {
def memrix : Operand {   // memri where the imm is 4-aligned.
def memrihash : Operand {
def memrix16 : Operand { // memri, imm is 16-aligned, 12-bit, Inst{16
def spe8dis : Operand {   // SPE displacement where the imm is 8-aligned.
def spe4dis : Operand {   // SPE displacement where the imm is 4-aligned.
def spe2dis : Operand {   // SPE displacement where the imm is 2-aligned.
def memr : Operand {
def tlsreg32 : Operand {
def tlsgd32 : Operand {}
def tlscall32 : Operand {
def pred : Operand {
def vsrc : RegisterOperand {
def vsfrc : RegisterOperand {
def vssrc : RegisterOperand {
def spilltovsrrc : RegisterOperand {
def vsrprc : RegisterOperand {
def vsrpevenrc : RegisterOperand {
def acc : RegisterOperand {
def uacc : RegisterOperand {
def dmrrow : RegisterOperand {
def dmrrowp : RegisterOperand {
def wacc : RegisterOperand {
def wacc_hi : RegisterOperand {
def dmr : RegisterOperand {
def dmrp : RegisterOperand {
```


https://github.com/llvm/llvm-project/pull/158777
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass (PR #158271)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/158271
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PPC: Replace PointerLikeRegClass with RegClassByHwMode (PR #158777)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

Most of the operands seem capitalized 

https://github.com/llvm/llvm-project/pull/158777
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits


@@ -211,6 +211,21 @@ def FeatureUseIndirectJumpsHazard : 
SubtargetFeature<"use-indirect-jump-hazard",
 def FeatureStrictAlign
 : SubtargetFeature<"strict-align", "StrictAlign", "true",
"Disable unaligned load store for r6">;
+//===--===//

s-barannikov wrote:

nit: newline before comment

https://github.com/llvm/llvm-project/pull/158273
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits


@@ -916,6 +916,30 @@ DecodeGPRMM16MovePRegisterClass(MCInst &Inst, unsigned 
RegNo, uint64_t Address,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus DecodeGP32RegisterClass(MCInst &Inst, unsigned RegNo,

s-barannikov wrote:

Can you add a comment why these are present but not implemented?

https://github.com/llvm/llvm-project/pull/158273
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PPC: Replace PointerLikeRegClass with RegClassByHwMode (PR #158777)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158777

>From 0e5dfd5493a599e6eb9e5a0a0b21cd542c964e8f Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 5 Sep 2025 18:03:59 +0900
Subject: [PATCH 1/3] PPC: Replace PointerLikeRegClass with RegClassByHwMode

---
 .../PowerPC/Disassembler/PPCDisassembler.cpp  |  3 --
 llvm/lib/Target/PowerPC/PPC.td|  6 
 llvm/lib/Target/PowerPC/PPCInstrInfo.cpp  | 28 ++-
 llvm/lib/Target/PowerPC/PPCRegisterInfo.td| 10 +--
 4 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/llvm/lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp 
b/llvm/lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp
index 47586c417cfe3..70e619cc22b19 100644
--- a/llvm/lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp
+++ b/llvm/lib/Target/PowerPC/Disassembler/PPCDisassembler.cpp
@@ -185,9 +185,6 @@ DecodeG8RC_NOX0RegisterClass(MCInst &Inst, uint64_t RegNo, 
uint64_t Address,
   return decodeRegisterClass(Inst, RegNo, XRegsNoX0);
 }
 
-#define DecodePointerLikeRegClass0 DecodeGPRCRegisterClass
-#define DecodePointerLikeRegClass1 DecodeGPRC_NOR0RegisterClass
-
 static DecodeStatus DecodeSPERCRegisterClass(MCInst &Inst, uint64_t RegNo,
  uint64_t Address,
  const MCDisassembler *Decoder) {
diff --git a/llvm/lib/Target/PowerPC/PPC.td b/llvm/lib/Target/PowerPC/PPC.td
index 386d0f65d1ed1..d491e88b66ad8 100644
--- a/llvm/lib/Target/PowerPC/PPC.td
+++ b/llvm/lib/Target/PowerPC/PPC.td
@@ -394,6 +394,12 @@ def NotAIX : Predicate<"!Subtarget->isAIXABI()">;
 def IsISAFuture : Predicate<"Subtarget->isISAFuture()">;
 def IsNotISAFuture : Predicate<"!Subtarget->isISAFuture()">;
 
+//===--===//
+// HwModes
+//===--===//
+
+defvar PPC32 = DefaultMode;
+def PPC64 : HwMode<[In64BitMode]>;
 
 // Since new processors generally contain a superset of features of those that
 // came before them, the idea is to make implementations of new processors
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp 
b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
index db066bc4b7bdd..55e38bcf4afc9 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
@@ -2142,33 +2142,23 @@ bool PPCInstrInfo::onlyFoldImmediate(MachineInstr 
&UseMI, MachineInstr &DefMI,
   assert(UseIdx < UseMI.getNumOperands() && "Cannot find Reg in UseMI");
   assert(UseIdx < UseMCID.getNumOperands() && "No operand description for 
Reg");
 
-  const MCOperandInfo *UseInfo = &UseMCID.operands()[UseIdx];
-
   // We can fold the zero if this register requires a GPRC_NOR0/G8RC_NOX0
   // register (which might also be specified as a pointer class kind).
-  if (UseInfo->isLookupPtrRegClass()) {
-if (UseInfo->RegClass /* Kind */ != 1)
-  return false;
-  } else {
-if (UseInfo->RegClass != PPC::GPRC_NOR0RegClassID &&
-UseInfo->RegClass != PPC::G8RC_NOX0RegClassID)
-  return false;
-  }
+
+  const MCOperandInfo &UseInfo = UseMCID.operands()[UseIdx];
+  int16_t RegClass = getOpRegClassID(UseInfo);
+  if (UseInfo.RegClass != PPC::GPRC_NOR0RegClassID &&
+  UseInfo.RegClass != PPC::G8RC_NOX0RegClassID)
+return false;
 
   // Make sure this is not tied to an output register (or otherwise
   // constrained). This is true for ST?UX registers, for example, which
   // are tied to their output registers.
-  if (UseInfo->Constraints != 0)
+  if (UseInfo.Constraints != 0)
 return false;
 
-  MCRegister ZeroReg;
-  if (UseInfo->isLookupPtrRegClass()) {
-bool isPPC64 = Subtarget.isPPC64();
-ZeroReg = isPPC64 ? PPC::ZERO8 : PPC::ZERO;
-  } else {
-ZeroReg = UseInfo->RegClass == PPC::G8RC_NOX0RegClassID ?
-  PPC::ZERO8 : PPC::ZERO;
-  }
+  MCRegister ZeroReg =
+  RegClass == PPC::G8RC_NOX0RegClassID ? PPC::ZERO8 : PPC::ZERO;
 
   LLVM_DEBUG(dbgs() << "Folded immediate zero for: ");
   LLVM_DEBUG(UseMI.dump());
diff --git a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td 
b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
index 8b690b7b833b3..adda91786d19c 100644
--- a/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCRegisterInfo.td
@@ -868,7 +868,11 @@ def crbitm: Operand {
 def PPCRegGxRCNoR0Operand : AsmOperandClass {
   let Name = "RegGxRCNoR0"; let PredicateMethod = "isRegNumber";
 }
-def ptr_rc_nor0 : Operand, PointerLikeRegClass<1> {
+
+def ptr_rc_nor0 : Operand,
+  RegClassByHwMode<
+[PPC32, PPC64],
+[GPRC_NOR0, G8RC_NOX0]> {
   let ParserMatchClass = PPCRegGxRCNoR0Operand;
 }
 
@@ -902,7 +906,9 @@ def memri34_pcrel : Operand { // memri, imm is a 
34-bit value.
 def PPCRegGxRCOperand : AsmOperandClass {
   let Name = "RegGxRC"; let PredicateMethod = "isRegNumber";
 }
-def ptr_rc_idx : Operand, PointerLikeRegClass<0> {
+def ptr_rc_idx : Operand,

[llvm-branch-commits] [llvm] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass (PR #158271)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158271

>From e7ef891fb2c4e21bec4d23af954ad9204f3eb48f Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 8 Sep 2025 14:04:59 +0900
Subject: [PATCH] SPARC: Use RegClassByHwMode instead of PointerLikeRegClass

---
 .../Sparc/Disassembler/SparcDisassembler.cpp  |  8 ---
 llvm/lib/Target/Sparc/SparcInstrInfo.td   | 21 +--
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp 
b/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
index c3d60f3689e1f..e585e5af42d32 100644
--- a/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
+++ b/llvm/lib/Target/Sparc/Disassembler/SparcDisassembler.cpp
@@ -159,14 +159,6 @@ static DecodeStatus DecodeI64RegsRegisterClass(MCInst 
&Inst, unsigned RegNo,
   return DecodeIntRegsRegisterClass(Inst, RegNo, Address, Decoder);
 }
 
-// This is used for the type "ptr_rc", which is either IntRegs or I64Regs
-// depending on SparcRegisterInfo::getPointerRegClass.
-static DecodeStatus DecodePointerLikeRegClass0(MCInst &Inst, unsigned RegNo,
-   uint64_t Address,
-   const MCDisassembler *Decoder) {
-  return DecodeIntRegsRegisterClass(Inst, RegNo, Address, Decoder);
-}
-
 static DecodeStatus DecodeFPRegsRegisterClass(MCInst &Inst, unsigned RegNo,
   uint64_t Address,
   const MCDisassembler *Decoder) {
diff --git a/llvm/lib/Target/Sparc/SparcInstrInfo.td 
b/llvm/lib/Target/Sparc/SparcInstrInfo.td
index 53972d6c105a4..97e7fd7769edb 100644
--- a/llvm/lib/Target/Sparc/SparcInstrInfo.td
+++ b/llvm/lib/Target/Sparc/SparcInstrInfo.td
@@ -95,10 +95,27 @@ def HasFSMULD : Predicate<"!Subtarget->hasNoFSMULD()">;
 // will pick deprecated instructions.
 def UseDeprecatedInsts : Predicate<"Subtarget->useV8DeprecatedInsts()">;
 
+//===--===//
+// HwModes Pattern Stuff
+//===--===//
+
+defvar SPARC32 = DefaultMode;
+def SPARC64 : HwMode<[Is64Bit]>;
+
 
//===--===//
 // Instruction Pattern Stuff
 
//===--===//
 
+def sparc_ptr_rc : RegClassByHwMode<
+  [SPARC32, SPARC64],
+  [IntRegs, I64Regs]>;
+
+// Both cases can use the same decoder method, so avoid the dispatch
+// by hwmode by setting an explicit DecoderMethod
+def ptr_op : RegisterOperand {
+  let DecoderMethod = "DecodeIntRegsRegisterClass";
+}
+
 // FIXME these should have AsmOperandClass.
 def uimm3 : PatLeaf<(imm), [{ return isUInt<3>(N->getZExtValue()); }]>;
 
@@ -178,12 +195,12 @@ def simm13Op : Operand {
 
 def MEMrr : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, ptr_rc);
+  let MIOperandInfo = (ops ptr_op, ptr_op);
   let ParserMatchClass = SparcMEMrrAsmOperand;
 }
 def MEMri : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, simm13Op);
+  let MIOperandInfo = (ops ptr_op, simm13Op);
   let ParserMatchClass = SparcMEMriAsmOperand;
 }
 

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/158273
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits


@@ -46,20 +46,8 @@ unsigned MipsRegisterInfo::getPICCallReg() { return 
Mips::T9; }
 
 const TargetRegisterClass *
 MipsRegisterInfo::getPointerRegClass(unsigned Kind) const {
-  MipsPtrClass PtrClassKind = static_cast(Kind);
-
-  switch (PtrClassKind) {
-  case MipsPtrClass::Default:

s-barannikov wrote:

I guess the enum can be removed.

https://github.com/llvm/llvm-project/pull/158273
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (PR #159645)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits

mtrofin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/159645?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#159645** https://app.graphite.dev/github/pr/llvm/llvm-project/159645?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/159645?utm_source=stack-comment-view-in-graphite";
 target="_blank">(View in Graphite)
* **#159644** https://app.graphite.dev/github/pr/llvm/llvm-project/159644?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`




This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn 
more about https://stacking.dev/?utm_source=stack-comment";>stacking.


https://github.com/llvm/llvm-project/pull/159645
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [IR2Vec] Refactor vocabulary to use section-based storage (PR #158376)

2025-09-18 Thread S. VenkataKeerthy via llvm-branch-commits


@@ -301,12 +380,16 @@ class Vocabulary {
   constexpr static unsigned NumCanonicalEntries =
   MaxOpcodes + MaxCanonicalTypeIDs + MaxOperandKinds + MaxPredicateKinds;
 
-  // Base offsets for slot layout to simplify index computation
+  // Base offsets for flat index computation
   constexpr static unsigned OperandBaseOffset =
   MaxOpcodes + MaxCanonicalTypeIDs;
   constexpr static unsigned PredicateBaseOffset =
   OperandBaseOffset + MaxOperandKinds;
 
+  /// Functions for predicate index calculations

svkeerthy wrote:

These methods and constexpr are being used in the header. 

https://github.com/llvm/llvm-project/pull/158376
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [AllocToken, Clang] Implement TypeHashPointerSplit mode (PR #156840)

2025-09-18 Thread Florian Mayer via llvm-branch-commits


@@ -1272,6 +1272,57 @@ void CodeGenFunction::EmitBoundsCheckImpl(const Expr *E, 
llvm::Value *Bound,
   EmitCheck(std::make_pair(Check, CheckKind), CheckHandler, StaticData, Index);
 }
 
+static bool
+typeContainsPointer(QualType T,
+llvm::SmallPtrSet &VisitedRD,
+bool &IncompleteType) {
+  QualType CanonicalType = T.getCanonicalType();
+  if (CanonicalType->isPointerType())
+return true; // base case
+
+  // Look through typedef chain to check for special types.
+  for (QualType CurrentT = T; const auto *TT = CurrentT->getAs();
+   CurrentT = TT->getDecl()->getUnderlyingType()) {
+const IdentifierInfo *II = TT->getDecl()->getIdentifier();
+if (!II)
+  continue;
+// Special Case: Syntactically uintptr_t is not a pointer; semantically,
+// however, very likely used as such. Therefore, classify uintptr_t as a
+// pointer, too.
+if (II->isStr("uintptr_t"))
+  return true;
+  }
+
+  // The type is an array; check the element type.
+  if (const ArrayType *AT = CanonicalType->getAsArrayTypeUnsafe())

fmayer wrote:

Why do we need this over `dyn_cast`?

We are already using the canonical type here? This function is:

```
inline const ArrayType *Type::getAsArrayTypeUnsafe() const {
  // If this is directly an array type, return it.
  if (const auto *arr = dyn_cast(this))
return arr;

  // If the canonical form of this type isn't the right kind, reject it.
  if (!isa(CanonicalType))
return nullptr;

  // If this is a typedef for the type, strip the typedef off without
  // losing all typedef information.
  return cast(getUnqualifiedDesugaredType());
}
```

https://github.com/llvm/llvm-project/pull/156840
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [IR2Vec] Refactor vocabulary to use section-based storage (PR #158376)

2025-09-18 Thread Mircea Trofin via llvm-branch-commits


@@ -261,55 +262,106 @@ void FlowAwareEmbedder::computeEmbeddings(const 
BasicBlock &BB) const {
   BBVecMap[&BB] = BBVector;
 }
 
+// 
==--===//
+// VocabStorage
+//===--===//
+
+VocabStorage::VocabStorage(std::vector> &&SectionData)
+: Sections(std::move(SectionData)), TotalSize([&] {
+assert(!Sections.empty() && "Vocabulary has no sections");
+assert(!Sections[0].empty() && "First section of vocabulary is empty");
+// Compute total size across all sections
+size_t Size = 0;
+for (const auto &Section : Sections)
+  Size += Section.size();
+return Size;
+  }()),
+  Dimension([&] {
+// Get dimension from the first embedding in the first section - all
+// embeddings must have the same dimension
+assert(!Sections.empty() && "Vocabulary has no sections");
+assert(!Sections[0].empty() && "First section of vocabulary is empty");
+return static_cast(Sections[0][0].size());
+  }()) {}
+
+const Embedding &VocabStorage::const_iterator::operator*() const {
+  assert(SectionId < Storage->Sections.size() && "Invalid section ID");
+  assert(LocalIndex < Storage->Sections[SectionId].size() &&
+ "Local index out of range");
+  return Storage->Sections[SectionId][LocalIndex];
+}
+
+VocabStorage::const_iterator &VocabStorage::const_iterator::operator++() {
+  ++LocalIndex;
+  // Check if we need to move to the next section
+  while (SectionId < Storage->getNumSections() &&
+ LocalIndex >= Storage->Sections[SectionId].size()) {
+LocalIndex = 0;

mtrofin wrote:

Can you explain in a comment the fact that sections may be 0?

https://github.com/llvm/llvm-project/pull/158376
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Use RegClassByHwMode to manage operand VGPR operand constraints (PR #158272)

2025-09-18 Thread via llvm-branch-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff origin/main HEAD --extensions h,cpp -- 
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp 
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp 
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp 
llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.h 
llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp 
llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp 
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp 
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp 
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp llvm/lib/Target/AMDGPU/SIInstrInfo.h 
llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp 
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp 
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
```

:warning:
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing `origin/main` to the base branch/commit you want to compare against.
:warning:





View the diff from clang-format here.


```diff
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 41d537fff..72b24c105 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5973,7 +5973,6 @@ SIInstrInfo::getWholeWaveFunctionSetup(MachineFunction 
&MF) const {
   llvm_unreachable("Couldn't find SI_SETUP_WHOLE_WAVE_FUNC instruction");
 }
 
-
 // FIXME: This should not be an overridable function. All subtarget dependent
 // operand modifications should go through isLookupRegClassByHwMode in the
 // generic handling.

```




https://github.com/llvm/llvm-project/pull/158272
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (PR #158224)

2025-09-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-hexagon

Author: Matt Arsenault (arsenm)


Changes

Both conceptually belong to the same subtarget, so it should not
be necessary to pass in the context TargetRegisterInfo to any
TargetInstrInfo member. Add this reference so those superfluous
arguments can be removed.

Most targets placed their TargetRegisterInfo as a member
in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo,
so unify all targets to look the same.

---

Patch is 45.06 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/158224.diff


50 Files Affected:

- (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+8-3) 
- (modified) llvm/lib/CodeGen/TargetInstrInfo.cpp (+27-41) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AMDGPU/R600InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARC/ARCInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+7-2) 
- (modified) llvm/lib/Target/ARM/ARMInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/ARM/ARMInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/CSKY/CSKYInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/DirectX/DirectXInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.h (+5) 
- (modified) llvm/lib/Target/Hexagon/HexagonSubtarget.cpp (+1-2) 
- (modified) llvm/lib/Target/Hexagon/HexagonSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.h (+4) 
- (modified) llvm/lib/Target/LoongArch/LoongArchSubtarget.cpp (+1-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.cpp (+2-1) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.cpp (+1-5) 
- (modified) llvm/lib/Target/Mips/Mips16InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.h (+6-2) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.cpp (+1-5) 
- (modified) llvm/lib/Target/Mips/MipsSEInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+3-2) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+3) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+1-2) 
- (modified) llvm/lib/Target/SPIRV/SPIRVInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.cpp (+2-2) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.cpp (+2-1) 
- (modified) llvm/unittests/CodeGen/MFCommon.inc (+3-1) 
- (modified) llvm/utils/TableGen/InstrInfoEmitter.cpp (+7-5) 


```diff
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h 
b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 6a624a7052cdd..802cca6022074 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -113,9 +113,12 @@ struct ExtAddrMode {
 ///
 class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
 protected:
-  TargetInstrInfo(unsigned CFSetupOpcode = ~0u, unsigned CFDestroyOpcode = ~0u,
-  unsigned CatchRetOpcode = ~0u, unsigned ReturnOpcode = ~0u)
-  : CallFrameSetupOpcode(CFSetupOpcode),
+  const TargetRegisterInfo &TRI;
+
+  TargetInstrInfo(const TargetRegisterInfo &TRI, unsigned CFSetupOpcode = ~0u,
+  unsigned CFDestroyOpcode = ~0u, unsigned CatchRetOpcode = 
~0u,
+  unsigned ReturnOpcode = ~0u)
+  : TRI(TRI), CallFrameSetupOpcode(CFSetupOpcode),
 CallFrameDestroyOpcode(CFDestroyOpcode), 
CatchRetOpcode(CatchRetOpcode),
 ReturnOpcode(ReturnOpcode) {}
 
@@ -124,6 +127,8 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   TargetInstrInfo &operator=(const TargetInstrInfo &) = delete;
   virtual ~TargetInstrInfo();
 
+  const TargetRegisterInfo &getRegisterInfo() const { return TRI; }
+
   static bool is

[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156841


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Sergei Barannikov via llvm-branch-commits

https://github.com/s-barannikov edited 
https://github.com/llvm/llvm-project/pull/158273
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] Mips: Switch to RegClassByHwMode (PR #158273)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158273

>From 5b8f38bb56b46b9e63fe2031f9b43e4bbba333fb Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sat, 6 Sep 2025 21:14:45 +0900
Subject: [PATCH 1/3] Mips: Switch to RegClassByHwMode

---
 .../Target/Mips/AsmParser/MipsAsmParser.cpp   |  9 +--
 .../Mips/Disassembler/MipsDisassembler.cpp| 24 +++
 llvm/lib/Target/Mips/MicroMipsInstrInfo.td| 12 +++---
 llvm/lib/Target/Mips/Mips.td  | 15 
 llvm/lib/Target/Mips/MipsInstrInfo.td | 20 +++-
 llvm/lib/Target/Mips/MipsRegisterInfo.cpp | 16 ++---
 llvm/lib/Target/Mips/MipsRegisterInfo.td  | 16 +
 7 files changed, 76 insertions(+), 36 deletions(-)

diff --git a/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp 
b/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
index 8a5cb517c94c5..ba70c9e6cb9e8 100644
--- a/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
+++ b/llvm/lib/Target/Mips/AsmParser/MipsAsmParser.cpp
@@ -3706,7 +3706,9 @@ void MipsAsmParser::expandMem16Inst(MCInst &Inst, SMLoc 
IDLoc, MCStreamer &Out,
   MCRegister TmpReg = DstReg;
 
   const MCInstrDesc &Desc = MII.get(OpCode);
-  int16_t DstRegClass = Desc.operands()[StartOp].RegClass;
+  int16_t DstRegClass =
+  MII.getOpRegClassID(Desc.operands()[StartOp],
+  STI->getHwMode(MCSubtargetInfo::HwMode_RegInfo));
   unsigned DstRegClassID =
   getContext().getRegisterInfo()->getRegClass(DstRegClass).getID();
   bool IsGPR = (DstRegClassID == Mips::GPR32RegClassID) ||
@@ -3834,7 +3836,10 @@ void MipsAsmParser::expandMem9Inst(MCInst &Inst, SMLoc 
IDLoc, MCStreamer &Out,
   MCRegister TmpReg = DstReg;
 
   const MCInstrDesc &Desc = MII.get(OpCode);
-  int16_t DstRegClass = Desc.operands()[StartOp].RegClass;
+  int16_t DstRegClass =
+  MII.getOpRegClassID(Desc.operands()[StartOp],
+  STI->getHwMode(MCSubtargetInfo::HwMode_RegInfo));
+
   unsigned DstRegClassID =
   getContext().getRegisterInfo()->getRegClass(DstRegClass).getID();
   bool IsGPR = (DstRegClassID == Mips::GPR32RegClassID) ||
diff --git a/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp 
b/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
index c22b8f61b12dc..705695c74803f 100644
--- a/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
+++ b/llvm/lib/Target/Mips/Disassembler/MipsDisassembler.cpp
@@ -916,6 +916,30 @@ DecodeGPRMM16MovePRegisterClass(MCInst &Inst, unsigned 
RegNo, uint64_t Address,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus DecodeGP32RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeGP64RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeSP32RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
+static DecodeStatus DecodeSP64RegisterClass(MCInst &Inst, unsigned RegNo,
+uint64_t Address,
+const MCDisassembler *Decoder) {
+  llvm_unreachable("this is unused");
+}
+
 static DecodeStatus DecodeGPR32RegisterClass(MCInst &Inst, unsigned RegNo,
  uint64_t Address,
  const MCDisassembler *Decoder) {
diff --git a/llvm/lib/Target/Mips/MicroMipsInstrInfo.td 
b/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
index b3fd8f422f429..b44bf1391b73e 100644
--- a/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
+++ b/llvm/lib/Target/Mips/MicroMipsInstrInfo.td
@@ -57,12 +57,6 @@ def MicroMipsMemGPRMM16AsmOperand : AsmOperandClass {
   let PredicateMethod = "isMemWithGRPMM16Base";
 }
 
-// Define the classes of pointers used by microMIPS.
-// The numbers must match those in MipsRegisterInfo::MipsPtrClass.
-def ptr_gpr16mm_rc : PointerLikeRegClass<1>;
-def ptr_sp_rc : PointerLikeRegClass<2>;
-def ptr_gp_rc : PointerLikeRegClass<3>;
-
 class mem_mm_4_generic : Operand {
   let PrintMethod = "printMemOperand";
   let MIOperandInfo = (ops ptr_gpr16mm_rc, simm4);
@@ -114,7 +108,7 @@ def mem_mm_gp_simm7_lsl2 : Operand {
 
 def mem_mm_9 : Operand {
   let PrintMethod = "printMemOperand";
-  let MIOperandInfo = (ops ptr_rc, simm9);
+  let MIOperandInfo = (ops mips_ptr_rc, simm9);
   let EncoderMethod = "getMemEncodingMMImm9";
   let ParserMatchClass = MipsMemSimmAsmOperand<9>;
   let OperandType = "OPERAND_MEMORY";
@@ -130,7 +124,7 @@ def mem_m

[llvm-branch-commits] [flang][OpenMP] `do concurrent`: support `local` on device (PR #156589)

2025-09-18 Thread Kareem Ergawy via llvm-branch-commits

https://github.com/ergawy updated 
https://github.com/llvm/llvm-project/pull/156589


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Stop using aligned VGPR classes for addRegisterClass (PR #158278)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/158278

>From 96a4d9030b00b30f6aa7d9a70b191c1aaab1f2e8 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 12 Sep 2025 20:45:56 +0900
Subject: [PATCH] AMDGPU: Stop using aligned VGPR classes for addRegisterClass

This is unnecessary. At use emission time, InstrEmitter will
use the common subclass of the value type's register class and
the use instruction register classes. This removes one of the
obstacles to treating special case instructions that do not have
the alignment requirement overly conservatively.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 32 +++
 llvm/test/CodeGen/AMDGPU/mfma-loop.ll | 14 +-
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 363717b017ef0..37beb192293f9 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -111,52 +111,52 @@ SITargetLowering::SITargetLowering(const TargetMachine 
&TM,
   addRegisterClass(MVT::Untyped, V64RegClass);
 
   addRegisterClass(MVT::v3i32, &AMDGPU::SGPR_96RegClass);
-  addRegisterClass(MVT::v3f32, TRI->getVGPRClassForBitWidth(96));
+  addRegisterClass(MVT::v3f32, &AMDGPU::VReg_96RegClass);
 
   addRegisterClass(MVT::v2i64, &AMDGPU::SGPR_128RegClass);
   addRegisterClass(MVT::v2f64, &AMDGPU::SGPR_128RegClass);
 
   addRegisterClass(MVT::v4i32, &AMDGPU::SGPR_128RegClass);
-  addRegisterClass(MVT::v4f32, TRI->getVGPRClassForBitWidth(128));
+  addRegisterClass(MVT::v4f32, &AMDGPU::VReg_128RegClass);
 
   addRegisterClass(MVT::v5i32, &AMDGPU::SGPR_160RegClass);
-  addRegisterClass(MVT::v5f32, TRI->getVGPRClassForBitWidth(160));
+  addRegisterClass(MVT::v5f32, &AMDGPU::VReg_160RegClass);
 
   addRegisterClass(MVT::v6i32, &AMDGPU::SGPR_192RegClass);
-  addRegisterClass(MVT::v6f32, TRI->getVGPRClassForBitWidth(192));
+  addRegisterClass(MVT::v6f32, &AMDGPU::VReg_192RegClass);
 
   addRegisterClass(MVT::v3i64, &AMDGPU::SGPR_192RegClass);
-  addRegisterClass(MVT::v3f64, TRI->getVGPRClassForBitWidth(192));
+  addRegisterClass(MVT::v3f64, &AMDGPU::VReg_192RegClass);
 
   addRegisterClass(MVT::v7i32, &AMDGPU::SGPR_224RegClass);
-  addRegisterClass(MVT::v7f32, TRI->getVGPRClassForBitWidth(224));
+  addRegisterClass(MVT::v7f32, &AMDGPU::VReg_224RegClass);
 
   addRegisterClass(MVT::v8i32, &AMDGPU::SGPR_256RegClass);
-  addRegisterClass(MVT::v8f32, TRI->getVGPRClassForBitWidth(256));
+  addRegisterClass(MVT::v8f32, &AMDGPU::VReg_256RegClass);
 
   addRegisterClass(MVT::v4i64, &AMDGPU::SGPR_256RegClass);
-  addRegisterClass(MVT::v4f64, TRI->getVGPRClassForBitWidth(256));
+  addRegisterClass(MVT::v4f64, &AMDGPU::VReg_256RegClass);
 
   addRegisterClass(MVT::v9i32, &AMDGPU::SGPR_288RegClass);
-  addRegisterClass(MVT::v9f32, TRI->getVGPRClassForBitWidth(288));
+  addRegisterClass(MVT::v9f32, &AMDGPU::VReg_288RegClass);
 
   addRegisterClass(MVT::v10i32, &AMDGPU::SGPR_320RegClass);
-  addRegisterClass(MVT::v10f32, TRI->getVGPRClassForBitWidth(320));
+  addRegisterClass(MVT::v10f32, &AMDGPU::VReg_320RegClass);
 
   addRegisterClass(MVT::v11i32, &AMDGPU::SGPR_352RegClass);
-  addRegisterClass(MVT::v11f32, TRI->getVGPRClassForBitWidth(352));
+  addRegisterClass(MVT::v11f32, &AMDGPU::VReg_352RegClass);
 
   addRegisterClass(MVT::v12i32, &AMDGPU::SGPR_384RegClass);
-  addRegisterClass(MVT::v12f32, TRI->getVGPRClassForBitWidth(384));
+  addRegisterClass(MVT::v12f32, &AMDGPU::VReg_384RegClass);
 
   addRegisterClass(MVT::v16i32, &AMDGPU::SGPR_512RegClass);
-  addRegisterClass(MVT::v16f32, TRI->getVGPRClassForBitWidth(512));
+  addRegisterClass(MVT::v16f32, &AMDGPU::VReg_512RegClass);
 
   addRegisterClass(MVT::v8i64, &AMDGPU::SGPR_512RegClass);
-  addRegisterClass(MVT::v8f64, TRI->getVGPRClassForBitWidth(512));
+  addRegisterClass(MVT::v8f64, &AMDGPU::VReg_512RegClass);
 
   addRegisterClass(MVT::v16i64, &AMDGPU::SGPR_1024RegClass);
-  addRegisterClass(MVT::v16f64, TRI->getVGPRClassForBitWidth(1024));
+  addRegisterClass(MVT::v16f64, &AMDGPU::VReg_1024RegClass);
 
   if (Subtarget->has16BitInsts()) {
 if (Subtarget->useRealTrue16Insts()) {
@@ -188,7 +188,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
   }
 
   addRegisterClass(MVT::v32i32, &AMDGPU::VReg_1024RegClass);
-  addRegisterClass(MVT::v32f32, TRI->getVGPRClassForBitWidth(1024));
+  addRegisterClass(MVT::v32f32, &AMDGPU::VReg_1024RegClass);
 
   computeRegisterProperties(Subtarget->getRegisterInfo());
 
diff --git a/llvm/test/CodeGen/AMDGPU/mfma-loop.ll 
b/llvm/test/CodeGen/AMDGPU/mfma-loop.ll
index 0af655dfbbee9..4bb653848cbf0 100644
--- a/llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+++ b/llvm/test/CodeGen/AMDGPU/mfma-loop.ll
@@ -2399,8 +2399,9 @@ define amdgpu_kernel void 
@test_mfma_nested_loop_zeroinit(ptr addrspace(1) %arg)
 ; GFX90A-NEXT:v_accvgpr_mov_b32 a29, a0
 ; GFX90A-NEXT:v_accvgpr_mov_b32 a30, a0
 ; GFX90A-NEXT

[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_alloc_token_infer() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits


@@ -3352,10 +3352,15 @@ class CodeGenFunction : public CodeGenTypeCache {
   SanitizerAnnotateDebugInfo(ArrayRef 
Ordinals,
  SanitizerHandler Handler);
 
-  /// Emit additional metadata used by the AllocToken instrumentation.
+  /// Emit metadata used by the AllocToken instrumentation.
+  llvm::MDNode *EmitAllocTokenHint(QualType AllocType);

melver wrote:

Yes, LLVM permits sharing MD nodes - MDNode::get should intern nodes, although 
if you want to skip the whole calculation involved, that needs introducing a 
separate lookup table.

Changing it to buildAllocToken.

https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits


@@ -5760,6 +5764,24 @@ bool Sema::BuiltinAllocaWithAlign(CallExpr *TheCall) {
   return false;
 }
 
+bool Sema::BuiltinAllocTokenInfer(CallExpr *TheCall) {

melver wrote:

I'm indifferent here. Switching to a static function.

https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Mehdi Amini via llvm-branch-commits

joker-eph wrote:

> > That isn't in MLIR right now, so that's not generally usable.
> 
> I've added `complex.powi -> complex.pow` conversion to the 
> `ComplexToStandard` MLIR pass.

Thanks, LG!

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156841
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][sanitizer] fix msghdr for musl (PR #159551)

2025-09-18 Thread Deák Lajos via llvm-branch-commits

deaklajos wrote:

@vitalybuka 

https://github.com/llvm/llvm-project/pull/159551
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/21.x: MC: Better handle backslash-escaped symbols (PR #159420)

2025-09-18 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

The diff here is fairly large, but also very mechanical. This fixes a 
regression for the Rust defmt crate with LLVM 21.

https://github.com/llvm/llvm-project/pull/159420
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] CodeGen: Emit .prefalign directives based on the prefalign attribute. (PR #155529)

2025-09-18 Thread Eli Friedman via llvm-branch-commits

https://github.com/efriedma-quic commented:

Can you split "implement basic codegen support for prefalign" (the bits which 
don't depend on the .prefalign directive) into a separate patch?  It's not 
clear what's causing the test changes here.

https://github.com/llvm/llvm-project/pull/155529
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/21.x: [clang][docs] Fix implicit-int-conversion-on-negation typos (PR #156815)

2025-09-18 Thread via llvm-branch-commits

github-actions[bot] wrote:

@correctmost (or anyone else). If you would like to add a note about this fix 
in the release notes (completely optional). Please reply to this comment with a 
one or two sentence description of the fix.  When you are done, please add the 
release:note label to this PR. 

https://github.com/llvm/llvm-project/pull/156815
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Remarks] Restructure bitstream remarks to be fully standalone (PR #156715)

2025-09-18 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/156715

>From d33b31f01aeeb9005581b0a2a1f21c898463aa02 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Thu, 18 Sep 2025 12:34:55 +0100
Subject: [PATCH 1/2] Replace bitstream blobs by yaml

Created using spr 1.3.7-wip
---
 llvm/lib/Remarks/BitstreamRemarkParser.cpp|   5 +-
 .../dsymutil/ARM/remarks-linking-bundle.test  |  13 +-
 .../basic1.macho.remarks.arm64.opt.bitstream  | Bin 824 -> 0 bytes
 .../basic1.macho.remarks.arm64.opt.yaml   |  47 +
 ...c1.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic2.macho.remarks.arm64.opt.bitstream  | Bin 1696 -> 0 bytes
 .../basic2.macho.remarks.arm64.opt.yaml   | 194 ++
 ...c2.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic3.macho.remarks.arm64.opt.bitstream  | Bin 1500 -> 0 bytes
 .../basic3.macho.remarks.arm64.opt.yaml   | 181 
 ...c3.macho.remarks.empty.arm64.opt.bitstream |   0
 .../fat.macho.remarks.x86_64.opt.bitstream| Bin 820 -> 0 bytes
 .../remarks/fat.macho.remarks.x86_64.opt.yaml |  53 +
 .../fat.macho.remarks.x86_64h.opt.bitstream   | Bin 820 -> 0 bytes
 .../fat.macho.remarks.x86_64h.opt.yaml|  53 +
 .../X86/remarks-linking-fat-bundle.test   |   8 +-
 16 files changed, 543 insertions(+), 11 deletions(-)
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.yaml

diff --git a/llvm/lib/Remarks/BitstreamRemarkParser.cpp 
b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
index 63b16bd2df0ec..2b27a0f661d88 100644
--- a/llvm/lib/Remarks/BitstreamRemarkParser.cpp
+++ b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
@@ -411,9 +411,8 @@ Error BitstreamRemarkParser::processExternalFilePath() {
 return E;
 
   if (ContainerType != BitstreamRemarkContainerType::RemarksFile)
-return error(
-"Error while parsing external file's BLOCK_META: wrong container "
-"type.");
+return ParserHelper->MetaHelper.error(
+"Wrong container type in external file.");
 
   return Error::success();
 }
diff --git a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test 
b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
index 09a60d7d044c6..e1b04455b0d9d 100644
--- a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
+++ b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
@@ -1,22 +1,25 @@
 RUN: rm -rf %t
-RUN: mkdir -p %t
+RUN: mkdir -p %t/private/tmp/remarks
 RUN: cat %p/../Inputs/remarks/basic.macho.remarks.arm64> 
%t/basic.macho.remarks.arm64
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 
-RUN: dsymutil -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.remarks.arm64
+RUN: dsymutil -oso-prepend-path=%p/../Inputs -remarks-prepend-path=%t 
%t/basic.macho.remarks.arm64
 
 Check that the remark file in the bundle exists and is sane:
 RUN: llvm-bcanalyzer -dump 
%t/basic.macho.remarks.arm64.dSYM/Contents/Resources/Remarks/basic.macho.remarks.arm64
 | FileCheck %s
 
-RUN: dsymutil --linker parallel -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.r

[llvm-branch-commits] [llvm] [AArch64] Prepare for split ZPR and PPR area allocation (NFCI) (PR #142391)

2025-09-18 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue updated 
https://github.com/llvm/llvm-project/pull/142391

>From 0dfb0725e2a4f82af47821946bfbbfcd7ed08e10 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell 
Date: Thu, 8 May 2025 17:38:27 +
Subject: [PATCH] [AArch64] Prepare for split ZPR and PPR area allocation
 (NFCI)

This patch attempts to refactor AArch64FrameLowering to allow the size
of the ZPR and PPR areas to be calculated separately. This will be used
by a subsequent patch to support allocating ZPRs and PPRs to separate
areas. This patch should be an NFC and is split out to make later
functional changes easier to spot.
---
 .../Target/AArch64/AArch64FrameLowering.cpp   | 220 ++
 .../lib/Target/AArch64/AArch64FrameLowering.h |  20 +-
 .../AArch64/AArch64MachineFunctionInfo.cpp|  20 +-
 .../AArch64/AArch64MachineFunctionInfo.h  |  63 ++---
 .../AArch64/AArch64PrologueEpilogue.cpp   | 128 ++
 .../Target/AArch64/AArch64RegisterInfo.cpp|   4 +-
 .../DebugInfo/AArch64/asan-stack-vars.mir |   3 +-
 .../compiler-gen-bbs-livedebugvalues.mir  |   3 +-
 8 files changed, 288 insertions(+), 173 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 20b0d697827c5..f5f7b6522ddec 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -324,6 +324,36 @@ 
AArch64FrameLowering::getArgumentStackToRestore(MachineFunction &MF,
 static bool produceCompactUnwindFrame(const AArch64FrameLowering &,
   MachineFunction &MF);
 
+enum class AssignObjectOffsets { No, Yes };
+/// Process all the SVE stack objects and the SVE stack size and offsets for
+/// each object. If AssignOffsets is "Yes", the offsets get assigned (and SVE
+/// stack sizes set). Returns the size of the SVE stack.
+static SVEStackSizes determineSVEStackSizes(MachineFunction &MF,
+AssignObjectOffsets AssignOffsets,
+bool SplitSVEObjects = false);
+
+static unsigned getStackHazardSize(const MachineFunction &MF) {
+  return MF.getSubtarget().getStreamingHazardSize();
+}
+
+/// Returns true if PPRs are spilled as ZPRs.
+static bool arePPRsSpilledAsZPR(const MachineFunction &MF) {
+  return MF.getSubtarget().getRegisterInfo()->getSpillSize(
+ AArch64::PPRRegClass) == 16;
+}
+
+StackOffset
+AArch64FrameLowering::getZPRStackSize(const MachineFunction &MF) const {
+  const AArch64FunctionInfo *AFI = MF.getInfo();
+  return StackOffset::getScalable(AFI->getStackSizeZPR());
+}
+
+StackOffset
+AArch64FrameLowering::getPPRStackSize(const MachineFunction &MF) const {
+  const AArch64FunctionInfo *AFI = MF.getInfo();
+  return StackOffset::getScalable(AFI->getStackSizePPR());
+}
+
 // Conservatively, returns true if the function is likely to have SVE vectors
 // on the stack. This function is safe to be called before callee-saves or
 // object offsets have been determined.
@@ -482,13 +512,6 @@ AArch64FrameLowering::getFixedObjectSize(const 
MachineFunction &MF,
   }
 }
 
-/// Returns the size of the entire SVE stackframe (calleesaves + spills).
-StackOffset
-AArch64FrameLowering::getSVEStackSize(const MachineFunction &MF) const {
-  const AArch64FunctionInfo *AFI = MF.getInfo();
-  return StackOffset::getScalable((int64_t)AFI->getStackSizeSVE());
-}
-
 bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
   if (!EnableRedZone)
 return false;
@@ -514,7 +537,7 @@ bool AArch64FrameLowering::canUseRedZone(const 
MachineFunction &MF) const {
  !Subtarget.hasSVE();
 
   return !(MFI.hasCalls() || hasFP(MF) || NumBytes > RedZoneSize ||
-   getSVEStackSize(MF) || LowerQRegCopyThroughMem);
+   AFI->hasSVEStackSize() || LowerQRegCopyThroughMem);
 }
 
 /// hasFPImpl - Return true if the specified function should have a dedicated
@@ -557,7 +580,7 @@ bool AArch64FrameLowering::hasFPImpl(const MachineFunction 
&MF) const {
   // CFA in either of these cases.
   if (AFI.needsDwarfUnwindInfo(MF) &&
   ((requiresSaveVG(MF) || AFI.getSMEFnAttrs().hasStreamingBody()) &&
-   (!AFI.hasCalculatedStackSizeSVE() || AFI.getStackSizeSVE() > 0)))
+   (!AFI.hasCalculatedStackSizeSVE() || AFI.hasSVEStackSize(
 return true;
   // With large callframes around we may need to use FP to access the 
scavenging
   // emergency spillslot.
@@ -1126,10 +1149,6 @@ static bool isTargetWindows(const MachineFunction &MF) {
   return MF.getSubtarget().isTargetWindows();
 }
 
-static unsigned getStackHazardSize(const MachineFunction &MF) {
-  return MF.getSubtarget().getStreamingHazardSize();
-}
-
 void AArch64FrameLowering::emitPacRetPlusLeafHardening(
 MachineFunction &MF) const {
   const AArch64Subtarget &Subtarget = MF.getSubtarget();
@@ -1212,7 +1231,9 @@ AArch64FrameLowering::getFrameIndexReferenceFromSP(const 

[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Slava Zakharin via llvm-branch-commits

https://github.com/vzakhari commented:

LGTM with some final comments.

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Slava Zakharin via llvm-branch-commits


@@ -1272,7 +1272,18 @@ mlir::Value genMathOp(fir::FirOpBuilder &builder, 
mlir::Location loc,
 LLVM_DEBUG(llvm::dbgs() << "Generating '" << mathLibFuncName
 << "' operation with type ";
mathLibFuncType.dump(); llvm::dbgs() << "\n");
-result = T::create(builder, loc, args);
+if constexpr (std::is_same_v) {
+  auto resultType = mathLibFuncType.getResult(0);
+  result = T::create(builder, loc, resultType, args);
+} else if constexpr (std::is_same_v) {
+  auto resultType = mathLibFuncType.getResult(0);
+  auto fmfAttr = mlir::arith::FastMathFlagsAttr::get(
+  builder.getContext(), builder.getFastMathFlags());
+  result = builder.create(loc, resultType, args[0],
+ args[1], fmfAttr);
+} else {

vzakhari wrote:

Do we really need all this code?  I believe just a simple `T::create(buider, 
loc, args)` should work, because of the type constraints in the operations 
definitions.

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Slava Zakharin via llvm-branch-commits


@@ -175,12 +176,20 @@ PowIStrengthReduction::matchAndRewrite(
 
   Value one;
   Type opType = getElementTypeOrSelf(op.getType());
-  if constexpr (std::is_same_v)
+  if constexpr (std::is_same_v) {
 one = arith::ConstantOp::create(rewriter, loc,
 rewriter.getFloatAttr(opType, 1.0));
-  else
+  } else if constexpr (std::is_same_v) {
+auto complexTy = cast(opType);
+Type elementType = complexTy.getElementType();
+auto realPart = rewriter.getFloatAttr(elementType, 1.0);
+auto imagPart = rewriter.getFloatAttr(elementType, 0.0);
+one = rewriter.create(

vzakhari wrote:

I believe all the `create` methods of the rewriter will become deprecated soon, 
so `complex::ConstantOp::create` is a better alternative.  There are other 
cases below.

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Slava Zakharin via llvm-branch-commits

https://github.com/vzakhari edited 
https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] AMDGPU: Move spill pseudo special case out of adjustAllocatableRegClass (PR #158246)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/158246

This is special for the same reason av_mov_b64_imm_pseudo is special.

>From e5032294b4979c4b7f2367cee30c24d42901714b Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 5 Sep 2025 17:27:37 +0900
Subject: [PATCH] AMDGPU: Move spill pseudo special case out of
 adjustAllocatableRegClass

This is special for the same reason av_mov_b64_imm_pseudo is special.
---
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | 8 +++-
 llvm/lib/Target/AMDGPU/SIInstrInfo.h   | 6 --
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 5c3340703ba3b..b1a61886802f4 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -5976,8 +5976,7 @@ SIInstrInfo::getWholeWaveFunctionSetup(MachineFunction 
&MF) const {
 static const TargetRegisterClass *
 adjustAllocatableRegClass(const GCNSubtarget &ST, const SIRegisterInfo &RI,
   const MCInstrDesc &TID, unsigned RCID) {
-  if (!ST.hasGFX90AInsts() && (((TID.mayLoad() || TID.mayStore()) &&
-!(TID.TSFlags & SIInstrFlags::Spill {
+  if (!ST.hasGFX90AInsts() && (((TID.mayLoad() || TID.mayStore() {
 switch (RCID) {
 case AMDGPU::AV_32RegClassID:
   RCID = AMDGPU::VGPR_32RegClassID;
@@ -6012,10 +6011,9 @@ const TargetRegisterClass 
*SIInstrInfo::getRegClass(const MCInstrDesc &TID,
   if (OpNum >= TID.getNumOperands())
 return nullptr;
   auto RegClass = TID.operands()[OpNum].RegClass;
-  if (TID.getOpcode() == AMDGPU::AV_MOV_B64_IMM_PSEUDO) {
-// Special pseudos have no alignment requirement
+  // Special pseudos have no alignment requirement
+  if (TID.getOpcode() == AMDGPU::AV_MOV_B64_IMM_PSEUDO || isSpill(TID))
 return RI.getRegClass(RegClass);
-  }
 
   return adjustAllocatableRegClass(ST, RI, TID, RegClass);
 }
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index f7dde2b90b68e..e0373e7768435 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -797,10 +797,12 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
 return get(Opcode).TSFlags & SIInstrFlags::Spill;
   }
 
-  static bool isSpill(const MachineInstr &MI) {
-return MI.getDesc().TSFlags & SIInstrFlags::Spill;
+  static bool isSpill(const MCInstrDesc &Desc) {
+return Desc.TSFlags & SIInstrFlags::Spill;
   }
 
+  static bool isSpill(const MachineInstr &MI) { return isSpill(MI.getDesc()); }
+
   static bool isWWMRegSpillOpcode(uint16_t Opcode) {
 return Opcode == AMDGPU::SI_SPILL_WWM_V32_SAVE ||
Opcode == AMDGPU::SI_SPILL_WWM_AV32_SAVE ||

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] CodeGen: Keep reference to TargetRegisterInfo in TargetInstrInfo (PR #158224)

2025-09-18 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/158224

Both conceptually belong to the same subtarget, so it should not
be necessary to pass in the context TargetRegisterInfo to any
TargetInstrInfo member. Add this reference so those superfluous
arguments can be removed.

Most targets placed their TargetRegisterInfo as a member
in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo,
so unify all targets to look the same.

>From 532af14dba99fbaf1ccfbd4ac63e22fce9aa371b Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Fri, 12 Sep 2025 14:11:48 +0900
Subject: [PATCH] CodeGen: Keep reference to TargetRegisterInfo in
 TargetInstrInfo

Both conceptually belong to the same subtarget, so it should not
be necessary to pass in the context TargetRegisterInfo to any
TargetInstrInfo member. Add this reference so those superfluous
arguments can be removed.

Most targets placed their TargetRegisterInfo as a member
in TargetInstrInfo. A few had this owned by the TargetSubtargetInfo,
so unify all targets to look the same.
---
 llvm/include/llvm/CodeGen/TargetInstrInfo.h   | 11 ++-
 llvm/lib/CodeGen/TargetInstrInfo.cpp  | 68 ---
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |  2 +-
 llvm/lib/Target/AMDGPU/R600InstrInfo.cpp  |  2 +-
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp|  3 +-
 llvm/lib/Target/ARC/ARCInstrInfo.cpp  |  3 +-
 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp  |  5 +-
 llvm/lib/Target/ARM/ARMBaseInstrInfo.h|  9 ++-
 llvm/lib/Target/ARM/ARMInstrInfo.cpp  |  3 +-
 llvm/lib/Target/ARM/ARMInstrInfo.h|  2 +-
 llvm/lib/Target/ARM/Thumb1InstrInfo.cpp   |  2 +-
 llvm/lib/Target/ARM/Thumb1InstrInfo.h |  2 +-
 llvm/lib/Target/ARM/Thumb2InstrInfo.cpp   |  2 +-
 llvm/lib/Target/ARM/Thumb2InstrInfo.h |  2 +-
 llvm/lib/Target/AVR/AVRInstrInfo.cpp  |  4 +-
 llvm/lib/Target/BPF/BPFInstrInfo.cpp  |  2 +-
 llvm/lib/Target/CSKY/CSKYInstrInfo.cpp|  2 +-
 llvm/lib/Target/DirectX/DirectXInstrInfo.cpp  |  2 +-
 llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp  |  4 +-
 llvm/lib/Target/Hexagon/HexagonInstrInfo.h|  5 ++
 llvm/lib/Target/Hexagon/HexagonSubtarget.cpp  |  3 +-
 llvm/lib/Target/Hexagon/HexagonSubtarget.h|  3 +-
 llvm/lib/Target/Lanai/LanaiInstrInfo.cpp  |  3 +-
 .../Target/LoongArch/LoongArchInstrInfo.cpp   |  4 +-
 .../lib/Target/LoongArch/LoongArchInstrInfo.h |  4 ++
 .../Target/LoongArch/LoongArchSubtarget.cpp   |  2 +-
 .../lib/Target/LoongArch/LoongArchSubtarget.h |  3 +-
 llvm/lib/Target/MSP430/MSP430InstrInfo.cpp|  3 +-
 llvm/lib/Target/Mips/Mips16InstrInfo.cpp  |  6 +-
 llvm/lib/Target/Mips/Mips16InstrInfo.h|  2 +-
 llvm/lib/Target/Mips/MipsInstrInfo.cpp|  5 +-
 llvm/lib/Target/Mips/MipsInstrInfo.h  |  8 ++-
 llvm/lib/Target/Mips/MipsSEInstrInfo.cpp  |  6 +-
 llvm/lib/Target/Mips/MipsSEInstrInfo.h|  2 +-
 llvm/lib/Target/NVPTX/NVPTXInstrInfo.cpp  |  2 +-
 llvm/lib/Target/PowerPC/PPCInstrInfo.cpp  |  2 +-
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp  |  5 +-
 llvm/lib/Target/RISCV/RISCVInstrInfo.h|  3 +
 llvm/lib/Target/RISCV/RISCVSubtarget.cpp  |  2 +-
 llvm/lib/Target/RISCV/RISCVSubtarget.h|  3 +-
 llvm/lib/Target/SPIRV/SPIRVInstrInfo.cpp  |  2 +-
 llvm/lib/Target/Sparc/SparcInstrInfo.cpp  |  4 +-
 llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp  |  2 +-
 llvm/lib/Target/VE/VEInstrInfo.cpp|  2 +-
 .../WebAssembly/WebAssemblyInstrInfo.cpp  |  2 +-
 llvm/lib/Target/X86/X86InstrInfo.cpp  |  2 +-
 llvm/lib/Target/XCore/XCoreInstrInfo.cpp  |  2 +-
 llvm/lib/Target/Xtensa/XtensaInstrInfo.cpp|  3 +-
 llvm/unittests/CodeGen/MFCommon.inc   |  4 +-
 llvm/utils/TableGen/InstrInfoEmitter.cpp  | 12 ++--
 50 files changed, 127 insertions(+), 114 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h 
b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 6a624a7052cdd..802cca6022074 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -113,9 +113,12 @@ struct ExtAddrMode {
 ///
 class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
 protected:
-  TargetInstrInfo(unsigned CFSetupOpcode = ~0u, unsigned CFDestroyOpcode = ~0u,
-  unsigned CatchRetOpcode = ~0u, unsigned ReturnOpcode = ~0u)
-  : CallFrameSetupOpcode(CFSetupOpcode),
+  const TargetRegisterInfo &TRI;
+
+  TargetInstrInfo(const TargetRegisterInfo &TRI, unsigned CFSetupOpcode = ~0u,
+  unsigned CFDestroyOpcode = ~0u, unsigned CatchRetOpcode = 
~0u,
+  unsigned ReturnOpcode = ~0u)
+  : TRI(TRI), CallFrameSetupOpcode(CFSetupOpcode),
 CallFrameDestroyOpcode(CFDestroyOpcode), 
CatchRetOpcode(CatchRetOpcode),
 ReturnOpcode(ReturnOpcode) {}
 
@@ -124,6 +127,8 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   TargetInstrInfo &operator=(co

[llvm-branch-commits] [compiler-rt] Backport AArch64 sanitizer fixes to 21.x. (PR #157848)

2025-09-18 Thread Michał Górny via llvm-branch-commits

https://github.com/mgorny milestoned 
https://github.com/llvm/llvm-project/pull/157848
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/158722

>From 6976910364aa2fe18603aefcb27b10bd0120513d Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Mon, 15 Sep 2025 20:35:29 +0100
Subject: [PATCH 1/7] Add complex.powi op.

---
 flang/lib/Optimizer/Builder/IntrinsicCall.cpp | 20 ++--
 .../Transforms/ConvertComplexPow.cpp  | 94 +--
 flang/test/Lower/HLFIR/binary-ops.f90 |  2 +-
 .../test/Lower/Intrinsics/pow_complex16i.f90  |  2 +-
 .../test/Lower/Intrinsics/pow_complex16k.f90  |  2 +-
 flang/test/Lower/amdgcn-complex.f90   |  9 ++
 flang/test/Lower/power-operator.f90   |  9 +-
 .../mlir/Dialect/Complex/IR/ComplexOps.td | 26 +
 .../ComplexToROCDLLibraryCalls.cpp| 41 +++-
 .../Transforms/AlgebraicSimplification.cpp| 24 +++--
 .../Dialect/Math/Transforms/CMakeLists.txt|  1 +
 .../complex-to-rocdl-library-calls.mlir   | 14 +++
 mlir/test/Dialect/Complex/powi-simplify.mlir  | 20 
 13 files changed, 188 insertions(+), 76 deletions(-)
 create mode 100644 mlir/test/Dialect/Complex/powi-simplify.mlir

diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp 
b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
index 466458c05dba7..74a4e8f85c8ff 100644
--- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
@@ -1331,14 +1331,20 @@ mlir::Value genComplexPow(fir::FirOpBuilder &builder, 
mlir::Location loc,
 return genLibCall(builder, loc, mathOp, mathLibFuncType, args);
   auto complexTy = mlir::cast(mathLibFuncType.getInput(0));
   mlir::Value exp = args[1];
-  if (!mlir::isa(exp.getType())) {
-auto realTy = complexTy.getElementType();
-mlir::Value realExp = builder.createConvert(loc, realTy, exp);
-mlir::Value zero = builder.createRealConstant(loc, realTy, 0);
-exp =
-builder.create(loc, complexTy, realExp, zero);
+  mlir::Value result;
+  if (mlir::isa(exp.getType()) ||
+  mlir::isa(exp.getType())) {
+result = builder.create(loc, args[0], exp);
+  } else {
+if (!mlir::isa(exp.getType())) {
+  auto realTy = complexTy.getElementType();
+  mlir::Value realExp = builder.createConvert(loc, realTy, exp);
+  mlir::Value zero = builder.createRealConstant(loc, realTy, 0);
+  exp = builder.create(loc, complexTy, realExp,
+zero);
+}
+result = builder.create(loc, args[0], exp);
   }
-  mlir::Value result = builder.create(loc, args[0], exp);
   result = builder.createConvert(loc, mathLibFuncType.getResult(0), result);
   return result;
 }
diff --git a/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp 
b/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
index 78f9d9e4f639a..d76451459def9 100644
--- a/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
+++ b/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
@@ -58,63 +58,57 @@ void ConvertComplexPowPass::runOnOperation() {
   ModuleOp mod = getOperation();
   fir::FirOpBuilder builder(mod, fir::getKindMapping(mod));
 
-  mod.walk([&](complex::PowOp op) {
+  mod.walk([&](complex::PowiOp op) {
 builder.setInsertionPoint(op);
 Location loc = op.getLoc();
 auto complexTy = cast(op.getType());
 auto elemTy = complexTy.getElementType();
-
 Value base = op.getLhs();
-Value rhs = op.getRhs();
-
-Value intExp;
-if (auto create = rhs.getDefiningOp()) {
-  if (isZero(create.getImaginary())) {
-if (auto conv = create.getReal().getDefiningOp()) {
-  if (auto intTy = dyn_cast(conv.getValue().getType()))
-intExp = conv.getValue();
-}
-  }
-}
-
+Value intExp = op.getRhs();
 func::FuncOp callee;
-SmallVector args;
-if (intExp) {
-  unsigned realBits = cast(elemTy).getWidth();
-  unsigned intBits = cast(intExp.getType()).getWidth();
-  auto funcTy = builder.getFunctionType(
-  {complexTy, builder.getIntegerType(intBits)}, {complexTy});
-  if (realBits == 32 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowi), funcTy);
-  else if (realBits == 32 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowk), funcTy);
-  else if (realBits == 64 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowi), funcTy);
-  else if (realBits == 64 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowk), funcTy);
-  else if (realBits == 128 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowi), funcTy);
-  else if (realBits == 128 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowk), funcTy);
-  else
-return;
-  args = {base, intExp};
-} else {
-  unsigned realBits = cast(elemTy).getWidth();
-  auto funcTy =
-  builder.getFunctionType({complexTy, complexTy}, {complexTy});
-  

[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Akash Banerjee via llvm-branch-commits


@@ -1272,7 +1272,18 @@ mlir::Value genMathOp(fir::FirOpBuilder &builder, 
mlir::Location loc,
 LLVM_DEBUG(llvm::dbgs() << "Generating '" << mathLibFuncName
 << "' operation with type ";
mathLibFuncType.dump(); llvm::dbgs() << "\n");
-result = T::create(builder, loc, args);
+if constexpr (std::is_same_v) {
+  auto resultType = mathLibFuncType.getResult(0);
+  result = T::create(builder, loc, resultType, args);
+} else if constexpr (std::is_same_v) {
+  auto resultType = mathLibFuncType.getResult(0);
+  auto fmfAttr = mlir::arith::FastMathFlagsAttr::get(
+  builder.getContext(), builder.getFastMathFlags());
+  result = builder.create(loc, resultType, args[0],
+ args[1], fmfAttr);
+} else {

TIFitis wrote:

You're right, I've simplified it. Thanks for catching.

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Akash Banerjee via llvm-branch-commits


@@ -175,12 +176,20 @@ PowIStrengthReduction::matchAndRewrite(
 
   Value one;
   Type opType = getElementTypeOrSelf(op.getType());
-  if constexpr (std::is_same_v)
+  if constexpr (std::is_same_v) {
 one = arith::ConstantOp::create(rewriter, loc,
 rewriter.getFloatAttr(opType, 1.0));
-  else
+  } else if constexpr (std::is_same_v) {
+auto complexTy = cast(opType);
+Type elementType = complexTy.getElementType();
+auto realPart = rewriter.getFloatAttr(elementType, 1.0);
+auto imagPart = rewriter.getFloatAttr(elementType, 0.0);
+one = rewriter.create(

TIFitis wrote:

Done.

https://github.com/llvm/llvm-project/pull/158722
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoopUnroll] Fix block frequencies when no runtime (PR #157754)

2025-09-18 Thread Joel E. Denny via llvm-branch-commits

https://github.com/jdenny-ornl edited 
https://github.com/llvm/llvm-project/pull/157754
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][sanitizer] fix msghdr for musl (PR #159551)

2025-09-18 Thread via llvm-branch-commits

github-actions[bot] wrote:

⚠️ We detected that you are using a GitHub private e-mail address to contribute 
to the repo. Please turn off [Keep my email addresses 
private](https://github.com/settings/emails) setting in your account. See 
[LLVM Developer 
Policy](https://llvm.org/docs/DeveloperPolicy.html#email-addresses) and [LLVM 
Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it)
 for more information.

https://github.com/llvm/llvm-project/pull/159551
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_alloc_token_infer() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits


@@ -1274,6 +1274,12 @@ def AllocaWithAlignUninitialized : Builtin {
   let Prototype = "void*(size_t, _Constant size_t)";
 }
 
+def AllocTokenInfer : Builtin {
+  let Spellings = ["__builtin_alloc_token_infer"];

melver wrote:

Renaming to __builtin_infer_alloc_token

https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [AllocToken, Clang] Implement TypeHashPointerSplit mode (PR #156840)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156840

>From 14c75441e84aa32e4f5876598b9a2c59d4ecbe65 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 21:32:21 +0200
Subject: [PATCH 1/2] fixup! fix for incomplete types

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGExpr.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 288b41bc42203..455de644daf00 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1289,6 +1289,7 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   // Check if QualType contains a pointer. Implements a simple DFS to
   // recursively check if a type contains a pointer type.
   llvm::SmallPtrSet VisitedRD;
+  bool IncompleteType = false;
   auto TypeContainsPtr = [&](auto &&self, QualType T) -> bool {
 QualType CanonicalType = T.getCanonicalType();
 if (CanonicalType->isPointerType())
@@ -1312,6 +1313,10 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   return self(self, AT->getElementType());
 // The type is a struct, class, or union.
 if (const RecordDecl *RD = CanonicalType->getAsRecordDecl()) {
+  if (!RD->isCompleteDefinition()) {
+IncompleteType = true;
+return false;
+  }
   if (!VisitedRD.insert(RD).second)
 return false; // already visited
   // Check all fields.
@@ -1333,6 +1338,8 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
 return false;
   };
   const bool ContainsPtr = TypeContainsPtr(TypeContainsPtr, AllocType);
+  if (!ContainsPtr && IncompleteType)
+return nullptr;
   auto *ContainsPtrC = Builder.getInt1(ContainsPtr);
   auto *ContainsPtrMD = MDB.createConstant(ContainsPtrC);
 

>From 7f706618ddc40375d4085bc2ebe03f02ec78823a Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 21:58:01 +0200
Subject: [PATCH 2/2] fixup!

Created using spr 1.3.8-beta.1
---
 clang/lib/CodeGen/CGExpr.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 455de644daf00..e7a0e7696e204 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -1339,7 +1339,7 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   };
   const bool ContainsPtr = TypeContainsPtr(TypeContainsPtr, AllocType);
   if (!ContainsPtr && IncompleteType)
-return nullptr;
+return;
   auto *ContainsPtrC = Builder.getInt1(ContainsPtr);
   auto *ContainsPtrMD = MDB.createConstant(ContainsPtrC);
 

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156842

>From 48227c8f7712b2dc807b252d18353c91905b1fb5 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 17:19:04 +0200
Subject: [PATCH] fixup!

Created using spr 1.3.8-beta.1
---
 llvm/lib/Transforms/Instrumentation/AllocToken.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp 
b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
index d5ac3035df71b..3a28705d87523 100644
--- a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -151,7 +151,8 @@ STATISTIC(NumAllocations, "Allocations found");
 /// Expected format is: !{, }
 MDNode *getAllocTokenHintMetadata(const CallBase &CB) {
   MDNode *Ret = nullptr;
-  if (auto *II = dyn_cast<IntrinsicInst>(&CB)) {
+  if (auto *II = dyn_cast<IntrinsicInst>(&CB);
+  II && II->getIntrinsicID() == Intrinsic::alloc_token_id) {
auto *MDV = cast<MetadataAsValue>(II->getArgOperand(0));
Ret = cast<MDNode>(MDV->getMetadata());
 // If the intrinsic has an empty MDNode, type inference failed.
@@ -358,7 +359,7 @@ bool AllocToken::instrumentFunction(Function &F) {
   // Collect all allocation calls to avoid iterator invalidation.
   for (Instruction &I : instructions(F)) {
 // Collect all alloc_token_* intrinsics.
-if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I);
+if (auto *II = dyn_cast<IntrinsicInst>(&I);
 II && II->getIntrinsicID() == Intrinsic::alloc_token_id) {
   IntrinsicInsts.emplace_back(II);
   continue;

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156842

>From 48227c8f7712b2dc807b252d18353c91905b1fb5 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Mon, 8 Sep 2025 17:19:04 +0200
Subject: [PATCH] fixup!

Created using spr 1.3.8-beta.1
---
 llvm/lib/Transforms/Instrumentation/AllocToken.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp 
b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
index d5ac3035df71b..3a28705d87523 100644
--- a/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
+++ b/llvm/lib/Transforms/Instrumentation/AllocToken.cpp
@@ -151,7 +151,8 @@ STATISTIC(NumAllocations, "Allocations found");
 /// Expected format is: !{, }
 MDNode *getAllocTokenHintMetadata(const CallBase &CB) {
   MDNode *Ret = nullptr;
-  if (auto *II = dyn_cast<IntrinsicInst>(&CB)) {
+  if (auto *II = dyn_cast<IntrinsicInst>(&CB);
+  II && II->getIntrinsicID() == Intrinsic::alloc_token_id) {
auto *MDV = cast<MetadataAsValue>(II->getArgOperand(0));
Ret = cast<MDNode>(MDV->getMetadata());
 // If the intrinsic has an empty MDNode, type inference failed.
@@ -358,7 +359,7 @@ bool AllocToken::instrumentFunction(Function &F) {
   // Collect all allocation calls to avoid iterator invalidation.
   for (Instruction &I : instructions(F)) {
 // Collect all alloc_token_* intrinsics.
-if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I);
+if (auto *II = dyn_cast<IntrinsicInst>(&I);
 II && II->getIntrinsicID() == Intrinsic::alloc_token_id) {
   IntrinsicInsts.emplace_back(II);
   continue;

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156841


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156841


___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Clang] Introduce -fsanitize=alloc-token (PR #156839)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156839

>From b3653330c2c39ebaa094670f11afb0f9d36b9de2 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Thu, 4 Sep 2025 12:07:26 +0200
Subject: [PATCH] fixup! Insert AllocToken into index.rst

Created using spr 1.3.8-beta.1
---
 clang/docs/index.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/docs/index.rst b/clang/docs/index.rst
index be654af57f890..aa2b3a73dc11b 100644
--- a/clang/docs/index.rst
+++ b/clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
+   AllocToken
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [Clang] Introduce -fsanitize=alloc-token (PR #156839)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver updated 
https://github.com/llvm/llvm-project/pull/156839

>From b3653330c2c39ebaa094670f11afb0f9d36b9de2 Mon Sep 17 00:00:00 2001
From: Marco Elver 
Date: Thu, 4 Sep 2025 12:07:26 +0200
Subject: [PATCH] fixup! Insert AllocToken into index.rst

Created using spr 1.3.8-beta.1
---
 clang/docs/index.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/docs/index.rst b/clang/docs/index.rst
index be654af57f890..aa2b3a73dc11b 100644
--- a/clang/docs/index.rst
+++ b/clang/docs/index.rst
@@ -40,6 +40,7 @@ Using Clang as a Compiler
SanitizerCoverage
SanitizerStats
SanitizerSpecialCaseList
+   AllocToken
BoundsSafety
BoundsSafetyAdoptionGuide
BoundsSafetyImplPlans

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoongArch] Generate [x]vldi instructions with special constant splats (PR #159258)

2025-09-18 Thread Zhaoxin Yang via llvm-branch-commits

https://github.com/ylzsx updated 
https://github.com/llvm/llvm-project/pull/159258

>From e1a23dd6e31734b05af239bb827a280d403564ee Mon Sep 17 00:00:00 2001
From: yangzhaoxin 
Date: Wed, 17 Sep 2025 10:20:46 +0800
Subject: [PATCH 1/3] [LoongArch] Generate [x]vldi instructions with special
 constant splats

---
 .../LoongArch/LoongArchISelDAGToDAG.cpp   | 52 +++
 .../LoongArch/LoongArchISelLowering.cpp   | 87 ++-
 .../Target/LoongArch/LoongArchISelLowering.h  |  5 ++
 .../CodeGen/LoongArch/lasx/build-vector.ll| 80 +
 .../lasx/fdiv-reciprocal-estimate.ll  | 87 +++
 .../lasx/fsqrt-reciprocal-estimate.ll | 39 +++--
 llvm/test/CodeGen/LoongArch/lasx/fsqrt.ll |  3 +-
 .../LoongArch/lasx/ir-instruction/fdiv.ll |  3 +-
 llvm/test/CodeGen/LoongArch/lasx/vselect.ll   | 31 +++
 .../CodeGen/LoongArch/lsx/build-vector.ll | 77 +---
 .../LoongArch/lsx/fdiv-reciprocal-estimate.ll | 87 +++
 .../lsx/fsqrt-reciprocal-estimate.ll  | 70 +--
 llvm/test/CodeGen/LoongArch/lsx/fsqrt.ll  |  3 +-
 .../LoongArch/lsx/ir-instruction/fdiv.ll  |  3 +-
 llvm/test/CodeGen/LoongArch/lsx/vselect.ll| 31 +++
 15 files changed, 289 insertions(+), 369 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp 
b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
index 07e722b9a6591..fda313e693760 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
@@ -113,10 +113,11 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
 APInt SplatValue, SplatUndef;
 unsigned SplatBitSize;
 bool HasAnyUndefs;
-unsigned Op;
+unsigned Op = 0;
 EVT ResTy = BVN->getValueType(0);
 bool Is128Vec = BVN->getValueType(0).is128BitVector();
 bool Is256Vec = BVN->getValueType(0).is256BitVector();
+SDNode *Res;
 
 if (!Subtarget->hasExtLSX() || (!Is128Vec && !Is256Vec))
   break;
@@ -124,26 +125,25 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
   HasAnyUndefs, 8))
   break;
 
-switch (SplatBitSize) {
-default:
-  break;
-case 8:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_B : LoongArch::PseudoVREPLI_B;
-  break;
-case 16:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_H : LoongArch::PseudoVREPLI_H;
-  break;
-case 32:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_W : LoongArch::PseudoVREPLI_W;
-  break;
-case 64:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_D : LoongArch::PseudoVREPLI_D;
-  break;
-}
-
-SDNode *Res;
 // If we have a signed 10 bit integer, we can splat it directly.
 if (SplatValue.isSignedIntN(10)) {
+  switch (SplatBitSize) {
+  default:
+break;
+  case 8:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_B : LoongArch::PseudoVREPLI_B;
+break;
+  case 16:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_H : LoongArch::PseudoVREPLI_H;
+break;
+  case 32:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_W : LoongArch::PseudoVREPLI_W;
+break;
+  case 64:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_D : LoongArch::PseudoVREPLI_D;
+break;
+  }
+
   EVT EleType = ResTy.getVectorElementType();
   APInt Val = SplatValue.sextOrTrunc(EleType.getSizeInBits());
   SDValue Imm = CurDAG->getTargetConstant(Val, DL, EleType);
@@ -151,6 +151,20 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
   ReplaceNode(Node, Res);
   return;
 }
+
+// Select appropriate [x]vldi instructions for some special constant 
splats,
+// where the immediate value `imm[12] == 1` for used [x]vldi instructions.
+std::pair ConvertVLDI =
+LoongArchTargetLowering::isImmVLDILegalForMode1(SplatValue,
+SplatBitSize);
+if (ConvertVLDI.first) {
+  Op = Is256Vec ? LoongArch::XVLDI : LoongArch::VLDI;
+  SDValue Imm = CurDAG->getSignedTargetConstant(
+  SignExtend32<13>(ConvertVLDI.second), DL, MVT::i32);
+  Res = CurDAG->getMachineNode(Op, DL, ResTy, Imm);
+  ReplaceNode(Node, Res);
+  return;
+}
 break;
   }
   }
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp 
b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index e8668860c2b38..460e2d7c87af7 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -2679,9 +2679,10 @@ SDValue 
LoongArchTargetLowering::lowerBUILD_VECTOR(SDValue Op,
 
 if (SplatBitSize == 64 && !Subtarget.is64Bit()) {
   // We can only handle 64-bit elements that are within
-  // the signed 10-bit range on 32-bit targets.
+  // the signed 10-bit range or match vldi patterns on 32-bit targets.
   // See the BUILD_VECTOR case in LoongArchDAGToDAGISel::Select().
- 

[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156841
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AllocToken, Clang] Infer type hints from sizeof expressions and casts (PR #156841)

2025-09-18 Thread Marco Elver via llvm-branch-commits


@@ -1349,6 +1350,98 @@ void CodeGenFunction::EmitAllocTokenHint(llvm::CallBase 
*CB,
   CB->setMetadata(llvm::LLVMContext::MD_alloc_token_hint, MDN);
 }
 
+/// Infer type from a simple sizeof expression.
+static QualType inferTypeFromSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  if (const auto *UET = dyn_cast<UnaryExprOrTypeTraitExpr>(Arg)) {
+if (UET->getKind() == UETT_SizeOf) {
+  if (UET->isArgumentType()) {
+return UET->getArgumentTypeInfo()->getType();
+  } else {
+return UET->getArgumentExpr()->getType();
+  }
+}
+  }
+  return QualType();
+}
+
+/// Infer type from an arithmetic expression involving a sizeof.
+static QualType inferTypeFromArithSizeofExpr(const Expr *E) {
+  const Expr *Arg = E->IgnoreParenImpCasts();
+  // The argument is a lone sizeof expression.
+  QualType QT = inferTypeFromSizeofExpr(Arg);
+  if (!QT.isNull())
+return QT;
+  if (const auto *BO = dyn_cast<BinaryOperator>(Arg)) {
+// Argument is an arithmetic expression. Cover common arithmetic patterns
+// involving sizeof.
+switch (BO->getOpcode()) {
+case BO_Add:
+case BO_Div:
+case BO_Mul:
+case BO_Shl:
+case BO_Shr:
+case BO_Sub:
+  QT = inferTypeFromArithSizeofExpr(BO->getLHS());

melver wrote:

The Linux kernel has structs with flexible array members, and it's not uncommon 
to see this:
```
struct A {
  int len;
  struct Foo *foo;
  int array[];
};

... = kmalloc(sizeof(struct A) + sizeof(int) * N, ...);
```

I'm willing to accept some degree of unsoundness in complex cases to get 
completeness here, but am assuming that in the majority of cases the first type 
is the one we want to pick.

https://github.com/llvm/llvm-project/pull/156841
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [mlir] [MLIR] Add new complex.powi op (PR #158722)

2025-09-18 Thread Akash Banerjee via llvm-branch-commits

https://github.com/TIFitis updated 
https://github.com/llvm/llvm-project/pull/158722

>From 6976910364aa2fe18603aefcb27b10bd0120513d Mon Sep 17 00:00:00 2001
From: Akash Banerjee 
Date: Mon, 15 Sep 2025 20:35:29 +0100
Subject: [PATCH 1/6] Add complex.powi op.

---
 flang/lib/Optimizer/Builder/IntrinsicCall.cpp | 20 ++--
 .../Transforms/ConvertComplexPow.cpp  | 94 +--
 flang/test/Lower/HLFIR/binary-ops.f90 |  2 +-
 .../test/Lower/Intrinsics/pow_complex16i.f90  |  2 +-
 .../test/Lower/Intrinsics/pow_complex16k.f90  |  2 +-
 flang/test/Lower/amdgcn-complex.f90   |  9 ++
 flang/test/Lower/power-operator.f90   |  9 +-
 .../mlir/Dialect/Complex/IR/ComplexOps.td | 26 +
 .../ComplexToROCDLLibraryCalls.cpp| 41 +++-
 .../Transforms/AlgebraicSimplification.cpp| 24 +++--
 .../Dialect/Math/Transforms/CMakeLists.txt|  1 +
 .../complex-to-rocdl-library-calls.mlir   | 14 +++
 mlir/test/Dialect/Complex/powi-simplify.mlir  | 20 
 13 files changed, 188 insertions(+), 76 deletions(-)
 create mode 100644 mlir/test/Dialect/Complex/powi-simplify.mlir

diff --git a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp 
b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
index 466458c05dba7..74a4e8f85c8ff 100644
--- a/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+++ b/flang/lib/Optimizer/Builder/IntrinsicCall.cpp
@@ -1331,14 +1331,20 @@ mlir::Value genComplexPow(fir::FirOpBuilder &builder, 
mlir::Location loc,
 return genLibCall(builder, loc, mathOp, mathLibFuncType, args);
  auto complexTy = mlir::cast<mlir::ComplexType>(mathLibFuncType.getInput(0));
   mlir::Value exp = args[1];
-  if (!mlir::isa(exp.getType())) {
-auto realTy = complexTy.getElementType();
-mlir::Value realExp = builder.createConvert(loc, realTy, exp);
-mlir::Value zero = builder.createRealConstant(loc, realTy, 0);
-exp =
-builder.create(loc, complexTy, realExp, zero);
+  mlir::Value result;
+  if (mlir::isa(exp.getType()) ||
+  mlir::isa(exp.getType())) {
+result = builder.create(loc, args[0], exp);
+  } else {
+if (!mlir::isa(exp.getType())) {
+  auto realTy = complexTy.getElementType();
+  mlir::Value realExp = builder.createConvert(loc, realTy, exp);
+  mlir::Value zero = builder.createRealConstant(loc, realTy, 0);
+  exp = builder.create(loc, complexTy, realExp,
+zero);
+}
+result = builder.create(loc, args[0], exp);
   }
-  mlir::Value result = builder.create(loc, args[0], exp);
   result = builder.createConvert(loc, mathLibFuncType.getResult(0), result);
   return result;
 }
diff --git a/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp 
b/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
index 78f9d9e4f639a..d76451459def9 100644
--- a/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
+++ b/flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
@@ -58,63 +58,57 @@ void ConvertComplexPowPass::runOnOperation() {
   ModuleOp mod = getOperation();
   fir::FirOpBuilder builder(mod, fir::getKindMapping(mod));
 
-  mod.walk([&](complex::PowOp op) {
+  mod.walk([&](complex::PowiOp op) {
 builder.setInsertionPoint(op);
 Location loc = op.getLoc();
 auto complexTy = cast(op.getType());
 auto elemTy = complexTy.getElementType();
-
 Value base = op.getLhs();
-Value rhs = op.getRhs();
-
-Value intExp;
-if (auto create = rhs.getDefiningOp()) {
-  if (isZero(create.getImaginary())) {
-if (auto conv = create.getReal().getDefiningOp()) {
-  if (auto intTy = dyn_cast(conv.getValue().getType()))
-intExp = conv.getValue();
-}
-  }
-}
-
+Value intExp = op.getRhs();
 func::FuncOp callee;
-SmallVector args;
-if (intExp) {
-  unsigned realBits = cast(elemTy).getWidth();
-  unsigned intBits = cast(intExp.getType()).getWidth();
-  auto funcTy = builder.getFunctionType(
-  {complexTy, builder.getIntegerType(intBits)}, {complexTy});
-  if (realBits == 32 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowi), funcTy);
-  else if (realBits == 32 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cpowk), funcTy);
-  else if (realBits == 64 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowi), funcTy);
-  else if (realBits == 64 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(zpowk), funcTy);
-  else if (realBits == 128 && intBits == 32)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowi), funcTy);
-  else if (realBits == 128 && intBits == 64)
-callee = getOrDeclare(builder, loc, RTNAME_STRING(cqpowk), funcTy);
-  else
-return;
-  args = {base, intExp};
-} else {
-  unsigned realBits = cast(elemTy).getWidth();
-  auto funcTy =
-  builder.getFunctionType({complexTy, complexTy}, {complexTy});
-  

[llvm-branch-commits] [llvm] [LoopUnroll] Fix block frequencies for epilogue (PR #159163)

2025-09-18 Thread Joel E. Denny via llvm-branch-commits

https://github.com/jdenny-ornl edited 
https://github.com/llvm/llvm-project/pull/159163
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LoongArch] Generate [x]vldi instructions with special constant splats (PR #159258)

2025-09-18 Thread Zhaoxin Yang via llvm-branch-commits

https://github.com/ylzsx updated 
https://github.com/llvm/llvm-project/pull/159258

>From e1a23dd6e31734b05af239bb827a280d403564ee Mon Sep 17 00:00:00 2001
From: yangzhaoxin 
Date: Wed, 17 Sep 2025 10:20:46 +0800
Subject: [PATCH 1/3] [LoongArch] Generate [x]vldi instructions with special
 constant splats

---
 .../LoongArch/LoongArchISelDAGToDAG.cpp   | 52 +++
 .../LoongArch/LoongArchISelLowering.cpp   | 87 ++-
 .../Target/LoongArch/LoongArchISelLowering.h  |  5 ++
 .../CodeGen/LoongArch/lasx/build-vector.ll| 80 +
 .../lasx/fdiv-reciprocal-estimate.ll  | 87 +++
 .../lasx/fsqrt-reciprocal-estimate.ll | 39 +++--
 llvm/test/CodeGen/LoongArch/lasx/fsqrt.ll |  3 +-
 .../LoongArch/lasx/ir-instruction/fdiv.ll |  3 +-
 llvm/test/CodeGen/LoongArch/lasx/vselect.ll   | 31 +++
 .../CodeGen/LoongArch/lsx/build-vector.ll | 77 +---
 .../LoongArch/lsx/fdiv-reciprocal-estimate.ll | 87 +++
 .../lsx/fsqrt-reciprocal-estimate.ll  | 70 +--
 llvm/test/CodeGen/LoongArch/lsx/fsqrt.ll  |  3 +-
 .../LoongArch/lsx/ir-instruction/fdiv.ll  |  3 +-
 llvm/test/CodeGen/LoongArch/lsx/vselect.ll| 31 +++
 15 files changed, 289 insertions(+), 369 deletions(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp 
b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
index 07e722b9a6591..fda313e693760 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelDAGToDAG.cpp
@@ -113,10 +113,11 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
 APInt SplatValue, SplatUndef;
 unsigned SplatBitSize;
 bool HasAnyUndefs;
-unsigned Op;
+unsigned Op = 0;
 EVT ResTy = BVN->getValueType(0);
 bool Is128Vec = BVN->getValueType(0).is128BitVector();
 bool Is256Vec = BVN->getValueType(0).is256BitVector();
+SDNode *Res;
 
 if (!Subtarget->hasExtLSX() || (!Is128Vec && !Is256Vec))
   break;
@@ -124,26 +125,25 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
   HasAnyUndefs, 8))
   break;
 
-switch (SplatBitSize) {
-default:
-  break;
-case 8:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_B : LoongArch::PseudoVREPLI_B;
-  break;
-case 16:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_H : LoongArch::PseudoVREPLI_H;
-  break;
-case 32:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_W : LoongArch::PseudoVREPLI_W;
-  break;
-case 64:
-  Op = Is256Vec ? LoongArch::PseudoXVREPLI_D : LoongArch::PseudoVREPLI_D;
-  break;
-}
-
-SDNode *Res;
 // If we have a signed 10 bit integer, we can splat it directly.
 if (SplatValue.isSignedIntN(10)) {
+  switch (SplatBitSize) {
+  default:
+break;
+  case 8:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_B : LoongArch::PseudoVREPLI_B;
+break;
+  case 16:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_H : LoongArch::PseudoVREPLI_H;
+break;
+  case 32:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_W : LoongArch::PseudoVREPLI_W;
+break;
+  case 64:
+Op = Is256Vec ? LoongArch::PseudoXVREPLI_D : LoongArch::PseudoVREPLI_D;
+break;
+  }
+
   EVT EleType = ResTy.getVectorElementType();
   APInt Val = SplatValue.sextOrTrunc(EleType.getSizeInBits());
   SDValue Imm = CurDAG->getTargetConstant(Val, DL, EleType);
@@ -151,6 +151,20 @@ void LoongArchDAGToDAGISel::Select(SDNode *Node) {
   ReplaceNode(Node, Res);
   return;
 }
+
+// Select appropriate [x]vldi instructions for some special constant splats,
+// where the immediate value `imm[12] == 1` for used [x]vldi instructions.
+std::pair ConvertVLDI =
+LoongArchTargetLowering::isImmVLDILegalForMode1(SplatValue,
+SplatBitSize);
+if (ConvertVLDI.first) {
+  Op = Is256Vec ? LoongArch::XVLDI : LoongArch::VLDI;
+  SDValue Imm = CurDAG->getSignedTargetConstant(
+  SignExtend32<13>(ConvertVLDI.second), DL, MVT::i32);
+  Res = CurDAG->getMachineNode(Op, DL, ResTy, Imm);
+  ReplaceNode(Node, Res);
+  return;
+}
 break;
   }
   }
diff --git a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp 
b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
index e8668860c2b38..460e2d7c87af7 100644
--- a/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
@@ -2679,9 +2679,10 @@ SDValue 
LoongArchTargetLowering::lowerBUILD_VECTOR(SDValue Op,
 
 if (SplatBitSize == 64 && !Subtarget.is64Bit()) {
   // We can only handle 64-bit elements that are within
-  // the signed 10-bit range on 32-bit targets.
+  // the signed 10-bit range or match vldi patterns on 32-bit targets.
   // See the BUILD_VECTOR case in LoongArchDAGToDAGISel::Select().
- 

[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Offload] Add GenericPluginTy::get_mem_info (PR #157484)

2025-09-18 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton converted_to_draft 
https://github.com/llvm/llvm-project/pull/157484
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Enable ISD::PTRADD for 64-bit AS by default (PR #146076)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146076

>From 3b0c210862015dc304004641990fea429f8e31c7 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 27 Jun 2025 05:38:52 -0400
Subject: [PATCH 1/3] [AMDGPU][SDAG] Enable ISD::PTRADD for 64-bit AS by
 default

Also removes the command line option to control this feature.

There seem to be mainly two kinds of test changes:
- Some operands of addition instructions are swapped; that is to be expected
  since PTRADD is not commutative.
- Improvements in code generation, probably because the legacy lowering enabled
  some transformations that were sometimes harmful.

For SWDEV-516125.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  10 +-
 .../identical-subrange-spill-infloop.ll   | 352 +++---
 .../AMDGPU/infer-addrspace-flat-atomic.ll |  14 +-
 llvm/test/CodeGen/AMDGPU/lds-frame-extern.ll  |   8 +-
 .../AMDGPU/lower-module-lds-via-hybrid.ll |   4 +-
 .../AMDGPU/lower-module-lds-via-table.ll  |  16 +-
 .../match-perm-extract-vector-elt-bug.ll  |  22 +-
 llvm/test/CodeGen/AMDGPU/memmove-var-size.ll  |  16 +-
 .../AMDGPU/preload-implicit-kernargs.ll   |   6 +-
 .../AMDGPU/promote-constOffset-to-imm.ll  |   8 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag-mubuf.ll |   7 +-
 .../AMDGPU/ptradd-sdag-optimizations.ll   |  94 ++---
 .../AMDGPU/ptradd-sdag-undef-poison.ll|   6 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll   |  27 +-
 llvm/test/CodeGen/AMDGPU/store-weird-sizes.ll |  29 +-
 15 files changed, 310 insertions(+), 309 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 78d608556f056..ac3d322ad65c3 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -64,14 +64,6 @@ static cl::opt<bool> UseDivergentRegisterIndexing(
 cl::desc("Use indirect register addressing for divergent indexes"),
 cl::init(false));
 
-// TODO: This option should be removed once we switch to always using PTRADD in
-// the SelectionDAG.
-static cl::opt<bool> UseSelectionDAGPTRADD(
-"amdgpu-use-sdag-ptradd", cl::Hidden,
-cl::desc("Generate ISD::PTRADD nodes for 64-bit pointer arithmetic in the "
- "SelectionDAG ISel"),
-cl::init(false));
-
 static bool denormalModeIsFlushAllF32(const MachineFunction &MF) {
   const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
   return Info->getMode().FP32Denormals == DenormalMode::getPreserveSign();
@@ -11473,7 +11465,7 @@ static bool isNoUnsignedWrap(SDValue Addr) {
 
 bool SITargetLowering::shouldPreservePtrArith(const Function &F,
   EVT PtrVT) const {
-  return UseSelectionDAGPTRADD && PtrVT == MVT::i64;
+  return PtrVT == MVT::i64;
 }
 
 bool SITargetLowering::canTransformPtrArithOutOfBounds(const Function &F,
diff --git a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll 
b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
index 2c03113e8af47..805cdd37d6e70 100644
--- a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
+++ b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
@@ -6,96 +6,150 @@ define void @main(i1 %arg) #0 {
 ; CHECK:   ; %bb.0: ; %bb
 ; CHECK-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:s_xor_saveexec_b64 s[4:5], -1
-; CHECK-NEXT:buffer_store_dword v5, off, s[0:3], s32 ; 4-byte Folded Spill
-; CHECK-NEXT:buffer_store_dword v6, off, s[0:3], s32 offset:4 ; 4-byte 
Folded Spill
+; CHECK-NEXT:buffer_store_dword v6, off, s[0:3], s32 ; 4-byte Folded Spill
+; CHECK-NEXT:buffer_store_dword v7, off, s[0:3], s32 offset:4 ; 4-byte 
Folded Spill
 ; CHECK-NEXT:s_mov_b64 exec, s[4:5]
-; CHECK-NEXT:v_writelane_b32 v5, s30, 0
-; CHECK-NEXT:v_writelane_b32 v5, s31, 1
-; CHECK-NEXT:v_writelane_b32 v5, s36, 2
-; CHECK-NEXT:v_writelane_b32 v5, s37, 3
-; CHECK-NEXT:v_writelane_b32 v5, s38, 4
-; CHECK-NEXT:v_writelane_b32 v5, s39, 5
-; CHECK-NEXT:v_writelane_b32 v5, s48, 6
-; CHECK-NEXT:v_writelane_b32 v5, s49, 7
-; CHECK-NEXT:v_writelane_b32 v5, s50, 8
-; CHECK-NEXT:v_writelane_b32 v5, s51, 9
-; CHECK-NEXT:v_writelane_b32 v5, s52, 10
-; CHECK-NEXT:v_writelane_b32 v5, s53, 11
-; CHECK-NEXT:v_writelane_b32 v5, s54, 12
-; CHECK-NEXT:v_writelane_b32 v5, s55, 13
-; CHECK-NEXT:s_getpc_b64 s[24:25]
-; CHECK-NEXT:v_writelane_b32 v5, s64, 14
-; CHECK-NEXT:s_movk_i32 s4, 0xf0
-; CHECK-NEXT:s_mov_b32 s5, s24
-; CHECK-NEXT:v_writelane_b32 v5, s65, 15
-; CHECK-NEXT:s_load_dwordx16 s[8:23], s[4:5], 0x0
-; CHECK-NEXT:s_mov_b64 s[4:5], 0
-; CHECK-NEXT:v_writelane_b32 v5, s66, 16
-; CHECK-NEXT:s_load_dwordx4 s[4:7], s[4:5], 0x0
-; CHECK-NEXT:v_writelane_b32 v5, s67, 17
-; CHECK-NEXT:s_waitcnt lgkmcnt(0)
-; CHECK-NEXT:s_movk_i32 s6, 0x130
-; CHECK-NEXT:s_mov_b32 s7, s24
-; CHECK-NEXT:v_writelane_b32 v5

[llvm-branch-commits] [llvm] [SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD transforms (PR #146074)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146074

>From b484d75cff9bd4703dd2c90d041d4df0aefd0e3c Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Thu, 26 Jun 2025 06:10:35 -0400
Subject: [PATCH 1/2] [SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD
 transforms

This PR adds a TargetLowering hook, canTransformPtrArithOutOfBounds,
that targets can use to allow transformations to introduce out-of-bounds
pointer arithmetic. It also moves two such transformations from the
AMDGPU-specific DAG combines to the generic DAGCombiner.

This is motivated by target features like AArch64's checked pointer
arithmetic, CPA, which does not tolerate the introduction of
out-of-bounds pointer arithmetic.
---
 llvm/include/llvm/CodeGen/TargetLowering.h|   7 +
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 125 +++---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  59 ++---
 llvm/lib/Target/AMDGPU/SIISelLowering.h   |   3 +
 4 files changed, 94 insertions(+), 100 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h 
b/llvm/include/llvm/CodeGen/TargetLowering.h
index 46be271320fdd..4c2d991308d30 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3518,6 +3518,13 @@ class LLVM_ABI TargetLoweringBase {
 return false;
   }
 
+  /// True if the target allows transformations of in-bounds pointer
+  /// arithmetic that cause out-of-bounds intermediate results.
+  virtual bool canTransformPtrArithOutOfBounds(const Function &F,
+   EVT PtrVT) const {
+return false;
+  }
+
   /// Does this target support complex deinterleaving
   virtual bool isComplexDeinterleavingSupported() const { return false; }
 
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 77bc47f28fc80..67db08c3f9bac 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -2696,59 +2696,82 @@ SDValue DAGCombiner::visitPTRADD(SDNode *N) {
   if (PtrVT == IntVT && isNullConstant(N0))
 return N1;
 
-  if (N0.getOpcode() != ISD::PTRADD ||
-  reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
-return SDValue();
-
-  SDValue X = N0.getOperand(0);
-  SDValue Y = N0.getOperand(1);
-  SDValue Z = N1;
-  bool N0OneUse = N0.hasOneUse();
-  bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
-  bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
-
-  // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
-  //   * y is a constant and (ptradd x, y) has one use; or
-  //   * y and z are both constants.
-  if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
-// If both additions in the original were NUW, the new ones are as well.
-SDNodeFlags Flags =
-(N->getFlags() & N0->getFlags()) & SDNodeFlags::NoUnsignedWrap;
-SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
-AddToWorklist(Add.getNode());
-return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+  if (N0.getOpcode() == ISD::PTRADD &&
+  !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) 
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+//   * y is a constant and (ptradd x, y) has one use; or
+//   * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+  // If both additions in the original were NUW, the new ones are as well.
+  SDNodeFlags Flags =
+  (N->getFlags() & N0->getFlags()) & SDNodeFlags::NoUnsignedWrap;
+  SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+  AddToWorklist(Add.getNode());
+  return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+  }
+
+  // The following combines can turn in-bounds pointer arithmetic out of 
bounds.
+  // That is problematic for settings like AArch64's CPA, which checks that
+  // intermediate results of pointer arithmetic remain in bounds. The target
+  // therefore needs to opt-in to enable them.
+  if (!TLI.canTransformPtrArithOutOfBounds(
+  DAG.getMachineFunction().getFunction(), PtrVT))
+return SDValue();
+
+  if (N0.getOpcode() == ISD::PTRADD && N1.getOpcode() == ISD::Constant) {
+// Fold (ptradd (ptradd GA, v), c) -> (ptradd (ptradd GA, c) v) with
+// global address GA and constant c, such that c can be folded into GA.
+SDValue GAValue = N0.getOperand(0);
+if (const GlobalAddressSDNode *GA =
+dyn_cast<GlobalAddressSDNode>(GAValue)) {
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  if (!LegalOperations && TLI.

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Handle ISD::PTRADD in various special cases (PR #145330)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/145330

>From da5b337fef36cdee209845b51bba323e84272334 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Tue, 17 Jun 2025 04:03:53 -0400
Subject: [PATCH 1/2] [AMDGPU][SDAG] Handle ISD::PTRADD in various special
 cases

There are more places in SIISelLowering.cpp and AMDGPUISelDAGToDAG.cpp
that check for ISD::ADD in a pointer context, but as far as I can tell
those are only relevant for 32-bit pointer arithmetic (like frame
indices/scratch addresses and LDS), for which we don't enable PTRADD
generation yet.

For SWDEV-516125.
---
 .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp |   2 +-
 .../CodeGen/SelectionDAG/TargetLowering.cpp   |  21 +-
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp |   6 +-
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |   7 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag-mubuf.ll |  67 ++
 .../AMDGPU/ptradd-sdag-optimizations.ll   | 196 ++
 6 files changed, 105 insertions(+), 194 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 93ddba93b8034..42d3b36f222d7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8600,7 +8600,7 @@ static bool isMemSrcFromConstant(SDValue Src, 
ConstantDataArraySlice &Slice) {
   GlobalAddressSDNode *G = nullptr;
   if (Src.getOpcode() == ISD::GlobalAddress)
 G = cast<GlobalAddressSDNode>(Src);
-  else if (Src.getOpcode() == ISD::ADD &&
+  else if (Src->isAnyAdd() &&
Src.getOperand(0).getOpcode() == ISD::GlobalAddress &&
Src.getOperand(1).getOpcode() == ISD::Constant) {
 G = cast<GlobalAddressSDNode>(Src.getOperand(0));
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp 
b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 177aa0d11ff90..7465c9b310cb9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -638,8 +638,14 @@ bool TargetLowering::ShrinkDemandedOp(SDValue Op, unsigned 
BitWidth,
   // operands on the new node are also disjoint.
   SDNodeFlags Flags(Op->getFlags().hasDisjoint() ? SDNodeFlags::Disjoint
  : SDNodeFlags::None);
+  unsigned Opcode = Op.getOpcode();
+  if (Opcode == ISD::PTRADD) {
+// It isn't a ptradd anymore if it doesn't operate on the entire
+// pointer.
+Opcode = ISD::ADD;
+  }
   SDValue X = DAG.getNode(
-  Op.getOpcode(), dl, SmallVT,
+  Opcode, dl, SmallVT,
   DAG.getNode(ISD::TRUNCATE, dl, SmallVT, Op.getOperand(0)),
   DAG.getNode(ISD::TRUNCATE, dl, SmallVT, Op.getOperand(1)), Flags);
   assert(DemandedSize <= SmallVTBits && "Narrowed below demanded bits?");
@@ -2860,6 +2866,11 @@ bool TargetLowering::SimplifyDemandedBits(
   return TLO.CombineTo(Op, And1);
 }
 [[fallthrough]];
+  case ISD::PTRADD:
+if (Op.getOperand(0).getValueType() != Op.getOperand(1).getValueType())
+  break;
+// PTRADD behaves like ADD if pointers are represented as integers.
+[[fallthrough]];
   case ISD::ADD:
   case ISD::SUB: {
 // Add, Sub, and Mul don't demand any bits in positions beyond that
@@ -2969,10 +2980,10 @@ bool TargetLowering::SimplifyDemandedBits(
 
 if (Op.getOpcode() == ISD::MUL) {
   Known = KnownBits::mul(KnownOp0, KnownOp1);
-} else { // Op.getOpcode() is either ISD::ADD or ISD::SUB.
+} else { // Op.getOpcode() is either ISD::ADD, ISD::PTRADD, or ISD::SUB.
   Known = KnownBits::computeForAddSub(
-  Op.getOpcode() == ISD::ADD, Flags.hasNoSignedWrap(),
-  Flags.hasNoUnsignedWrap(), KnownOp0, KnownOp1);
+  Op->isAnyAdd(), Flags.hasNoSignedWrap(), Flags.hasNoUnsignedWrap(),
+  KnownOp0, KnownOp1);
 }
 break;
   }
@@ -5679,7 +5690,7 @@ bool TargetLowering::isGAPlusOffset(SDNode *WN, const 
GlobalValue *&GA,
 return true;
   }
 
-  if (N->getOpcode() == ISD::ADD) {
+  if (N->isAnyAdd()) {
 SDValue N1 = N->getOperand(0);
 SDValue N2 = N->getOperand(1);
 if (isGAPlusOffset(N1.getNode(), GA, Offset)) {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index c2fca79979e1b..312de262490f4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -1531,7 +1531,7 @@ bool AMDGPUDAGToDAGISel::SelectMUBUF(SDValue Addr, 
SDValue &Ptr, SDValue &VAddr,
   C1 = nullptr;
   }
 
-  if (N0.getOpcode() == ISD::ADD) {
+  if (N0->isAnyAdd()) {
 // (add N2, N3) -> addr64, or
 // (add (add N2, N3), C1) -> addr64
 SDValue N2 = N0.getOperand(0);
@@ -1993,7 +1993,7 @@ bool AMDGPUDAGToDAGISel::SelectGlobalSAddr(SDNode *N, 
SDValue Addr,
   }
 
   // Match the variable offset.
-  if (Addr.getOpcode() == ISD::ADD) {
+  if (Addr->isAnyAdd()) {
 LHS = Addr.getOperand(0);
 
 if (!LHS

[llvm-branch-commits] [llvm] [SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD transforms (PR #146074)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146074

>From b484d75cff9bd4703dd2c90d041d4df0aefd0e3c Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Thu, 26 Jun 2025 06:10:35 -0400
Subject: [PATCH 1/2] [SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD
 transforms

This PR adds a TargetLowering hook, canTransformPtrArithOutOfBounds,
that targets can use to allow transformations to introduce out-of-bounds
pointer arithmetic. It also moves two such transformations from the
AMDGPU-specific DAG combines to the generic DAGCombiner.

This is motivated by target features like AArch64's checked pointer
arithmetic, CPA, which does not tolerate the introduction of
out-of-bounds pointer arithmetic.
---
 llvm/include/llvm/CodeGen/TargetLowering.h|   7 +
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 125 +++---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |  59 ++---
 llvm/lib/Target/AMDGPU/SIISelLowering.h   |   3 +
 4 files changed, 94 insertions(+), 100 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h 
b/llvm/include/llvm/CodeGen/TargetLowering.h
index 46be271320fdd..4c2d991308d30 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3518,6 +3518,13 @@ class LLVM_ABI TargetLoweringBase {
 return false;
   }
 
+  /// True if the target allows transformations of in-bounds pointer
+  /// arithmetic that cause out-of-bounds intermediate results.
+  virtual bool canTransformPtrArithOutOfBounds(const Function &F,
+   EVT PtrVT) const {
+return false;
+  }
+
   /// Does this target support complex deinterleaving
   virtual bool isComplexDeinterleavingSupported() const { return false; }
 
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp 
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 77bc47f28fc80..67db08c3f9bac 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -2696,59 +2696,82 @@ SDValue DAGCombiner::visitPTRADD(SDNode *N) {
   if (PtrVT == IntVT && isNullConstant(N0))
 return N1;
 
-  if (N0.getOpcode() != ISD::PTRADD ||
-  reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1))
-return SDValue();
-
-  SDValue X = N0.getOperand(0);
-  SDValue Y = N0.getOperand(1);
-  SDValue Z = N1;
-  bool N0OneUse = N0.hasOneUse();
-  bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
-  bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
-
-  // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
-  //   * y is a constant and (ptradd x, y) has one use; or
-  //   * y and z are both constants.
-  if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
-// If both additions in the original were NUW, the new ones are as well.
-SDNodeFlags Flags =
-(N->getFlags() & N0->getFlags()) & SDNodeFlags::NoUnsignedWrap;
-SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
-AddToWorklist(Add.getNode());
-return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+  if (N0.getOpcode() == ISD::PTRADD &&
+  !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) 
{
+SDValue X = N0.getOperand(0);
+SDValue Y = N0.getOperand(1);
+SDValue Z = N1;
+bool N0OneUse = N0.hasOneUse();
+bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+// (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+//   * y is a constant and (ptradd x, y) has one use; or
+//   * y and z are both constants.
+if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+  // If both additions in the original were NUW, the new ones are as well.
+  SDNodeFlags Flags =
+  (N->getFlags() & N0->getFlags()) & SDNodeFlags::NoUnsignedWrap;
+  SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+  AddToWorklist(Add.getNode());
+  return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+}
+  }
+
+  // The following combines can turn in-bounds pointer arithmetic out of 
bounds.
+  // That is problematic for settings like AArch64's CPA, which checks that
+  // intermediate results of pointer arithmetic remain in bounds. The target
+  // therefore needs to opt-in to enable them.
+  if (!TLI.canTransformPtrArithOutOfBounds(
+  DAG.getMachineFunction().getFunction(), PtrVT))
+return SDValue();
+
+  if (N0.getOpcode() == ISD::PTRADD && N1.getOpcode() == ISD::Constant) {
+// Fold (ptradd (ptradd GA, v), c) -> (ptradd (ptradd GA, c) v) with
+// global address GA and constant c, such that c can be folded into GA.
+SDValue GAValue = N0.getOperand(0);
+if (const GlobalAddressSDNode *GA =
+dyn_cast<GlobalAddressSDNode>(GAValue)) {
+  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  if (!LegalOperations && TLI.

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Handle ISD::PTRADD in various special cases (PR #145330)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/145330

>From da5b337fef36cdee209845b51bba323e84272334 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Tue, 17 Jun 2025 04:03:53 -0400
Subject: [PATCH 1/2] [AMDGPU][SDAG] Handle ISD::PTRADD in various special
 cases

There are more places in SIISelLowering.cpp and AMDGPUISelDAGToDAG.cpp
that check for ISD::ADD in a pointer context, but as far as I can tell
those are only relevant for 32-bit pointer arithmetic (like frame
indices/scratch addresses and LDS), for which we don't enable PTRADD
generation yet.

For SWDEV-516125.
---
 .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp |   2 +-
 .../CodeGen/SelectionDAG/TargetLowering.cpp   |  21 +-
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp |   6 +-
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp |   7 +-
 llvm/test/CodeGen/AMDGPU/ptradd-sdag-mubuf.ll |  67 ++
 .../AMDGPU/ptradd-sdag-optimizations.ll   | 196 ++
 6 files changed, 105 insertions(+), 194 deletions(-)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp 
b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 93ddba93b8034..42d3b36f222d7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8600,7 +8600,7 @@ static bool isMemSrcFromConstant(SDValue Src, 
ConstantDataArraySlice &Slice) {
   GlobalAddressSDNode *G = nullptr;
   if (Src.getOpcode() == ISD::GlobalAddress)
 G = cast<GlobalAddressSDNode>(Src);
-  else if (Src.getOpcode() == ISD::ADD &&
+  else if (Src->isAnyAdd() &&
Src.getOperand(0).getOpcode() == ISD::GlobalAddress &&
Src.getOperand(1).getOpcode() == ISD::Constant) {
 G = cast<GlobalAddressSDNode>(Src.getOperand(0));
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp 
b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 177aa0d11ff90..7465c9b310cb9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -638,8 +638,14 @@ bool TargetLowering::ShrinkDemandedOp(SDValue Op, unsigned 
BitWidth,
   // operands on the new node are also disjoint.
   SDNodeFlags Flags(Op->getFlags().hasDisjoint() ? SDNodeFlags::Disjoint
  : SDNodeFlags::None);
+  unsigned Opcode = Op.getOpcode();
+  if (Opcode == ISD::PTRADD) {
+// It isn't a ptradd anymore if it doesn't operate on the entire
+// pointer.
+Opcode = ISD::ADD;
+  }
   SDValue X = DAG.getNode(
-  Op.getOpcode(), dl, SmallVT,
+  Opcode, dl, SmallVT,
   DAG.getNode(ISD::TRUNCATE, dl, SmallVT, Op.getOperand(0)),
   DAG.getNode(ISD::TRUNCATE, dl, SmallVT, Op.getOperand(1)), Flags);
   assert(DemandedSize <= SmallVTBits && "Narrowed below demanded bits?");
@@ -2860,6 +2866,11 @@ bool TargetLowering::SimplifyDemandedBits(
   return TLO.CombineTo(Op, And1);
 }
 [[fallthrough]];
+  case ISD::PTRADD:
+if (Op.getOperand(0).getValueType() != Op.getOperand(1).getValueType())
+  break;
+// PTRADD behaves like ADD if pointers are represented as integers.
+[[fallthrough]];
   case ISD::ADD:
   case ISD::SUB: {
 // Add, Sub, and Mul don't demand any bits in positions beyond that
@@ -2969,10 +2980,10 @@ bool TargetLowering::SimplifyDemandedBits(
 
 if (Op.getOpcode() == ISD::MUL) {
   Known = KnownBits::mul(KnownOp0, KnownOp1);
-} else { // Op.getOpcode() is either ISD::ADD or ISD::SUB.
+} else { // Op.getOpcode() is either ISD::ADD, ISD::PTRADD, or ISD::SUB.
   Known = KnownBits::computeForAddSub(
-  Op.getOpcode() == ISD::ADD, Flags.hasNoSignedWrap(),
-  Flags.hasNoUnsignedWrap(), KnownOp0, KnownOp1);
+  Op->isAnyAdd(), Flags.hasNoSignedWrap(), Flags.hasNoUnsignedWrap(),
+  KnownOp0, KnownOp1);
 }
 break;
   }
@@ -5679,7 +5690,7 @@ bool TargetLowering::isGAPlusOffset(SDNode *WN, const 
GlobalValue *&GA,
 return true;
   }
 
-  if (N->getOpcode() == ISD::ADD) {
+  if (N->isAnyAdd()) {
 SDValue N1 = N->getOperand(0);
 SDValue N2 = N->getOperand(1);
 if (isGAPlusOffset(N1.getNode(), GA, Offset)) {
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index c2fca79979e1b..312de262490f4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -1531,7 +1531,7 @@ bool AMDGPUDAGToDAGISel::SelectMUBUF(SDValue Addr, 
SDValue &Ptr, SDValue &VAddr,
   C1 = nullptr;
   }
 
-  if (N0.getOpcode() == ISD::ADD) {
+  if (N0->isAnyAdd()) {
 // (add N2, N3) -> addr64, or
 // (add (add N2, N3), C1) -> addr64
 SDValue N2 = N0.getOperand(0);
@@ -1993,7 +1993,7 @@ bool AMDGPUDAGToDAGISel::SelectGlobalSAddr(SDNode *N, 
SDValue Addr,
   }
 
   // Match the variable offset.
-  if (Addr.getOpcode() == ISD::ADD) {
+  if (Addr->isAnyAdd()) {
 LHS = Addr.getOperand(0);
 
 if (!LHS

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR (PR #146075)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146075

>From 7c417c4c1413a3807d476b7fc490256084a0ac62 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 27 Jun 2025 04:23:50 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR

If we can't fold a PTRADD's offset into its users, lowering them to
disjoint ORs is preferable: Often, a 32-bit OR instruction suffices
where we'd otherwise use a pair of 32-bit additions with carry.

This needs to be a DAGCombine (and not a selection rule) because its
main purpose is to enable subsequent DAGCombines for bitwise operations.
We don't want to just turn PTRADDs into disjoint ORs whenever that's
sound because this transform loses the information that the operation
implements pointer arithmetic, which we will soon need to fold offsets
into FLAT instructions. Currently, disjoint ORs can still be used for
offset folding, so that part of the logic can't be tested.

The PR contains a hacky workaround for a situation where an AssertAlign
operand of a PTRADD is not DAGCombined before the PTRADD, causing the
PTRADD to be turned into a disjoint OR although reassociating it with
the operand of the AssertAlign would be better. This wouldn't be a
problem if the DAGCombiner ensured that a node is only processed after
all its operands have been processed.

For SWDEV-516125.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 35 
 .../AMDGPU/ptradd-sdag-optimizations.ll   | 56 ++-
 2 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 78d608556f056..ffaaef65569ae 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -16145,6 +16145,41 @@ SDValue SITargetLowering::performPtrAddCombine(SDNode 
*N,
   return Folded;
   }
 
+  // Transform (ptradd a, b) -> (or disjoint a, b) if it is equivalent and if
+  // that transformation can't block an offset folding at any use of the 
ptradd.
+  // This should be done late, after legalization, so that it doesn't block
+  // other ptradd combines that could enable more offset folding.
+  bool HasIntermediateAssertAlign =
+  N0->getOpcode() == ISD::AssertAlign && N0->getOperand(0)->isAnyAdd();
+  // This is a hack to work around an ordering problem for DAGs like this:
+  //   (ptradd (AssertAlign (ptradd p, c1), k), c2)
+  // If the outer ptradd is handled first by the DAGCombiner, it can be
+  // transformed into a disjoint or. Then, when the generic AssertAlign combine
+  // pushes the AssertAlign through the inner ptradd, it's too late for the
+  // ptradd reassociation to trigger.
+  if (!DCI.isBeforeLegalizeOps() && !HasIntermediateAssertAlign &&
+  DAG.haveNoCommonBitsSet(N0, N1)) {
+bool TransformCanBreakAddrMode = any_of(N->users(), [&](SDNode *User) {
+  if (auto *LoadStore = dyn_cast<MemSDNode>(User);
+  LoadStore && LoadStore->getBasePtr().getNode() == N) {
+unsigned AS = LoadStore->getAddressSpace();
+// Currently, we only really need ptradds to fold offsets into flat
+// memory instructions.
+if (AS != AMDGPUAS::FLAT_ADDRESS)
+  return false;
+TargetLoweringBase::AddrMode AM;
+AM.HasBaseReg = true;
+EVT VT = LoadStore->getMemoryVT();
+Type *AccessTy = VT.getTypeForEVT(*DAG.getContext());
+return isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy, AS);
+  }
+  return false;
+});
+
+if (!TransformCanBreakAddrMode)
+  return DAG.getNode(ISD::OR, DL, VT, N0, N1, SDNodeFlags::Disjoint);
+  }
+
   if (N1.getOpcode() != ISD::ADD || !N1.hasOneUse())
 return SDValue();
 
diff --git a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll 
b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
index 199c1f61d2522..7d7fe141e5440 100644
--- a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
+++ b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
@@ -100,7 +100,7 @@ define void @baseptr_null(i64 %offset, i8 %v) {
 
 ; Taken from implicit-kernarg-backend-usage.ll, tests the PTRADD handling in 
the
 ; assertalign DAG combine.
-define amdgpu_kernel void @llvm_amdgcn_queue_ptr(ptr addrspace(1) %ptr)  #0 {
+define amdgpu_kernel void @llvm_amdgcn_queue_ptr(ptr addrspace(1) %ptr) {
 ; GFX942-LABEL: llvm_amdgcn_queue_ptr:
 ; GFX942:   ; %bb.0:
 ; GFX942-NEXT:v_mov_b32_e32 v0, 0
@@ -415,6 +415,60 @@ entry:
   ret void
 }
 
+; Check that ptradds can be lowered to disjoint ORs.
+define ptr @gep_disjoint_or(ptr %base) {
+; GFX942-LABEL: gep_disjoint_or:
+; GFX942:   ; %bb.0:
+; GFX942-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:v_and_or_b32 v0, v0, -16, 4
+; GFX942-NEXT:s_setpc_b64 s[30:31]
  %p = call ptr @llvm.ptrmask(ptr %base, i64 s0xfffffffffffffff0)
+  %gep = getelementptr nuw inbounds i8, ptr %p, i64 4
+  ret ptr %gep
+}
+
+; Check that AssertAlign no

[llvm-branch-commits] [llvm] [Offload] Add olGetMemInfo with platform-less API (PR #159581)

2025-09-18 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton created 
https://github.com/llvm/llvm-project/pull/159581

None

>From 149a8e88c447d10e9181ba0940c5d05ace6f0d5a Mon Sep 17 00:00:00 2001
From: Ross Brunton 
Date: Thu, 18 Sep 2025 15:23:45 +0100
Subject: [PATCH] [Offload] Add olGetMemInfo with platform-less API

---
 offload/liboffload/API/Memory.td  |  50 +++
 offload/liboffload/src/OffloadImpl.cpp|  54 
 offload/unittests/OffloadAPI/CMakeLists.txt   |   4 +-
 .../OffloadAPI/memory/olGetMemInfo.cpp| 130 ++
 .../OffloadAPI/memory/olGetMemInfoSize.cpp|  63 +
 5 files changed, 300 insertions(+), 1 deletion(-)
 create mode 100644 offload/unittests/OffloadAPI/memory/olGetMemInfo.cpp
 create mode 100644 offload/unittests/OffloadAPI/memory/olGetMemInfoSize.cpp

diff --git a/offload/liboffload/API/Memory.td b/offload/liboffload/API/Memory.td
index debda165d2b23..3e47b586edd23 100644
--- a/offload/liboffload/API/Memory.td
+++ b/offload/liboffload/API/Memory.td
@@ -45,6 +45,56 @@ def olMemFree : Function {
   let returns = [];
 }
 
+def ol_mem_info_t : Enum {
+  let desc = "Supported memory info.";
+  let is_typed = 1;
+  let etors = [
+TaggedEtor<"DEVICE", "ol_device_handle_t", "The handle of the device 
associated with the allocation.">,
+TaggedEtor<"BASE", "void *", "Base address of this allocation.">,
+TaggedEtor<"SIZE", "size_t", "Size of this allocation in bytes.">,
+TaggedEtor<"TYPE", "ol_alloc_type_t", "Type of this allocation.">,
+  ];
+}
+
+def olGetMemInfo : Function {
+  let desc = "Queries the given property of a memory allocation allocated with 
olMemAlloc.";
+  let details = [
+"`olGetMemInfoSize` can be used to query the storage size required for the 
given query.",
+"The provided pointer can point to any location inside the allocation.",
+  ];
+  let params = [
+Param<"const void *", "Ptr", "pointer to the allocated memory", PARAM_IN>,
+Param<"ol_mem_info_t", "PropName", "type of the info to retrieve", 
PARAM_IN>,
+Param<"size_t", "PropSize", "the number of bytes pointed to by 
PropValue.", PARAM_IN>,
+TypeTaggedParam<"void*", "PropValue", "array of bytes holding the info. "
+  "If Size is not equal to or greater to the real number of bytes needed 
to return the info "
+  "then the OL_ERRC_INVALID_SIZE error is returned and pPlatformInfo is 
not used.", PARAM_OUT,
+  TypeInfo<"PropName" , "PropSize">>
+  ];
+  let returns = [
+Return<"OL_ERRC_INVALID_SIZE", [
+  "`PropSize == 0`",
+  "If `PropSize` is less than the real number of bytes needed to return 
the info."
+]>,
+Return<"OL_ERRC_NOT_FOUND", ["memory was not allocated by this platform"]>
+  ];
+}
+
+def olGetMemInfoSize : Function {
+  let desc = "Returns the storage size of the given memory query.";
+  let details = [
+"The provided pointer can point to any location inside the allocation.",
+  ];
+  let params = [
+Param<"const void *", "Ptr", "pointer to the allocated memory", PARAM_IN>,
+Param<"ol_mem_info_t", "PropName", "type of the info to query", PARAM_IN>,
+Param<"size_t*", "PropSizeRet", "pointer to the number of bytes required 
to store the query", PARAM_OUT>
+  ];
+  let returns = [
+Return<"OL_ERRC_NOT_FOUND", ["memory was not allocated by this platform"]>
+  ];
+}
+
 def olMemcpy : Function {
 let desc = "Enqueue a memcpy operation.";
 let details = [
diff --git a/offload/liboffload/src/OffloadImpl.cpp 
b/offload/liboffload/src/OffloadImpl.cpp
index 4a253c61a657b..2a0e238125dd7 100644
--- a/offload/liboffload/src/OffloadImpl.cpp
+++ b/offload/liboffload/src/OffloadImpl.cpp
@@ -700,6 +700,60 @@ Error olMemFree_impl(void *Address) {
   return Error::success();
 }
 
+Error olGetMemInfoImplDetail(const void *Ptr, ol_mem_info_t PropName,
+ size_t PropSize, void *PropValue,
+ size_t *PropSizeRet) {
+  InfoWriter Info(PropSize, PropValue, PropSizeRet);
+  std::lock_guard Lock(OffloadContext::get().AllocInfoMapMutex);
+
+  auto &AllocBases = OffloadContext::get().AllocBases;
+  auto &AllocInfoMap = OffloadContext::get().AllocInfoMap;
+  const AllocInfo *Alloc = nullptr;
+  if (AllocInfoMap.contains(Ptr)) {
+// Fast case, we have been given the base pointer directly
+Alloc = &AllocInfoMap.at(Ptr);
+  } else {
+// Slower case, we need to look up the base pointer first
+// Find the first memory allocation whose end is after the target pointer,
+// and then check to see if it is in range
+auto Loc = std::lower_bound(AllocBases.begin(), AllocBases.end(), Ptr,
+[&](const void *Iter, const void *Val) {
+  return AllocInfoMap.at(Iter).End <= Val;
+});
+if (Loc == AllocBases.end() || Ptr < AllocInfoMap.at(*Loc).Start)
+  return Plugin::error(ErrorCode::NOT_FOUND,
+   "allocated memory information 

[llvm-branch-commits] [llvm] [Offload] Add GenericPluginTy::get_mem_info (PR #157484)

2025-09-18 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton closed 
https://github.com/llvm/llvm-project/pull/157484
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Offload] `olGetMemInfo` (PR #157651)

2025-09-18 Thread Ross Brunton via llvm-branch-commits

https://github.com/RossBrunton closed 
https://github.com/llvm/llvm-project/pull/157651
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR (PR #146075)

2025-09-18 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/146075

>From 7c417c4c1413a3807d476b7fc490256084a0ac62 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 27 Jun 2025 04:23:50 -0400
Subject: [PATCH 1/5] [AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR

If we can't fold a PTRADD's offset into its users, lowering them to
disjoint ORs is preferable: Often, a 32-bit OR instruction suffices
where we'd otherwise use a pair of 32-bit additions with carry.

This needs to be a DAGCombine (and not a selection rule) because its
main purpose is to enable subsequent DAGCombines for bitwise operations.
We don't want to just turn PTRADDs into disjoint ORs whenever that's
sound because this transform loses the information that the operation
implements pointer arithmetic, which we will soon need to fold offsets
into FLAT instructions. Currently, disjoint ORs can still be used for
offset folding, so that part of the logic can't be tested.

The PR contains a hacky workaround for a situation where an AssertAlign
operand of a PTRADD is not DAGCombined before the PTRADD, causing the
PTRADD to be turned into a disjoint OR although reassociating it with
the operand of the AssertAlign would be better. This wouldn't be a
problem if the DAGCombiner ensured that a node is only processed after
all its operands have been processed.

For SWDEV-516125.
---
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 35 
 .../AMDGPU/ptradd-sdag-optimizations.ll   | 56 ++-
 2 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp 
b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 78d608556f056..ffaaef65569ae 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -16145,6 +16145,41 @@ SDValue SITargetLowering::performPtrAddCombine(SDNode 
*N,
   return Folded;
   }
 
+  // Transform (ptradd a, b) -> (or disjoint a, b) if it is equivalent and if
+  // that transformation can't block an offset folding at any use of the 
ptradd.
+  // This should be done late, after legalization, so that it doesn't block
+  // other ptradd combines that could enable more offset folding.
+  bool HasIntermediateAssertAlign =
+  N0->getOpcode() == ISD::AssertAlign && N0->getOperand(0)->isAnyAdd();
+  // This is a hack to work around an ordering problem for DAGs like this:
+  //   (ptradd (AssertAlign (ptradd p, c1), k), c2)
+  // If the outer ptradd is handled first by the DAGCombiner, it can be
+  // transformed into a disjoint or. Then, when the generic AssertAlign combine
+  // pushes the AssertAlign through the inner ptradd, it's too late for the
+  // ptradd reassociation to trigger.
+  if (!DCI.isBeforeLegalizeOps() && !HasIntermediateAssertAlign &&
+  DAG.haveNoCommonBitsSet(N0, N1)) {
+bool TransformCanBreakAddrMode = any_of(N->users(), [&](SDNode *User) {
+  if (auto *LoadStore = dyn_cast<MemSDNode>(User);
+  LoadStore && LoadStore->getBasePtr().getNode() == N) {
+unsigned AS = LoadStore->getAddressSpace();
+// Currently, we only really need ptradds to fold offsets into flat
+// memory instructions.
+if (AS != AMDGPUAS::FLAT_ADDRESS)
+  return false;
+TargetLoweringBase::AddrMode AM;
+AM.HasBaseReg = true;
+EVT VT = LoadStore->getMemoryVT();
+Type *AccessTy = VT.getTypeForEVT(*DAG.getContext());
+return isLegalAddressingMode(DAG.getDataLayout(), AM, AccessTy, AS);
+  }
+  return false;
+});
+
+if (!TransformCanBreakAddrMode)
+  return DAG.getNode(ISD::OR, DL, VT, N0, N1, SDNodeFlags::Disjoint);
+  }
+
   if (N1.getOpcode() != ISD::ADD || !N1.hasOneUse())
 return SDValue();
 
diff --git a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll 
b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
index 199c1f61d2522..7d7fe141e5440 100644
--- a/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
+++ b/llvm/test/CodeGen/AMDGPU/ptradd-sdag-optimizations.ll
@@ -100,7 +100,7 @@ define void @baseptr_null(i64 %offset, i8 %v) {
 
 ; Taken from implicit-kernarg-backend-usage.ll, tests the PTRADD handling in 
the
 ; assertalign DAG combine.
-define amdgpu_kernel void @llvm_amdgcn_queue_ptr(ptr addrspace(1) %ptr)  #0 {
+define amdgpu_kernel void @llvm_amdgcn_queue_ptr(ptr addrspace(1) %ptr) {
 ; GFX942-LABEL: llvm_amdgcn_queue_ptr:
 ; GFX942:   ; %bb.0:
 ; GFX942-NEXT:v_mov_b32_e32 v0, 0
@@ -415,6 +415,60 @@ entry:
   ret void
 }
 
+; Check that ptradds can be lowered to disjoint ORs.
+define ptr @gep_disjoint_or(ptr %base) {
+; GFX942-LABEL: gep_disjoint_or:
+; GFX942:   ; %bb.0:
+; GFX942-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT:v_and_or_b32 v0, v0, -16, 4
+; GFX942-NEXT:s_setpc_b64 s[30:31]
  %p = call ptr @llvm.ptrmask(ptr %base, i64 s0xfffffffffffffff0)
+  %gep = getelementptr nuw inbounds i8, ptr %p, i64 4
+  ret ptr %gep
+}
+
+; Check that AssertAlign no

[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Remarks] Restructure bitstream remarks to be fully standalone (PR #156715)

2025-09-18 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/156715

>From d33b31f01aeeb9005581b0a2a1f21c898463aa02 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Thu, 18 Sep 2025 12:34:55 +0100
Subject: [PATCH] Replace bitstream blobs by yaml

Created using spr 1.3.7-wip
---
 llvm/lib/Remarks/BitstreamRemarkParser.cpp|   5 +-
 .../dsymutil/ARM/remarks-linking-bundle.test  |  13 +-
 .../basic1.macho.remarks.arm64.opt.bitstream  | Bin 824 -> 0 bytes
 .../basic1.macho.remarks.arm64.opt.yaml   |  47 +
 ...c1.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic2.macho.remarks.arm64.opt.bitstream  | Bin 1696 -> 0 bytes
 .../basic2.macho.remarks.arm64.opt.yaml   | 194 ++
 ...c2.macho.remarks.empty.arm64.opt.bitstream |   0
 .../basic3.macho.remarks.arm64.opt.bitstream  | Bin 1500 -> 0 bytes
 .../basic3.macho.remarks.arm64.opt.yaml   | 181 
 ...c3.macho.remarks.empty.arm64.opt.bitstream |   0
 .../fat.macho.remarks.x86_64.opt.bitstream| Bin 820 -> 0 bytes
 .../remarks/fat.macho.remarks.x86_64.opt.yaml |  53 +
 .../fat.macho.remarks.x86_64h.opt.bitstream   | Bin 820 -> 0 bytes
 .../fat.macho.remarks.x86_64h.opt.yaml|  53 +
 .../X86/remarks-linking-fat-bundle.test   |   8 +-
 16 files changed, 543 insertions(+), 11 deletions(-)
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic1.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic2.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/basic3.macho.remarks.empty.arm64.opt.bitstream
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64.opt.yaml
 delete mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.bitstream
 create mode 100644 
llvm/test/tools/dsymutil/Inputs/private/tmp/remarks/fat.macho.remarks.x86_64h.opt.yaml

diff --git a/llvm/lib/Remarks/BitstreamRemarkParser.cpp 
b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
index 63b16bd2df0ec..2b27a0f661d88 100644
--- a/llvm/lib/Remarks/BitstreamRemarkParser.cpp
+++ b/llvm/lib/Remarks/BitstreamRemarkParser.cpp
@@ -411,9 +411,8 @@ Error BitstreamRemarkParser::processExternalFilePath() {
 return E;
 
   if (ContainerType != BitstreamRemarkContainerType::RemarksFile)
-return error(
-"Error while parsing external file's BLOCK_META: wrong container "
-"type.");
+return ParserHelper->MetaHelper.error(
+"Wrong container type in external file.");
 
   return Error::success();
 }
diff --git a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test 
b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
index 09a60d7d044c6..e1b04455b0d9d 100644
--- a/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
+++ b/llvm/test/tools/dsymutil/ARM/remarks-linking-bundle.test
@@ -1,22 +1,25 @@
 RUN: rm -rf %t
-RUN: mkdir -p %t
+RUN: mkdir -p %t/private/tmp/remarks
 RUN: cat %p/../Inputs/remarks/basic.macho.remarks.arm64> 
%t/basic.macho.remarks.arm64
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic1.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic1.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic2.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic2.macho.remarks.arm64.opt.bitstream
+RUN: llvm-remarkutil yaml2bitstream 
%p/../Inputs/private/tmp/remarks/basic3.macho.remarks.arm64.opt.yaml -o 
%t/private/tmp/remarks/basic3.macho.remarks.arm64.opt.bitstream
 
-RUN: dsymutil -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.remarks.arm64
+RUN: dsymutil -oso-prepend-path=%p/../Inputs -remarks-prepend-path=%t 
%t/basic.macho.remarks.arm64
 
 Check that the remark file in the bundle exists and is sane:
 RUN: llvm-bcanalyzer -dump 
%t/basic.macho.remarks.arm64.dSYM/Contents/Resources/Remarks/basic.macho.remarks.arm64
 | FileCheck %s
 
-RUN: dsymutil --linker parallel -oso-prepend-path=%p/../Inputs 
-remarks-prepend-path=%p/../Inputs %t/basic.macho.remar

[llvm-branch-commits] [llvm] [AllocToken, Clang] Implement __builtin_infer_alloc_token() and llvm.alloc.token.id (PR #156842)

2025-09-18 Thread Marco Elver via llvm-branch-commits

https://github.com/melver edited 
https://github.com/llvm/llvm-project/pull/156842
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][sanitizer] fix msghdr for musl (PR #159551)

2025-09-18 Thread Deák Lajos via llvm-branch-commits

https://github.com/deaklajos created 
https://github.com/llvm/llvm-project/pull/159551

Backports: 3fc723ec2cf1965aa4eec8883957fbbe1b2e7027 (#136195)

Ran into the issue on Alpine when building with TSAN that `__sanitizer_msghdr` 
and the `msghdr` provided by musl did not match. This caused lots of tsan 
reports and an eventual termination of the application by the oom during a 
`sendmsg`.

From 60b10f56319e62415c61e69c67f9c713ed81172e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?De=C3=A1k=20Lajos?=
 <[email protected]>
Date: Tue, 22 Jul 2025 20:31:28 +0200
Subject: [PATCH] [compiler-rt][sanitizer] fix msghdr for musl (#136195)

Ran into the issue on Alpine when building with TSAN that
`__sanitizer_msghdr` and the `msghdr` provided by musl did not match.
This caused lots of tsan reports and an eventual termination of the
application by the oom during a `sendmsg`.
---
 .../sanitizer_platform_limits_posix.h | 24 +++
 1 file changed, 24 insertions(+)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h 
b/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
index f118d53f0df80..24966523f3a02 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -478,6 +478,30 @@ struct __sanitizer_cmsghdr {
   int cmsg_level;
   int cmsg_type;
 };
+#  elif SANITIZER_MUSL
+struct __sanitizer_msghdr {
+  void *msg_name;
+  unsigned msg_namelen;
+  struct __sanitizer_iovec *msg_iov;
+  int msg_iovlen;
+#if SANITIZER_WORDSIZE == 64
+  int __pad1;
+#endif
+  void *msg_control;
+  unsigned msg_controllen;
+#if SANITIZER_WORDSIZE == 64
+  int __pad2;
+#endif
+  int msg_flags;
+};
+struct __sanitizer_cmsghdr {
+  unsigned cmsg_len;
+#if SANITIZER_WORDSIZE == 64
+  int __pad1;
+#endif
+  int cmsg_level;
+  int cmsg_type;
+};
 #  else
 // In POSIX, int msg_iovlen; socklen_t msg_controllen; socklen_t cmsg_len; but
 // many implementations don't conform to the standard.

___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] release/21.x: [compiler-rt][sanitizer] fix msghdr for musl (PR #159551)

2025-09-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Deák Lajos (deaklajos)


Changes

Backports: 3fc723ec2cf1965aa4eec8883957fbbe1b2e7027 (#136195)

Ran into the issue on Alpine when building with TSAN that `__sanitizer_msghdr` 
and the `msghdr` provided by musl did not match. This caused lots of tsan 
reports and an eventual termination of the application by the oom during a 
`sendmsg`.

---
Full diff: https://github.com/llvm/llvm-project/pull/159551.diff


1 Files Affected:

- (modified) compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h 
(+24) 


``diff
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h 
b/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
index f118d53f0df80..24966523f3a02 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -478,6 +478,30 @@ struct __sanitizer_cmsghdr {
   int cmsg_level;
   int cmsg_type;
 };
+#  elif SANITIZER_MUSL
+struct __sanitizer_msghdr {
+  void *msg_name;
+  unsigned msg_namelen;
+  struct __sanitizer_iovec *msg_iov;
+  int msg_iovlen;
+#if SANITIZER_WORDSIZE == 64
+  int __pad1;
+#endif
+  void *msg_control;
+  unsigned msg_controllen;
+#if SANITIZER_WORDSIZE == 64
+  int __pad2;
+#endif
+  int msg_flags;
+};
+struct __sanitizer_cmsghdr {
+  unsigned cmsg_len;
+#if SANITIZER_WORDSIZE == 64
+  int __pad1;
+#endif
+  int cmsg_level;
+  int cmsg_type;
+};
 #  else
 // In POSIX, int msg_iovlen; socklen_t msg_controllen; socklen_t cmsg_len; but
 // many implementations don't conform to the standard.

``




https://github.com/llvm/llvm-project/pull/159551
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


  1   2   >