date:20250410

[llvm-branch-commits] [llvm] [InstCombine] Handle "add like" in ADD+GEP->GEP+GEP rewrites (PR #135156)

2025-04-10 Thread Nikita Popov via llvm-branch-commits


nikic wrote:

See also https://github.com/llvm/llvm-project/pull/76981.

https://github.com/llvm/llvm-project/pull/135156
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)

2025-04-10 Thread via llvm-branch-commits


github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:



You can test this locally with the following command:


```bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
```





View the diff from clang-format here.


```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index b5e085be9..09b6f4880 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3091,8 +3091,7 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
 DL.getIndexSizeInBits(PtrOp->getType()->getPointerAddressSpace());
 APInt BasePtrOffset(IdxWidth, 0);
 Value *UnderlyingPtrOp =
-PtrOp->stripAndAccumulateInBoundsConstantOffsets(DL,
- BasePtrOffset);
+PtrOp->stripAndAccumulateInBoundsConstantOffsets(DL, BasePtrOffset);
 bool CanBeNull, CanBeFreed;
 uint64_t DerefBytes = UnderlyingPtrOp->getPointerDereferenceableBytes(
 DL, CanBeNull, CanBeFreed);
@@ -3120,7 +3119,8 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
   // These rewrites is trying to preserve inbounds/nuw attributes. So we want to
   // do this after having tried to derive "nuw" above.
   if (GEP.getNumIndices() == 1) {
-auto GetPreservedNoWrapFlags = [&](bool AddIsNUW, Value *Idx1, Value *Idx2) {
+auto GetPreservedNoWrapFlags = [&](bool AddIsNUW, Value *Idx1,
+   Value *Idx2) {
   // Preserve "inbounds nuw" if the original gep is "inbounds nuw", and the
   // add is "nuw". Preserve "nuw" if the original gep is "nuw", and the add
   // is "nuw".
@@ -3160,8 +3160,8 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
   // as:
   // %newptr = getelementptr i32, ptr %ptr, i32 %idx1
   // %newgep = getelementptr i32, ptr %newptr, i32 idx2
-  bool NUW = match(GEP.getOperand(1), m_NNegZExt(m_NUWAddLike(m_Value(),
-  m_Value())));
+  bool NUW = match(GEP.getOperand(1),
+   m_NNegZExt(m_NUWAddLike(m_Value(), m_Value())));
   GEPNoWrapFlags NWFlags = GetPreservedNoWrapFlags(NUW, Idx1, C);
   auto *NewPtr = Builder.CreateGEP(
   GEP.getSourceElementType(), GEP.getPointerOperand(),
```




https://github.com/llvm/llvm-project/pull/135155

[llvm-branch-commits] [clang] release/20.x: [clang-format] Keep the space between `not` and a unary operator (#135035) (PR #135118)

2025-04-10 Thread Björn Schäpers via llvm-branch-commits


https://github.com/HazardyKnusperkeks approved this pull request.


https://github.com/llvm/llvm-project/pull/135118

[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-10 Thread Kai Nacke via llvm-branch-commits



@@ -2759,6 +2762,29 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForLSDA(
 
//===--===//
 TargetLoweringObjectFileGOFF::TargetLoweringObjectFileGOFF() = default;
 
+void TargetLoweringObjectFileGOFF::getModuleMetadata(Module &M) {
+  // Construct the default names for the root SD and the ADA PR symbol.
+  StringRef FileName = sys::path::stem(M.getSourceFileName());
+  if (FileName.size() > 1 && FileName.starts_with('<') &&
+  FileName.ends_with('>'))
+FileName = FileName.substr(1, FileName.size() - 2);
+  DefaultRootSDName = Twine(FileName).concat("#C").str();

redstar wrote:

Using a name and setting the binding scope to "section scope" is similar to 
using " " as name and leaving the binding scope unspecified.
The XLC and Open XL compilers provide a command line option to change this name 
(XLC: `-qcsect`, `-qnocsect`; Open XL: `-mcsect`, `-mnocsect`). The Open XL 
compiler defaults to the variant coded here but that can be changed to having a 
name with binding scope unspecified (`-mcsect=a`) or set to space (`-mnocsect`).
Using `-mcsect=a` results in exactly the problem you describe.
The compiler option will be added later to clang, along with the required code 
here.

The front end (aka clang) provides this value in the `source_filename` 
property. All strings in LLVM/clang are in UTF-8 so there is no other choice. 
The same problem arises for symbols derived from function names etc.
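
For readers following along, the default name derivation in the quoted hunk can be sketched outside LLVM as below. This is only an illustration under stated assumptions: `stem` is a simplified stand-in for `llvm::sys::path::stem` that handles `/` separators only, and `defaultRootSDName` is a hypothetical free function, not part of the patch.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Simplified stand-in for llvm::sys::path::stem (assumption: '/' separators
// only): drop the directory part, then drop the last extension.
inline std::string_view stem(std::string_view path) {
  const auto slash = path.find_last_of('/');
  if (slash != std::string_view::npos)
    path.remove_prefix(slash + 1);
  if (path == "." || path == "..")
    return path;
  const auto dot = path.find_last_of('.');
  return dot == std::string_view::npos ? path : path.substr(0, dot);
}

// Hypothetical helper mirroring the quoted hunk: take the stem of the
// module's source_filename, strip one pair of surrounding angle brackets
// (e.g. "<stdin>"), and append "#C".
inline std::string defaultRootSDName(std::string_view sourceFileName) {
  std::string_view name = stem(sourceFileName);
  if (name.size() > 1 && name.front() == '<' && name.back() == '>')
    name = name.substr(1, name.size() - 2);
  return std::string(name) + "#C";
}
```

With this sketch, a module whose `source_filename` is `src/hello.c` would get `hello#C`, and `<stdin>` would get `stdin#C`.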

https://github.com/llvm/llvm-project/pull/133799

[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)

2025-04-10 Thread Björn Pettersson via llvm-branch-commits


https://github.com/bjope updated https://github.com/llvm/llvm-project/pull/135155

From 0abeb7b7eea0e15e15c98a8e4f8501fde81d4811 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson 
Date: Tue, 11 Mar 2025 16:27:43 +0100
Subject: [PATCH 1/3] [InstCombine] Improve inbounds preservation for ADD+GEP
 -> GEP+GEP

Given that we have an "add nuw" and a "getelementptr inbounds nuw" like
this:
   %idx = add nuw i64 %idx1, %idx2
   %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that
into two getelementptr instructions:
   %gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
   %gep = getelementptr inbounds nuw i32, ptr %gep1, i64 %idx2

Similarly for just having "nuw" instead of "inbounds nuw" on the
getelementptr.

Proof: https://alive2.llvm.org/ce/z/4uhfDq
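
As an illustration only, the order in which the patch's lambda checks these conditions can be modelled outside LLVM roughly as follows. This is a minimal sketch under stated assumptions: `GepFlags`, `AddGepFacts` and `preservedFlags` are hypothetical names rather than LLVM APIs, and `idxNonNegative` stands in for the two `isKnownNonNegative` queries.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for LLVM's GEPNoWrapFlags (not the real class).
enum GepFlags : std::uint8_t {
  None = 0,
  InBounds = 1 << 0,
  NoUnsignedWrap = 1 << 1,
};

// Facts about the original add + getelementptr pair.
struct AddGepFacts {
  bool gepInBounds;    // original gep carries "inbounds"
  bool gepNUW;         // original gep carries "nuw"
  bool addNSW;         // the add carries "nsw"
  bool addNUW;         // the add carries "nuw"
  bool idxNonNegative; // both add operands are known to be non-negative
};

// Returns the flags that both new GEPs may carry, checking the cases in the
// same order as the lambda in the patch.
constexpr std::uint8_t preservedFlags(const AddGepFacts &f) {
  // "inbounds nuw" survives if the gep had both flags and the add is "nuw".
  if (f.gepInBounds && f.gepNUW && f.addNUW)
    return InBounds | NoUnsignedWrap;
  // "inbounds" survives if the add is "nsw" and its operands are non-negative.
  if (f.gepInBounds && f.addNSW && f.idxNonNegative)
    return InBounds;
  // Plain "nuw" survives if the gep is "nuw" and the add is "nuw".
  if (f.gepNUW && f.addNUW)
    return NoUnsignedWrap;
  return None;
}
```

Note that an "inbounds" gep without "nuw" combined with an "add nuw" falls through to `None` in this model unless the nsw and non-negative conditions also hold.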
---
 .../InstCombine/InstructionCombining.cpp  | 43 +++
 llvm/test/Transforms/InstCombine/array.ll | 10 ++---
 2 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 856e02c9f1ddb..19a818f4baa30 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
 return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-// We can only preserve inbounds if the original gep is inbounds, the add
-// is nsw, and the add operands are non-negative.
-auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
+  Value *Idx2) {
+  // Preserve "inbounds nuw" if the original gep is "inbounds nuw",
+  // and the add is "nuw".
+  if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
+return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
+  // Preserve "inbounds" if the original gep is "inbounds", the add
+  // is "nsw", and the add operands are non-negative.
   SimplifyQuery Q = SQ.getWithInstruction(&GEP);
-  return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
- isKnownNonNegative(Idx2, Q);
+  if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
+  isKnownNonNegative(Idx2, Q))
+return GEPNoWrapFlags::inBounds();
+  // Preserve "nuw" if the original gep is "nuw", and the add is "nuw".
+  if (GEP.hasNoUnsignedWrap() && AddIsNUW)
+return GEPNoWrapFlags::noUnsignedWrap();
+  return GEPNoWrapFlags::none();
 };
 
 // Try to replace ADD + GEP with GEP + GEP.
@@ -3104,15 +3114,15 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
   // as:
   //   %newptr = getelementptr i32, ptr %ptr, i64 %idx1
   //   %newgep = getelementptr i32, ptr %newptr, i64 %idx2
-  bool IsInBounds = CanPreserveInBounds(
-  cast<OverflowingBinaryOperator>(GEP.getOperand(1))->hasNoSignedWrap(),
-  Idx1, Idx2);
+  bool NSW = match(GEP.getOperand(1), m_NSWAddLike(m_Value(), m_Value()));
+  bool NUW = match(GEP.getOperand(1), m_NUWAddLike(m_Value(), m_Value()));
+  GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(NSW, NUW, Idx1, Idx2);
   auto *NewPtr =
   Builder.CreateGEP(GEP.getSourceElementType(), GEP.getPointerOperand(),
-Idx1, "", IsInBounds);
-  return replaceInstUsesWith(
-  GEP, Builder.CreateGEP(GEP.getSourceElementType(), NewPtr, Idx2, "",
- IsInBounds));
+Idx1, "", NWFlags);
+  return replaceInstUsesWith(GEP,
+ Builder.CreateGEP(GEP.getSourceElementType(),
+   NewPtr, Idx2, "", NWFlags));
 }
 ConstantInt *C;
 if (match(GEP.getOperand(1), m_OneUse(m_SExtLike(m_OneUse(m_NSWAdd(
@@ -3123,17 +3133,16 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
   // as:
   // %newptr = getelementptr i32, ptr %ptr, i32 %idx1
   // %newgep = getelementptr i32, ptr %newptr, i32 idx2
-  bool IsInBounds = CanPreserveInBounds(
-  /*IsNSW=*/true, Idx1, C);
+  GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(
+  /*IsNSW=*/true, /*IsNUW=*/false, Idx1, C);
   auto *NewPtr = Builder.CreateGEP(
   GEP.getSourceElementType(), GEP.getPointerOperand(),
-  Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "",
-  IsInBounds);
+  Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "", NWFlags);
   return replaceInstUsesWith(
   GEP,
   Builder.CreateGEP(GEP.getSourceElementType(), NewPtr,
 Builder.CreateSExt(C, 
GEP.getOperand(1)

[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


https://github.com/jmorse commented:

Are there any expected interactions between atom-group-numbers and loading 
bitcode? i.e., if we serialise the literal atom-group-number to the output and 
then read it back in again, then it might conflict with atom-group-numbers seen 
in other functions in other bitcode files. It doesn't appear that they get 
re-numbered in the textual IR parsing patch for example.

Possibly part of the design here is to simply not care, if it's only about 
internal consistency within a Function (does that hold after inlining too?). 
Apologies if this is all explained in a later patch.

The answers to that should ultimately be documented somewhere; I imagine that's 
in the patch stack or coming later.

https://github.com/llvm/llvm-project/pull/133478

[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133478

[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -1366,6 +1367,43 @@ TEST_F(DILocationTest, discriminatorSpecialCases) {
   EXPECT_EQ(std::nullopt, L4->cloneByMultiplyingDuplicationFactor(0x1000));
 }
 
+TEST_F(DILocationTest, KeyInstructions) {
+  Context.pImpl->NextAtomGroup = 1;
+
+  EXPECT_EQ(Context.pImpl->NextAtomGroup, 1u);
+  DILocation *A1 = DILocation::get(Context, 1, 0, getSubprogram(), nullptr, false, 1, 2);
+  // The group is only applied to the DILocation if the build has opted into
+  // the additional DILocation fields needed for the feature.

jmorse wrote:

Style nit: I feel "the build has opted into" is a bit too abstract, and is like 
the code referring to itself in the third person. "if we have been built 
with..." feels a lot cleaner IMHO, YMMV.

https://github.com/llvm/llvm-project/pull/133478

[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:

@RKSimon What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/135191

[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)

2025-04-10 Thread via llvm-branch-commits


https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/135191

Backport 08e080ee98832c2aec6f379b04f486bea18730cc

Requested by: @RKSimon

From bcb6ae86466e917786310b3133a2df6a776923fa Mon Sep 17 00:00:00 2001
From: Stefan Schmidt 
Date: Wed, 9 Apr 2025 11:19:26 +0200
Subject: [PATCH] [X86][SSE] Don't emit SSE2 load instructions in SSE1-only
 mode (#134547)

This fixes a regression I traced back to
https://github.com/llvm/llvm-project/commit/8b43c1be23119c1024bed0a8ce392bc73727e2e2
/ https://github.com/llvm/llvm-project/pull/79000

The regression caused an SSE2 instruction, `movsd`, to be emitted as a
replacement for an SSE instruction, `movaps`, even though the target
might not support `movsd`, such as when building with clang using
`-march=pentium3`.

Fixes #134607

(cherry picked from commit 08e080ee98832c2aec6f379b04f486bea18730cc)
---
 .../Target/X86/X86FixupVectorConstants.cpp| 11 ++
 llvm/test/CodeGen/X86/pr134607.ll | 20 +++
 2 files changed, 27 insertions(+), 4 deletions(-)
 create mode 100644 llvm/test/CodeGen/X86/pr134607.ll

diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 453898e132ca4..9dc392d6e9626 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -333,6 +333,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
  MachineInstr &MI) {
   unsigned Opc = MI.getOpcode();
   MachineConstantPool *CP = MI.getParent()->getParent()->getConstantPool();
+  bool HasSSE2 = ST->hasSSE2();
   bool HasSSE41 = ST->hasSSE41();
   bool HasAVX2 = ST->hasAVX2();
   bool HasDQI = ST->hasDQI();
@@ -394,11 +395,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
   case X86::MOVAPDrm:
   case X86::MOVAPSrm:
   case X86::MOVUPDrm:
-  case X86::MOVUPSrm:
+  case X86::MOVUPSrm: {
 // TODO: SSE3 MOVDDUP Handling
-return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
-  {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
- 128, 1);
+FixupEntry Fixups[] = {
+{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
+{HasSSE2 ? X86::MOVSDrm : 0, 1, 64, rebuildZeroUpperCst}};
+return FixupConstant(Fixups, 128, 1);
+  }
   case X86::VMOVAPDrm:
   case X86::VMOVAPSrm:
   case X86::VMOVUPDrm:
diff --git a/llvm/test/CodeGen/X86/pr134607.ll b/llvm/test/CodeGen/X86/pr134607.ll
new file mode 100644
index 0..5e824c22e5a22
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr134607.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -mattr=+sse -O3 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE1
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE2
+
+define void @store_v2f32_constant(ptr %v) {
+; X86-LABEL: store_v2f32_constant:
+; X86:   # %bb.0:
+; X86-NEXT:movl 4(%esp), %eax
+; X86-NEXT:movaps {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+
+; X64-SSE1-LABEL: store_v2f32_constant:
+; X64-SSE1:   # %bb.0:
+; X64-SSE1-NEXT:movaps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+
+; X64-SSE2-LABEL: store_v2f32_constant:
+; X64-SSE2:   # %bb.0:
+; X64-SSE2-NEXT:movsd {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+  store <2 x float> , ptr %v, align 4
+  ret void
+}


[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -335,6 +335,14 @@ class LLVMContext {
   StringRef getDefaultTargetFeatures();
   void setDefaultTargetFeatures(StringRef Features);
 
+  /// Key Instructions: update the highest number atom group emitted for any
+  /// function.
+  void updateAtomGroupWaterline(uint64_t G);
+
+  /// Key Instructions: get the next free atom group number and increment
+  /// the global tracker.
+  uint64_t incNextAtomGroup();
+

jmorse wrote:

IMO in isolation it's not clear that this is to do with debugging information 
and source locations; could we shoe-horn `DILocation` into the comments to make 
it clear what it affects?

(Thinking purely about someone stumbling on this and not immediately knowing 
whether it's relevant to what they're studying)
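
To make the two declarations under review concrete, here is a minimal stand-alone model of how such a waterline counter could behave. It is a sketch, not the patch's implementation: `AtomGroupTracker` is a hypothetical class, and the rule that the waterline moves the next free number to at least `G + 1` is an assumption drawn from the doc comments in the quoted hunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the atom group counter kept by LLVMContext.
class AtomGroupTracker {
  // Assumption: group numbering starts at 1.
  std::uint64_t NextAtomGroup = 1;

public:
  // Record that atom group G is already in use (for example on a DILocation
  // that was read back in), so later allocations start above it.
  void updateAtomGroupWaterline(std::uint64_t G) {
    NextAtomGroup = std::max(NextAtomGroup, G + 1);
  }

  // Hand out the next free atom group number and advance the tracker.
  std::uint64_t incNextAtomGroup() { return NextAtomGroup++; }
};
```

In this model a lower waterline never moves the counter backwards, which is the property that would keep newly allocated groups from colliding with ones already attached to source locations.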

https://github.com/llvm/llvm-project/pull/133478

[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)

2025-04-10 Thread via llvm-branch-commits


github-actions[bot] wrote:

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo. Please turn off [Keep my email addresses private](https://github.com/settings/emails) setting in your account. See [LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.

https://github.com/llvm/llvm-project/pull/135191

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits


https://github.com/kaadam updated https://github.com/llvm/llvm-project/pull/129231

From 93c958c3f016092c340e897aeabbb470e58b9dbb Mon Sep 17 00:00:00 2001
From: Adam Kallai 
Date: Wed, 19 Feb 2025 17:00:47 +0100
Subject: [PATCH 1/2] Add initial support for SPE brstack

Perf will be able to report SPE branch events in a similar way to how it
reports LBR brstack.
Therefore we can reuse the existing LBR parsing process for SPE as well.

Example of the SPE brstack input format:

perf script -i perf.data -F pid,brstack --itrace=bl
---
PID    FROM         TO           PREDICTED
---
16984  0x72e342e5f4/0x72e36192d0/M/-/-/11/RET/-
16984  0x72e7b8b3b4/0x72e7b8b3b8/PN/-/-/11/COND/-
16984  0x72e7b92b48/0x72e7b92b4c/PN/-/-/8/COND/-
16984  0x72eacc6b7c/0x760cc94b00/P/-/-/9/RET/-
16984  0x72e3f210fc/0x72e3f21068/P/-/-/4//-
16984  0x72e39b8c5c/0x72e3627b24/P/-/-/4//-
16984  0x72e7b89d20/0x72e7b92bbc/P/-/-/4/RET/-

The SPE brstack mispredicted flag might be two characters long: 'PN' or 'MN',
where 'N' means the branch was marked as NOT-TAKEN. This event only relates to
conditional instructions (conditional branch or compare-and-branch) and
indicates that the instruction failed its condition code check.
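
The acceptance rule for this field can be sketched in isolation as follows. This is an illustrative model and not the BOLT code: `parseMispredFlag` is a hypothetical helper, and mapping 'M' to "mispredicted" follows the expectations in the unit test added by this patch.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical helper: validate the mispredict field of a brstack entry.
// Returns true for a mispredicted branch, false for a predicted one (or the
// '-' placeholder), and std::nullopt if the field is malformed.
inline std::optional<bool> parseMispredFlag(std::string_view flag,
                                            bool armSPE) {
  // In SPE mode a two-character field is accepted only when the second
  // character is 'N' (not taken); otherwise exactly one character is expected.
  const bool properSize = (flag.size() == 2 && armSPE) ? flag[1] == 'N'
                                                       : flag.size() == 1;
  if (!properSize)
    return std::nullopt;
  const char first = flag[0];
  if (first != 'P' && first != 'M' && first != '-')
    return std::nullopt;
  return first == 'M';
}
```

Under this model 'PN' and 'MN' are rejected unless SPE mode is enabled, so LBR input keeps its single-character check.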

Perf with 'brstack' support for SPE is available here:
```
https://github.com/Leo-Yan/linux/tree/perf_arm_spe_branch_flags_v2
```

Example of usage with SPE perf data:
```bash
perf2bolt -p perf.data -o perf.fdata --spe BINARY
```

Capture standard SPE branch events with perf:
```bash
perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
```

A unit test is also added to check the parsing process of the 'SPE brstack format'.
---
 bolt/lib/Profile/DataAggregator.cpp   | 60 ++--
 .../test/perf2bolt/AArch64/perf2bolt-spe.test |  2 +-
 bolt/unittests/Profile/PerfSpeEvents.cpp  | 71 +++
 3 files changed, 109 insertions(+), 24 deletions(-)

diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index cce9fdbef99bd..4af3a493b8be6 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -49,12 +49,10 @@ static cl::opt
  cl::desc("aggregate basic samples (without LBR info)"),
  cl::cat(AggregatorCategory));
 
-cl::opt ArmSPE(
-"spe",
-cl::desc(
-"Enable Arm SPE mode. Used in conjuction with no-lbr mode, ie `--spe "
-"--nl`"),
-cl::cat(AggregatorCategory));
+cl::opt ArmSPE("spe",
+ cl::desc("Enable Arm SPE mode. Can combine with `--nl` "
+  "to use in no-lbr mode"),
+ cl::cat(AggregatorCategory));
 
 static cl::opt
 ITraceAggregation("itrace",
@@ -180,13 +178,16 @@ void DataAggregator::start() {
 
   if (opts::ArmSPE) {
 if (!opts::BasicAggregation) {
-  errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with "
-"BasicAggregation.\n";
-  exit(1);
+  // pid    from_ip  to_ip    predicted?
+  // 12345  0x123/0x456/P/-/-/8/RET/-
+  launchPerfProcess("SPE branch events", MainEventsPPI,
+"script -F pid,brstack --itrace=bl",
+/*Wait = */ false);
+} else {
+  launchPerfProcess("SPE brstack events", MainEventsPPI,
+"script -F pid,event,ip,addr --itrace=i1i",
+/*Wait = */ false);
 }
-launchPerfProcess("branch events with SPE", MainEventsPPI,
-  "script -F pid,event,ip,addr --itrace=i1i",
-  /*Wait = */ false);
   } else if (opts::BasicAggregation) {
 launchPerfProcess("events without LBR", MainEventsPPI,
   "script -F pid,event,ip",
@@ -527,8 +528,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
 }
 exit(0);
   }
-
-  if (((!opts::BasicAggregation && !opts::ArmSPE) && parseBranchEvents()) ||
+  if ((!opts::BasicAggregation && parseBranchEvents()) ||
   (opts::BasicAggregation && opts::ArmSPE && parseSpeAsBasicEvents()) ||
   (opts::BasicAggregation && parseBasicEvents()))
 errs() << "PERF2BOLT: failed to parse samples\n";
@@ -1034,7 +1034,11 @@ ErrorOr DataAggregator::parseLBREntry() {
   if (std::error_code EC = MispredStrRes.getError())
 return EC;
   StringRef MispredStr = MispredStrRes.get();
-  if (MispredStr.size() != 1 ||
+  // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'.
+  bool ProperStrSize = (MispredStr.size() == 2 && opts::ArmSPE)
+   ? (MispredStr[1] == 'N')
+   : (MispredStr.size() == 1);
+  if (!ProperStrSize ||
   (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-')) {
 reportError("expected single char for mispred bit");
 Diag << "Found: " << MispredStr << "\n";
@@ -1565,9 +1569,11 @@ uint64_t DataAggregator::parseLBRSample(const PerfBranchSample &Sample,
 }
 
 std::error_code DataAggregator::parseBranchEvents() {
-  outs() << "PERF2BOLT

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits



@@ -113,6 +153,37 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranches) {
   EXPECT_TRUE(checkEvents(1234, 10, {"branches-spe:"}));
 }
 
+TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstack) {
+  // Check perf input with SPE branch events as brstack format.
+  // Example collection command:
+  // ```
+  // perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
+  // ```
+  // How Bolt extracts the branch events:
+  // ```
+  // perf script -F pid,brstack --itrace=bl
+  // ```
+
+  opts::ArmSPE = true;
+  opts::ReadPerfEvents = "  1234  0xa001/0xa002/PN/-/-/10/COND/-\n"
+ "  1234  0xb001/0xb002/P/-/-/4/RET/-\n"
+ "  1234  0xc001/0xc002/P/-/-/13/-/-\n"
+ "  1234  0xd001/0xd002/M/-/-/7/RET/-\n"
+ "  1234  0xe001/0xe002/P/-/-/14/RET/-\n"
+ "  1234  0xf001/0xf002/MN/-/-/8/COND/-\n";
+
+  LBREntry Entry1 = {0xa001, 0xa002, false};
+  LBREntry Entry2 = {0xb001, 0xb002, false};
+  LBREntry Entry3 = {0xc001, 0xc002, false};
+  LBREntry Entry4 = {0xd001, 0xd002, true};
+  LBREntry Entry5 = {0xe001, 0xe002, false};
+  LBREntry Entry6 = {0xf001, 0xf002, true};
+  std::vector> ExpectedSamples = {
+  {{Entry1}}, {{Entry2}}, {{Entry3}}, {{Entry4}}, {{Entry5}}, {{Entry6}},
+  };

kaadam wrote:

Simplified, thanks for the hint. 

https://github.com/llvm/llvm-project/pull/129231

[llvm-branch-commits] [llvm] 6c36439 - Revert "Remember LLVM_ENABLE_LIBCXX setting in installed configuration (#134990)"

2025-04-10 Thread via llvm-branch-commits


Author: Michael Kruse
Date: 2025-04-10T11:54:53+02:00
New Revision: 6c3643905210c717831a18d4af1ae921a1ad9f74

URL: https://github.com/llvm/llvm-project/commit/6c3643905210c717831a18d4af1ae921a1ad9f74
DIFF: https://github.com/llvm/llvm-project/commit/6c3643905210c717831a18d4af1ae921a1ad9f74.diff

LOG: Revert "Remember LLVM_ENABLE_LIBCXX setting in installed configuration (#134990)"

This reverts commit 785e7f06ddb1ba36aa679d23436726dcf61f8afb.

Added: 


Modified: 
llvm/cmake/modules/LLVMConfig.cmake.in

Removed: 




diff  --git a/llvm/cmake/modules/LLVMConfig.cmake.in b/llvm/cmake/modules/LLVMConfig.cmake.in
index 1c34073f6b910..5ccc66b8039bf 100644
--- a/llvm/cmake/modules/LLVMConfig.cmake.in
+++ b/llvm/cmake/modules/LLVMConfig.cmake.in
@@ -55,8 +55,6 @@ endif()
 
 set(LLVM_ENABLE_RTTI @LLVM_ENABLE_RTTI@)
 
-set(LLVM_ENABLE_LIBCXX @LLVM_ENABLE_LIBCXX@)
-
 set(LLVM_ENABLE_LIBEDIT @HAVE_LIBEDIT@)
 if(LLVM_ENABLE_LIBEDIT)
   find_package(LibEdit)




[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits



@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job
 RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe
 RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR
 
-CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation.
+CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events

kaadam wrote:

Thanks for clarifying this. Updated this test.

https://github.com/llvm/llvm-project/pull/129231

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits


https://github.com/kaadam updated https://github.com/llvm/llvm-project/pull/129231

From 93c958c3f016092c340e897aeabbb470e58b9dbb Mon Sep 17 00:00:00 2001
From: Adam Kallai 
Date: Wed, 19 Feb 2025 17:00:47 +0100
Subject: [PATCH 1/3] Add initial support for SPE brstack

Perf will be able to report SPE branch events in a similar way to how it
reports LBR brstack.
Therefore we can reuse the existing LBR parsing process for SPE as well.

Example of the SPE brstack input format:

perf script -i perf.data -F pid,brstack --itrace=bl
---
PID    FROM         TO           PREDICTED
---
16984  0x72e342e5f4/0x72e36192d0/M/-/-/11/RET/-
16984  0x72e7b8b3b4/0x72e7b8b3b8/PN/-/-/11/COND/-
16984  0x72e7b92b48/0x72e7b92b4c/PN/-/-/8/COND/-
16984  0x72eacc6b7c/0x760cc94b00/P/-/-/9/RET/-
16984  0x72e3f210fc/0x72e3f21068/P/-/-/4//-
16984  0x72e39b8c5c/0x72e3627b24/P/-/-/4//-
16984  0x72e7b89d20/0x72e7b92bbc/P/-/-/4/RET/-

The SPE brstack mispredicted flag may be two characters long: 'PN' or 'MN',
where 'N' means the branch was marked as NOT-TAKEN. This event applies only
to conditional instructions (conditional branch or compare-and-branch) and
indicates that the instruction failed its condition code check.
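
As a rough illustration of the flag format described above, the following standalone sketch decodes one prediction field. It is not the code from this patch: `PredictionFlag`, `decodePredictionFlag` and the `AllowSpeSuffix` parameter are hypothetical names, with `AllowSpeSuffix` standing in for the effect of `--spe`.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type (not part of BOLT): one decoded prediction field.
struct PredictionFlag {
  bool Valid;        // the field was well-formed
  bool Mispredicted; // the first character was 'M'
  bool NotTaken;     // the SPE-only trailing 'N' was present
};

// Decodes "P", "M" or "-", and additionally "PN"/"MN"/"-N" when
// AllowSpeSuffix is set (assumed here to mirror what --spe enables).
inline PredictionFlag decodePredictionFlag(const std::string &Field,
                                           bool AllowSpeSuffix) {
  PredictionFlag Result{false, false, false};
  if (Field.empty() || Field.size() > 2)
    return Result;
  // A second character is only legal in SPE mode, and only as 'N'.
  if (Field.size() == 2 && !(AllowSpeSuffix && Field[1] == 'N'))
    return Result;
  const char First = Field[0];
  if (First != 'P' && First != 'M' && First != '-')
    return Result;
  Result.Valid = true;
  Result.Mispredicted = (First == 'M');
  Result.NotTaken = (Field.size() == 2);
  assert(!Result.NotTaken || AllowSpeSuffix);
  return Result;
}
```

With this sketch, `decodePredictionFlag("PN", true)` is accepted and marked as not taken, while the same field is rejected when `AllowSpeSuffix` is false.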

Perf with 'brstack' support for SPE is available here:
```
https://github.com/Leo-Yan/linux/tree/perf_arm_spe_branch_flags_v2
```

Example of usage with SPE perf data:
```bash
perf2bolt -p perf.data -o perf.fdata --spe BINARY
```

Capture standard SPE branch events with perf:
```bash
perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
```

A unit test is also added to check the parsing of the SPE brstack format.
---
 bolt/lib/Profile/DataAggregator.cpp   | 60 ++--
 .../test/perf2bolt/AArch64/perf2bolt-spe.test |  2 +-
 bolt/unittests/Profile/PerfSpeEvents.cpp  | 71 +++
 3 files changed, 109 insertions(+), 24 deletions(-)

diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index cce9fdbef99bd..4af3a493b8be6 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -49,12 +49,10 @@ static cl::opt
  cl::desc("aggregate basic samples (without LBR info)"),
  cl::cat(AggregatorCategory));
 
-cl::opt<bool> ArmSPE(
-"spe",
-cl::desc(
-"Enable Arm SPE mode. Used in conjuction with no-lbr mode, ie `--spe "
-"--nl`"),
-cl::cat(AggregatorCategory));
+cl::opt<bool> ArmSPE("spe",
+ cl::desc("Enable Arm SPE mode. Can combine with `--nl` "
+  "to use in no-lbr mode"),
+ cl::cat(AggregatorCategory));
 
 static cl::opt
 ITraceAggregation("itrace",
@@ -180,13 +178,16 @@ void DataAggregator::start() {
 
   if (opts::ArmSPE) {
 if (!opts::BasicAggregation) {
-  errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with "
-"BasicAggregation.\n";
-  exit(1);
+  // pid    from_ip  to_ip  predicted?
+  // 12345  0x123/0x456/P/-/-/8/RET/-
+  launchPerfProcess("SPE branch events", MainEventsPPI,
+"script -F pid,brstack --itrace=bl",
+/*Wait = */ false);
+} else {
+  launchPerfProcess("SPE brstack events", MainEventsPPI,
+"script -F pid,event,ip,addr --itrace=i1i",
+/*Wait = */ false);
 }
-launchPerfProcess("branch events with SPE", MainEventsPPI,
-  "script -F pid,event,ip,addr --itrace=i1i",
-  /*Wait = */ false);
   } else if (opts::BasicAggregation) {
 launchPerfProcess("events without LBR", MainEventsPPI,
   "script -F pid,event,ip",
@@ -527,8 +528,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
 }
 exit(0);
   }
-
-  if (((!opts::BasicAggregation && !opts::ArmSPE) && parseBranchEvents()) ||
+  if ((!opts::BasicAggregation && parseBranchEvents()) ||
   (opts::BasicAggregation && opts::ArmSPE && parseSpeAsBasicEvents()) ||
   (opts::BasicAggregation && parseBasicEvents()))
 errs() << "PERF2BOLT: failed to parse samples\n";
@@ -1034,7 +1034,11 @@ ErrorOr DataAggregator::parseLBREntry() {
   if (std::error_code EC = MispredStrRes.getError())
 return EC;
   StringRef MispredStr = MispredStrRes.get();
-  if (MispredStr.size() != 1 ||
+  // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'.
+  bool ProperStrSize = (MispredStr.size() == 2 && opts::ArmSPE)
+   ? (MispredStr[1] == 'N')
+   : (MispredStr.size() == 1);
+  if (!ProperStrSize ||
   (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-')) {
 reportError("expected single char for mispred bit");
 Diag << "Found: " << MispredStr << "\n";
@@ -1565,9 +1569,11 @@ uint64_t DataAggregator::parseLBRSample(const PerfBranchSample &Sample,
 }
 
 std::error_code DataAggregator::parseBranchEvents() {
-  outs() << "PERF2BOLT

[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin ready_for_review 
https://github.com/llvm/llvm-project/pull/135181

[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)

2025-04-10 Thread Fabian Ritter via llvm-branch-commits


https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/131862

From 4b88628633b065f3d8cc24d4f3bd4e3274fcc75a Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Mon, 17 Mar 2025 06:51:16 -0400
Subject: [PATCH] [SDAG] Introduce inbounds flag for pointer arithmetic

This patch introduces an inbounds SDNodeFlag, to show that a pointer
addition SDNode implements an inbounds getelementptr operation (i.e.,
the pointer operand is in bounds wrt. the allocated object it is based
on, and the arithmetic does not change that). The flag is set in the DAG
construction when lowering inbounds GEPs.

Inbounds information is useful in the ISel when selecting memory
instructions that perform address computations whose intermediate steps
must be in the same memory region as the final result. A follow-up patch
will start using it for AMDGPU's flat memory instructions, where the
immediate offset must not affect the memory aperture of the address.
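
To make the idea concrete, here is a heavily simplified, standalone sketch of a flag set carrying an inbounds bit next to a no-unsigned-wrap bit. It is not LLVM's `SDNodeFlags`: `PtrAddFlags`, `flagsForGEP`, `withoutPoisonGeneratingFlags` and `raw` are hypothetical names, and only the bit positions of `SameSign` and `InBounds` are taken from the patch (the position of `NoUnsignedWrap` is an assumption).

```cpp
#include <cstdint>

// Simplified stand-in for a node-flag bitmask (hypothetical, not SDNodeFlags).
class PtrAddFlags {
public:
  enum : uint32_t {
    None = 0,
    NoUnsignedWrap = 1u << 0, // assumed bit position
    SameSign = 1u << 14,
    InBounds = 1u << 15,
    // Flags whose violation yields poison; InBounds is one of them.
    PoisonGeneratingFlags = NoUnsignedWrap | SameSign | InBounds,
  };

  void setNoUnsignedWrap(bool B) { setFlag(NoUnsignedWrap, B); }
  void setInBounds(bool B) { setFlag(InBounds, B); }

  bool hasNoUnsignedWrap() const { return (Bits & NoUnsignedWrap) != 0; }
  bool hasInBounds() const { return (Bits & InBounds) != 0; }

  // Returns a copy with every poison-generating flag cleared.
  PtrAddFlags withoutPoisonGeneratingFlags() const {
    PtrAddFlags Copy = *this;
    Copy.Bits &= ~static_cast<uint32_t>(PoisonGeneratingFlags);
    return Copy;
  }

  uint32_t raw() const { return Bits; }

private:
  void setFlag(uint32_t Mask, bool B) {
    Bits = B ? (Bits | Mask) : (Bits & ~Mask);
  }
  uint32_t Bits = None;
};

// Models what happens when a GEP is lowered to a pointer addition: the
// wrap and inbounds properties of the GEP are copied onto the add's flags.
inline PtrAddFlags flagsForGEP(bool HasNoUnsignedWrap, bool IsInBounds) {
  PtrAddFlags Flags;
  Flags.setNoUnsignedWrap(HasNoUnsignedWrap);
  Flags.setInBounds(IsInBounds);
  return Flags;
}
```

Treating the inbounds bit as poison-generating means a transformation that cannot prove the property still holds has to drop it, which is what `withoutPoisonGeneratingFlags` models here.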

A similar patch for gMIR and GlobalISel will follow.

For SWDEV-516125.
---
 llvm/include/llvm/CodeGen/SelectionDAGNodes.h| 9 +++--
 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp| 3 +++
 llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp | 3 +++
 .../CodeGen/X86/merge-store-partially-alias-loads.ll | 2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
index 2283f99202e2f..13ac65f5d731c 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
@@ -415,12 +415,15 @@ struct SDNodeFlags {
 Unpredictable = 1 << 13,
 // Compare instructions which may carry the samesign flag.
 SameSign = 1 << 14,
+// Pointer arithmetic instructions that remain in bounds, e.g., implementing
+// an inbounds GEP.
+InBounds = 1 << 15,
 
 // NOTE: Please update LargestValue in LLVM_DECLARE_ENUM_AS_BITMASK below
 // the class definition when adding new flags.
 
 PoisonGeneratingFlags = NoUnsignedWrap | NoSignedWrap | Exact | Disjoint |
-NonNeg | NoNaNs | NoInfs | SameSign,
+NonNeg | NoNaNs | NoInfs | SameSign | InBounds,
 FastMathFlags = NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal |
 AllowContract | ApproximateFuncs | AllowReassociation,
   };
@@ -455,6 +458,7 @@ struct SDNodeFlags {
   void setAllowReassociation(bool b) { setFlag<AllowReassociation>(b); }
   void setNoFPExcept(bool b) { setFlag<NoFPExcept>(b); }
   void setUnpredictable(bool b) { setFlag<Unpredictable>(b); }
+  void setInBounds(bool b) { setFlag<InBounds>(b); }
 
   // These are accessors for each flag.
   bool hasNoUnsignedWrap() const { return Flags & NoUnsignedWrap; }
@@ -472,6 +476,7 @@ struct SDNodeFlags {
   bool hasAllowReassociation() const { return Flags & AllowReassociation; }
   bool hasNoFPExcept() const { return Flags & NoFPExcept; }
   bool hasUnpredictable() const { return Flags & Unpredictable; }
+  bool hasInBounds() const { return Flags & InBounds; }
 
   bool operator==(const SDNodeFlags &Other) const {
 return Flags == Other.Flags;
@@ -481,7 +486,7 @@ struct SDNodeFlags {
 };
 
 LLVM_DECLARE_ENUM_AS_BITMASK(decltype(SDNodeFlags::None),
- SDNodeFlags::SameSign);
+ SDNodeFlags::InBounds);
 
 inline SDNodeFlags operator|(SDNodeFlags LHS, SDNodeFlags RHS) {
   LHS |= RHS;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 89793c30f3710..32973be608937 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4283,6 +4283,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 if (NW.hasNoUnsignedWrap() ||
 (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap()))
   Flags |= SDNodeFlags::NoUnsignedWrap;
+Flags.setInBounds(NW.isInBounds());
 
 N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N,
 DAG.getConstant(Offset, dl, N.getValueType()), Flags);
@@ -4326,6 +4327,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 if (NW.hasNoUnsignedWrap() ||
 (Offs.isNonNegative() && NW.hasNoUnsignedSignedWrap()))
   Flags.setNoUnsignedWrap(true);
+Flags.setInBounds(NW.isInBounds());
 
 OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType());
 
@@ -4388,6 +4390,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
   // pointer index type (add nuw).
   SDNodeFlags AddFlags;
   AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap());
+  AddFlags.setInBounds(NW.isInBounds());
 
   N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags);
 }
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
ind

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)

2025-04-10 Thread Fabian Ritter via llvm-branch-commits


https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/132353

From b3a2dc9d2642a79cc3251db2623464075f206e12 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 21 Mar 2025 03:33:02 -0400
Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds

For flat memory instructions where the address is supplied as a base address
register with an immediate offset, the memory aperture test ignores the
immediate offset. Currently, ISel does not respect that, which leads to
miscompilations where valid input programs crash when the address computation
relies on the immediate offset to get the base address in the proper memory
aperture. Global or scratch instructions are not affected.

This patch only selects flat instructions with immediate offsets from address
computations with the inbounds flag: If the address computation does not leave
the bounds of the allocated object, it cannot leave the bounds of the memory
aperture and is therefore safe to handle with an immediate offset.
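
The selection rule in the paragraph above can be sketched as a small standalone model. This is not the AMDGPU ISel code: `AddressComputation`, `FlatOperands` and `selectFlatOperands` are hypothetical names, concrete integer values stand in for register contents, and the check that the offset fits the instruction's immediate field is deliberately left out.

```cpp
#include <cstdint>

// Hypothetical model of "base + constant offset" as seen by the selector.
struct AddressComputation {
  uint64_t Base;   // value of the base address
  int64_t Offset;  // constant offset added to the base
  bool InBounds;   // does the addition carry the inbounds flag?
};

// Hypothetical model of the operands of a flat memory instruction.
struct FlatOperands {
  uint64_t AddressRegister; // value the aperture test looks at
  int64_t ImmediateOffset;  // ignored by the aperture test
};

inline FlatOperands selectFlatOperands(const AddressComputation &A) {
  if (A.InBounds) {
    // Base and result lie in the same object, hence in the same aperture,
    // so the offset may be folded into the immediate field.
    return {A.Base, A.Offset};
  }
  // Without inbounds the base alone may be in a different aperture than the
  // final address: materialize the full sum and use a zero immediate.
  return {A.Base + static_cast<uint64_t>(A.Offset), 0};
}
```

In the non-inbounds case the aperture test then sees the complete address, which is the behavior the patch description asks for.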

It also adds the inbounds flag to DAG nodes resulting from transformations:
- Address computations resulting from getObjectPtrOffset. As far as I can tell,
  this function is only used to compute addresses within accessed memory ranges,
  e.g., for loads and stores that are split during legalization.
- Reassociated inbounds adds. If both involved operations are inbounds, then so
  are operations after the transformation.
- Address computations in the SelectionDAG lowering of the memcpy/move/set
  intrinsics. Base and result of the address arithmetic there are accessed, so
  the operation must be inbounds.

It might make sense to separate these changes into their own PR, but I don't
see a way to test them without adding a use of the inbounds SDAG flag.

Affected tests:
- CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded,
  added new positive tests where we still do fold them.
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't
  seem integral to this test, so the test is not changed to make offset folding
  still happen.
- CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base addresses
  on the potentially OOB addresses used for prefetching for memory accesses,
  that might be a separate issue to look into.
- Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that
  offsets in the memset DAG lowering are still folded properly.

A similar patch for GlobalISel will follow.

Fixes SWDEV-516125.
---
 llvm/include/llvm/CodeGen/SelectionDAG.h  |  12 +-
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |   9 +-
 .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp |  12 +-
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 ---
 llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll   | 374 +-
 .../test/CodeGen/AMDGPU/loop-prefetch-data.ll |  17 +-
 .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++
 .../InferAddressSpaces/AMDGPU/flat_atomic.ll  |   6 +-
 8 files changed, 717 insertions(+), 94 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h
index 15a2370e5d8b8..aa3668d3e9aae 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1069,7 +1069,8 @@ class SelectionDAG {
  SDValue EVL);
 
   /// Returns sum of the base pointer and offset.
-  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default.
+  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds by
+  /// default.
   SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL,
const SDNodeFlags Flags = SDNodeFlags());
   SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL,
@@ -1077,15 +1078,18 @@ class SelectionDAG {
 
   /// Create an add instruction with appropriate flags when used for
   /// addressing some offset of an object. i.e. if a load is split into multiple
-  /// components, create an add nuw from the base pointer to the offset.
+  /// components, create an add nuw inbounds from the base pointer to the
+  /// offset.
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) {
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) {
 // The object itself can't wrap around the address space, so it shouldn't be
 // possible for the adds of the offsets to the split parts to overflow.
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   /// Return a new CALLSEQ_START node, that starts new call fram

[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin edited 
https://github.com/llvm/llvm-project/pull/135181

[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)

2025-04-10 Thread Fangrui Song via llvm-branch-commits


https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/135123

[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Erich Keane via llvm-branch-commits



@@ -1357,6 +1357,8 @@ void TextNodeDumper::VisitTemplateExpansionTemplateArgument(
 void TextNodeDumper::VisitExpressionTemplateArgument(
 const TemplateArgument &TA) {
   OS << " expr";
+  if (TA.isCanonicalExpr())
+OS << " canon";

erichkeane wrote:

Same hope here on the full word.

https://github.com/llvm/llvm-project/pull/135133

[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Erich Keane via llvm-branch-commits



@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDeclRefExpr(DeclRefExpr *Node) {
 Qualifier->print(OS, Policy);
   if (Node->hasTemplateKeyword())
 OS << "template ";
+
+  bool ForceAnonymous =
+  Policy.PrintAsCanonical && VD->getKind() == Decl::NonTypeTemplateParm;

erichkeane wrote:

Can you explain what is going on here?  This is a little subtle.

https://github.com/llvm/llvm-project/pull/135133

[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Erich Keane via llvm-branch-commits



@@ -1724,6 +1724,8 @@ void JSONNodeDumper::VisitTemplateExpansionTemplateArgument(
 void JSONNodeDumper::VisitExpressionTemplateArgument(
 const TemplateArgument &TA) {
   JOS.attribute("isExpr", true);
+  if (TA.isCanonicalExpr())
+JOS.attribute("isCanon", true);

erichkeane wrote:

Any reason to not just do `isCanonical` instead?  `Canon` is already a word and 
sounds nonsensical.

https://github.com/llvm/llvm-project/pull/135133

[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)

2025-04-10 Thread Orlando Cazalet-Hyams via llvm-branch-commits

OCHyams wrote:

> Possibly part of the design here is to simply not care, if it's only about 
> internal consistency within a Function (does that hold after inlining too). 
> Apologies if this is all explained in a later patch.

It is indeed the goal not to care; an instruction is only considered to be from 
the same source atom as another instruction (implied: in the same function) if 
they've got the same `atomGroup` and `inlinedAt` fields. (we only ever examine 
the groups within the context of a function, i.e., we don't ever try to ask 
"are these instructions in different functions from the same source atom").
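
A minimal sketch of that comparison, for two instructions already known to be in the same function, might look as follows. The names `AtomLocation` and `sameSourceAtom` are hypothetical, plain integers stand in for the `inlinedAt` metadata node, and treating group 0 as "not part of any atom" is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical reduced view of the two location fields that matter here.
struct AtomLocation {
  uint64_t AtomGroup; // 0 is assumed to mean "not part of any atom"
  uint64_t InlinedAt; // identity of the inlined-at location, 0 if none
};

// Both arguments are assumed to belong to the same function; the question
// is never asked across functions.
inline bool sameSourceAtom(const AtomLocation &A, const AtomLocation &B) {
  return A.AtomGroup != 0 && A.AtomGroup == B.AtomGroup &&
         A.InlinedAt == B.InlinedAt;
}
```

Because the inlined-at identity is part of the key, two inlined copies of the same callee can reuse a group number without being confused with each other.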

> The answers to that should ultimately be documented somewhere; I imagine 
> that's in the patch stack or coming later.

Your imagination gives me too much credit. It's documented in some comments 
scattered through the stack, but there's not yet a documentation patch.

https://github.com/llvm/llvm-project/pull/133478

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread via llvm-branch-commits


github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff 47a986de762147c4f27a20ff9b1d75f9f5a50bdc aec7a556fed56c72184963d21d6893e586d6a7e2 --extensions cpp -- bolt/lib/Profile/DataAggregator.cpp bolt/unittests/Profile/PerfSpeEvents.cpp
```





View the diff from clang-format here.


```diff
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 4273eda865..bcb3b2c8ef 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -1035,19 +1035,20 @@ ErrorOr DataAggregator::parseLBREntry() {
 return EC;
   StringRef MispredStr = MispredStrRes.get();
   // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'.
-  bool ValidStrSize = opts::ArmSPE ?
-MispredStr.size() >= 1 && MispredStr.size() <= 2 : MispredStr.size() == 1;
+  bool ValidStrSize = opts::ArmSPE
+  ? MispredStr.size() >= 1 && MispredStr.size() <= 2
+  : MispredStr.size() == 1;
   bool SpeTakenBitErr =
- (opts::ArmSPE && MispredStr.size() == 2 && MispredStr[1] != 'N');
+  (opts::ArmSPE && MispredStr.size() == 2 && MispredStr[1] != 'N');
   bool PredictionBitErr =
- !ValidStrSize ||
- (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-');
+  !ValidStrSize ||
+  (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-');
   if (SpeTakenBitErr)
 reportError("expected 'N' as SPE prediction bit for a not-taken branch");
   if (PredictionBitErr)
 reportError("expected 'P', 'M' or '-' char as a prediction bit");
 
- if (SpeTakenBitErr || PredictionBitErr) {
+  if (SpeTakenBitErr || PredictionBitErr) {
 Diag << "Found: " << MispredStr << "\n";
 return make_error_code(llvm::errc::io_error);
   }

```




https://github.com/llvm/llvm-project/pull/129231

[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Matheus Izvekov via llvm-branch-commits



@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDeclRefExpr(DeclRefExpr *Node) {
 Qualifier->print(OS, Policy);
   if (Node->hasTemplateKeyword())
 OS << "template ";
+
+  bool ForceAnonymous =
+  Policy.PrintAsCanonical && VD->getKind() == Decl::NonTypeTemplateParm;

mizvekov wrote:

Yeah, canonicalization of expressions should erase the identity of any NTTPs 
referenced therein, which should make them print as 'value-parameter-X-X', as 
if the NTTP was anonymous, and similarly to how it happens with regards to 
types and type template parameters.
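
A standalone sketch of that printing rule is shown below. It does not use Clang's API: `NonTypeParam` and `printParamRef` are hypothetical names, and reading the two `X` placeholders as template depth and parameter index is an assumption made by analogy with how type template parameters are numbered.

```cpp
#include <string>

// Hypothetical reduced description of a non-type template parameter.
struct NonTypeParam {
  std::string Name; // empty if the parameter is anonymous
  unsigned Depth;   // template nesting depth (assumed meaning of first X)
  unsigned Index;   // position in the parameter list (assumed second X)
};

// When printing canonically, the declared name is ignored so that two
// equivalent expressions print identically regardless of spelling.
inline std::string printParamRef(const NonTypeParam &P,
                                 bool PrintAsCanonical) {
  if (PrintAsCanonical || P.Name.empty())
    return "value-parameter-" + std::to_string(P.Depth) + "-" +
           std::to_string(P.Index);
  return P.Name;
}
```

In this model a parameter named `N` at depth 0, index 1 prints as `N` normally and as `value-parameter-0-1` under canonical printing, the same text an anonymous parameter in that position would get.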

https://github.com/llvm/llvm-project/pull/135133

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)

2025-04-10 Thread Kristof Beyls via llvm-branch-commits


https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/133461

[llvm-branch-commits] [clang] [clang-tools-extra] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Matheus Izvekov via llvm-branch-commits


https://github.com/mizvekov updated 
https://github.com/llvm/llvm-project/pull/135133

From e8ab5ff779bc00ff6a239f0acea8182c69cb7bcc Mon Sep 17 00:00:00 2001
From: Matheus Izvekov 
Date: Thu, 10 Apr 2025 02:52:36 -0300
Subject: [PATCH] [clang] implement printing of canonical template arguments of
 expression kind

This patch extends the canonicalization printing policy to cover expressions
and template names, and wires that up to the template argument printer,
covering expressions.

This is helpful for debugging, or if these template arguments somehow end up
in diagnostics, as without this patch they can print as completely unrelated
expressions, which can be quite confusing.

This is because expressions are not uniqued, unlike types, and
when a template specialization containing an expression is the first to be
canonicalized, the expression ends up appearing in the canonical type of
subsequent equivalent specializations.

Fixes https://github.com/llvm/llvm-project/issues/92292
---
 .../StaticAccessedThroughInstanceCheck.cpp|2 +-
 .../clang-tidy/utils/Matchers.cpp |2 +-
 clang/include/clang/AST/PrettyPrinter.h   |6 +-
 clang/lib/AST/DeclPrinter.cpp |4 +-
 clang/lib/AST/JSONNodeDumper.cpp  |2 +
 clang/lib/AST/StmtPrinter.cpp |6 +-
 clang/lib/AST/TemplateBase.cpp|7 +-
 clang/lib/AST/TemplateName.cpp|   10 +-
 clang/lib/AST/TextNodeDumper.cpp  |2 +
 clang/lib/AST/TypePrinter.cpp |9 +-
 clang/lib/CodeGen/CGDebugInfo.cpp |2 +-
 clang/lib/Sema/SemaTemplate.cpp   |2 +-
 clang/test/AST/ast-dump-templates.cpp | 1022 +
 clang/unittests/AST/TypePrinterTest.cpp   |2 +-
 14 files changed, 1058 insertions(+), 20 deletions(-)

diff --git 
a/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
 
b/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
index 08adc7134cfea..fffb136e5a332 100644
--- 
a/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
+++ 
b/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
@@ -69,7 +69,7 @@ void StaticAccessedThroughInstanceCheck::check(
   PrintingPolicyWithSuppressedTag.SuppressTagKeyword = true;
   PrintingPolicyWithSuppressedTag.SuppressUnwrittenScope = true;
 
-  PrintingPolicyWithSuppressedTag.PrintCanonicalTypes =
+  PrintingPolicyWithSuppressedTag.PrintAsCanonical =
   !BaseExpr->getType()->isTypedefNameType();
 
   std::string BaseTypeName =
diff --git a/clang-tools-extra/clang-tidy/utils/Matchers.cpp 
b/clang-tools-extra/clang-tidy/utils/Matchers.cpp
index 7e89cae1c3316..0721667fd0c41 100644
--- a/clang-tools-extra/clang-tidy/utils/Matchers.cpp
+++ b/clang-tools-extra/clang-tidy/utils/Matchers.cpp
@@ -32,7 +32,7 @@ bool MatchesAnyListedTypeNameMatcher::matches(
 
   PrintingPolicy PrintingPolicyWithSuppressedTag(
   Finder->getASTContext().getLangOpts());
-  PrintingPolicyWithSuppressedTag.PrintCanonicalTypes = true;
+  PrintingPolicyWithSuppressedTag.PrintAsCanonical = true;
   PrintingPolicyWithSuppressedTag.SuppressElaboration = true;
   PrintingPolicyWithSuppressedTag.SuppressScope = false;
   PrintingPolicyWithSuppressedTag.SuppressTagKeyword = true;
diff --git a/clang/include/clang/AST/PrettyPrinter.h 
b/clang/include/clang/AST/PrettyPrinter.h
index 91818776b770c..5a98ae1987b16 100644
--- a/clang/include/clang/AST/PrettyPrinter.h
+++ b/clang/include/clang/AST/PrettyPrinter.h
@@ -76,7 +76,7 @@ struct PrintingPolicy {
 MSWChar(LO.MicrosoftExt && !LO.WChar), IncludeNewlines(true),
 MSVCFormatting(false), ConstantsAsWritten(false),
 SuppressImplicitBase(false), FullyQualifiedName(false),
-PrintCanonicalTypes(false), PrintInjectedClassNameWithArguments(true),
+PrintAsCanonical(false), PrintInjectedClassNameWithArguments(true),
 UsePreferredNames(true), AlwaysIncludeTypeForTemplateArgument(false),
 CleanUglifiedParameters(false), EntireContentsOfLargeArray(true),
 UseEnumerators(true), UseHLSLTypes(LO.HLSL) {}
@@ -310,9 +310,9 @@ struct PrintingPolicy {
   LLVM_PREFERRED_TYPE(bool)
   unsigned FullyQualifiedName : 1;
 
-  /// Whether to print types as written or canonically.
+  /// Whether to print entities as written or canonically.
   LLVM_PREFERRED_TYPE(bool)
-  unsigned PrintCanonicalTypes : 1;
+  unsigned PrintAsCanonical : 1;
 
   /// Whether to print an InjectedClassNameType with template arguments or as
   /// written. When a template argument is unnamed, printing it results in
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 28098b242d494..22da5bf251ecd 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -735,7 +735,7 @@ void DeclPrinter::VisitFunctionDecl(FunctionDecl *D) {
 llvm::raw_string_ostream POut

[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin created 
https://github.com/llvm/llvm-project/pull/135180

None

>From 367db01dcf1d8f6305e86e624306f4aefc0b1f95 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin 
Date: Thu, 10 Apr 2025 11:56:57 +
Subject: [PATCH] ssaupdaterbulk_add_phi_optimization

---
 .../llvm/Transforms/Utils/SSAUpdaterBulk.h|  5 +-
 llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp  | 38 ++-
 .../Transforms/Utils/SSAUpdaterBulkTest.cpp   | 67 +++
 3 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h 
b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
index b2cf29608f58b..2fb241b0d8e26 100644
--- a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
+++ b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
@@ -13,7 +13,6 @@
 #ifndef LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 #define LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 
-#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/PredIteratorCache.h"
 
@@ -77,6 +76,10 @@ class SSAUpdaterBulk {
   /// vector.
   void RewriteAllUses(DominatorTree *DT,
   SmallVectorImpl<PHINode *> *InsertedPHIs = nullptr);
+
+  /// Rewrite all uses and simplify the inserted PHI nodes.
+  /// Use this method to preserve behavior when replacing SSAUpdater.
+  void RewriteAndOptimizeAllUses(DominatorTree *DT);
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp 
b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
index d7bf791a23edf..437fd0c1dca91 100644
--- a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
+++ b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
@@ -11,13 +11,14 @@
 
//===--===//
 
 #include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
+#include "llvm/Analysis/InstructionSimplify.h"
 #include "llvm/Analysis/IteratedDominanceFrontier.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/Instructions.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Transforms/Utils/Local.h"
 
 using namespace llvm;
 
@@ -222,3 +223,38 @@ void SSAUpdaterBulk::RewriteAllUses(DominatorTree *DT,
 }
   }
 }
+
+// Perform a single pass of simplification over the worklist of PHIs.
+static void SimplifyPass(MutableArrayRef<PHINode *> Worklist) {
+  if (Worklist.empty())
+return;
+
+  const DataLayout &DL = Worklist.front()->getParent()->getDataLayout();
+  for (PHINode *&PHI : Worklist) {
+if (Value *Simplified = simplifyInstruction(PHI, DL)) {
+  PHI->replaceAllUsesWith(Simplified);
+  PHI->eraseFromParent();
+  PHI = nullptr; // Mark as removed.
+}
+  }
+}
+
+static void DeduplicatePass(ArrayRef<PHINode *> Worklist) {
+  SmallDenseMap BBs;
+  for (PHINode *PHI : Worklist) {
+if (PHI)
+  ++BBs[PHI->getParent()];
+  }
+
+  for (auto [BB, NumNewPHIs] : BBs) {
+auto FirstExistedPN = std::next(BB->phis().begin(), NumNewPHIs);
+EliminateNewDuplicatePHINodes(BB, FirstExistedPN);
+  }
+}
+
+void SSAUpdaterBulk::RewriteAndOptimizeAllUses(DominatorTree *DT) {
+  SmallVector PHIs;
+  RewriteAllUses(DT, &PHIs);
+  SimplifyPass(PHIs);
+  DeduplicatePass(PHIs);
+}
\ No newline at end of file
diff --git a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp 
b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
index 841f44cf6bfed..6f2e63dcd9f90 100644
--- a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
+++ b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
@@ -308,3 +308,70 @@ TEST(SSAUpdaterBulk, TwoBBLoop) {
   EXPECT_EQ(Phi->getIncomingValueForBlock(Entry), ConstantInt::get(I32Ty, 0));
   EXPECT_EQ(Phi->getIncomingValueForBlock(Loop), I);
 }
+
+TEST(SSAUpdaterBulk, SimplifyPHIs) {
+  const char *IR = R"(
+  define void @main(i32 %val, i1 %cond) {
+  entry:
+  br i1 %cond, label %left, label %right
+  left:
+  %add = add i32 %val, 1
+  br label %exit
+  right:
+  %sub = sub i32 %val, 1
+  br label %exit
+  exit:
+  %phi = phi i32 [ %sub, %right ], [ %add, %left ]
+  %cmp = icmp slt i32 0, 42
+  ret void
+  }
+  )";
+
+  llvm::LLVMContext Context;
+  llvm::SMDiagnostic Err;
+  std::unique_ptr<llvm::Module> M = llvm::parseAssemblyString(IR, Err, 
Context);
+  ASSERT_NE(M, nullptr) << "Failed to parse IR: " << Err.getMessage();
+
+  Function *F = M->getFunction("main");
+  auto *Entry = &F->getEntryBlock();
+  auto *Left = Entry->getTerminator()->getSuccessor(0);
+  auto *Right = Entry->getTerminator()->getSuccessor(1);
+  auto *Exit = Left->getSingleSuccessor();
+  auto *Val = &*F->arg_begin();
+  auto *Phi = &Exit->front();
+  auto *Cmp = &*std::next(Exit->begin());
+  auto *Add = &Left->front();
+  auto *Sub = &Right->front();
+
+  SSAUpdaterBulk Updater;
+  Type *I32Ty = Type::getInt32Ty(Context);
+
+  // Use %val directly instead of creating a phi.
+  unsigned ValVar = Updater.AddVariabl

[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


vpykhtin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135180).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135181**
* **#135180** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135180)
* **#135179**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev). 
Learn more about stacking: https://stacking.dev/


https://github.com/llvm/llvm-project/pull/135180
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits



@@ -88,6 +89,45 @@ struct PerfSpeEventsTestHelper : public testing::Test {
 
 return SampleSize == DA.BasicSamples.size();
   }
+
+  /// Compare LBREntries
+  bool checkLBREntry(const LBREntry &Lhs, const LBREntry &Rhs) {
+return Lhs.From == Rhs.From && Lhs.To == Rhs.To &&
+   Lhs.Mispred == Rhs.Mispred;
+  }
+
+  /// Parse and check SPE brstack as LBR
+  void parseAndCheckBrstackEvents(
+  uint64_t PID,
+  const std::vector> &ExpectedSamples) {
+int NumSamples = 0;
+
+DataAggregator DA("");
+DA.ParsingBuf = opts::ReadPerfEvents;
+DA.BC = BC.get();
+DataAggregator::MMapInfo MMap;
+DA.BinaryMMapInfo.insert(std::make_pair(PID, MMap));
+
+// Process buffer.
+while (DA.hasData()) {

kaadam wrote:

I kept the original approach, since I haven't found a good way to create a 
simple mock ELF binary that tests the BranchLBRs functionality properly. It 
may be better to add a new test that checks BranchLBRs in a different manner.

https://github.com/llvm/llvm-project/pull/129231
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] amdgpu_use_ssaupdaterbulk_in_structurizecfg (PR #135181)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


vpykhtin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/135181).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135181** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135181)
* **#135180**
* **#135179**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev). 
Learn more about stacking: https://stacking.dev/


https://github.com/llvm/llvm-project/pull/135181
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin edited 
https://github.com/llvm/llvm-project/pull/135180
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin edited 
https://github.com/llvm/llvm-project/pull/135180
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin edited 
https://github.com/llvm/llvm-project/pull/135181
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin ready_for_review 
https://github.com/llvm/llvm-project/pull/135180
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Valery Pykhtin (vpykhtin)


Changes

This is a replacement PR for https://github.com/llvm/llvm-project/pull/132004, 
stacked version.


---
Full diff: https://github.com/llvm/llvm-project/pull/135180.diff


3 Files Affected:

- (modified) llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h (+4-1) 
- (modified) llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp (+37-1) 
- (modified) llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp (+67) 


``diff
diff --git a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h 
b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
index b2cf29608f58b..2fb241b0d8e26 100644
--- a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
+++ b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
@@ -13,7 +13,6 @@
 #ifndef LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 #define LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 
-#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/PredIteratorCache.h"
 
@@ -77,6 +76,10 @@ class SSAUpdaterBulk {
   /// vector.
   void RewriteAllUses(DominatorTree *DT,
   SmallVectorImpl<PHINode *> *InsertedPHIs = nullptr);
+
+  /// Rewrite all uses and simplify the inserted PHI nodes.
+  /// Use this method to preserve behavior when replacing SSAUpdater.
+  void RewriteAndOptimizeAllUses(DominatorTree *DT);
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp 
b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
index d7bf791a23edf..437fd0c1dca91 100644
--- a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
+++ b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
@@ -11,13 +11,14 @@
 
//===--===//
 
 #include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
+#include "llvm/Analysis/InstructionSimplify.h"
 #include "llvm/Analysis/IteratedDominanceFrontier.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/Instructions.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Transforms/Utils/Local.h"
 
 using namespace llvm;
 
@@ -222,3 +223,38 @@ void SSAUpdaterBulk::RewriteAllUses(DominatorTree *DT,
 }
   }
 }
+
+// Perform a single pass of simplification over the worklist of PHIs.
+static void SimplifyPass(MutableArrayRef<PHINode *> Worklist) {
+  if (Worklist.empty())
+return;
+
+  const DataLayout &DL = Worklist.front()->getParent()->getDataLayout();
+  for (PHINode *&PHI : Worklist) {
+if (Value *Simplified = simplifyInstruction(PHI, DL)) {
+  PHI->replaceAllUsesWith(Simplified);
+  PHI->eraseFromParent();
+  PHI = nullptr; // Mark as removed.
+}
+  }
+}
+
+static void DeduplicatePass(ArrayRef<PHINode *> Worklist) {
+  SmallDenseMap BBs;
+  for (PHINode *PHI : Worklist) {
+if (PHI)
+  ++BBs[PHI->getParent()];
+  }
+
+  for (auto [BB, NumNewPHIs] : BBs) {
+auto FirstExistedPN = std::next(BB->phis().begin(), NumNewPHIs);
+EliminateNewDuplicatePHINodes(BB, FirstExistedPN);
+  }
+}
+
+void SSAUpdaterBulk::RewriteAndOptimizeAllUses(DominatorTree *DT) {
+  SmallVector PHIs;
+  RewriteAllUses(DT, &PHIs);
+  SimplifyPass(PHIs);
+  DeduplicatePass(PHIs);
+}
\ No newline at end of file
diff --git a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp 
b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
index 841f44cf6bfed..6f2e63dcd9f90 100644
--- a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
+++ b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
@@ -308,3 +308,70 @@ TEST(SSAUpdaterBulk, TwoBBLoop) {
   EXPECT_EQ(Phi->getIncomingValueForBlock(Entry), ConstantInt::get(I32Ty, 0));
   EXPECT_EQ(Phi->getIncomingValueForBlock(Loop), I);
 }
+
+TEST(SSAUpdaterBulk, SimplifyPHIs) {
+  const char *IR = R"(
+  define void @main(i32 %val, i1 %cond) {
+  entry:
+  br i1 %cond, label %left, label %right
+  left:
+  %add = add i32 %val, 1
+  br label %exit
+  right:
+  %sub = sub i32 %val, 1
+  br label %exit
+  exit:
+  %phi = phi i32 [ %sub, %right ], [ %add, %left ]
+  %cmp = icmp slt i32 0, 42
+  ret void
+  }
+  )";
+
+  llvm::LLVMContext Context;
+  llvm::SMDiagnostic Err;
+  std::unique_ptr<llvm::Module> M = llvm::parseAssemblyString(IR, Err, 
Context);
+  ASSERT_NE(M, nullptr) << "Failed to parse IR: " << Err.getMessage();
+
+  Function *F = M->getFunction("main");
+  auto *Entry = &F->getEntryBlock();
+  auto *Left = Entry->getTerminator()->getSuccessor(0);
+  auto *Right = Entry->getTerminator()->getSuccessor(1);
+  auto *Exit = Left->getSingleSuccessor();
+  auto *Val = &*F->arg_begin();
+  auto *Phi = &Exit->front();
+  auto *Cmp = &*std::next(Exit->begin());
+  auto *Add = &Left->front();
+  auto *Sub = &Right->front();
+
+  SSAUpdaterBulk Updater;
+  Type *I32Ty = Type::getInt32Ty(Context);
+
+  // Use %val directly instead of creating a phi.
+  unsigned ValVar = Updater.AddVariable("V

[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Valery Pykhtin (vpykhtin)


Changes

This is a replacement PR for https://github.com/llvm/llvm-project/pull/130611, 
stacked version.

---
Full diff: https://github.com/llvm/llvm-project/pull/135181.diff


1 Files Affected:

- (modified) llvm/lib/Transforms/Scalar/StructurizeCFG.cpp (+15-10) 


``diff
diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp 
b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 00c4fcc76e791..95c68ecd2255b 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/SSAUpdater.h"
+#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
 #include 
 #include 
 
@@ -317,7 +318,7 @@ class StructurizeCFG {
 
   void collectInfos();
 
-  void insertConditions(bool Loops);
+  void insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter);
 
   void simplifyConditions();
 
@@ -600,10 +601,9 @@ void StructurizeCFG::collectInfos() {
 }
 
 /// Insert the missing branch conditions
-void StructurizeCFG::insertConditions(bool Loops) {
+void StructurizeCFG::insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter) 
{
   BranchVector &Conds = Loops ? LoopConds : Conditions;
   Value *Default = Loops ? BoolTrue : BoolFalse;
-  SSAUpdater PhiInserter;
 
   for (BranchInst *Term : Conds) {
 assert(Term->isConditional());
@@ -619,22 +619,23 @@ void StructurizeCFG::insertConditions(bool Loops) {
   Term->setCondition(PI.Pred);
   CondBranchWeights::setMetadata(*Term, PI.Weights);
 } else {
-  PhiInserter.Initialize(Boolean, "");
-  PhiInserter.AddAvailableValue(Loops ? SuccFalse : Parent, Default);
+  unsigned Variable = PhiInserter.AddVariable("", Boolean);
+  PhiInserter.AddAvailableValue(Variable, Loops ? SuccFalse : Parent,
+Default);
 
   NearestCommonDominator Dominator(DT);
   Dominator.addBlock(Parent);
 
   for (auto [BB, PI] : Preds) {
 assert(BB != Parent);
-PhiInserter.AddAvailableValue(BB, PI.Pred);
+PhiInserter.AddAvailableValue(Variable, BB, PI.Pred);
 Dominator.addAndRememberBlock(BB);
   }
 
   if (!Dominator.resultIsRememberedBlock())
-PhiInserter.AddAvailableValue(Dominator.result(), Default);
+PhiInserter.AddAvailableValue(Variable, Dominator.result(), Default);
 
-  Term->setCondition(PhiInserter.GetValueInMiddleOfBlock(Parent));
+  PhiInserter.AddUse(Variable, &Term->getOperandUse(0));
 }
   }
 }
@@ -1318,8 +1319,12 @@ bool StructurizeCFG::run(Region *R, DominatorTree *DT) {
   orderNodes();
   collectInfos();
   createFlow();
-  insertConditions(false);
-  insertConditions(true);
+
+  SSAUpdaterBulk PhiInserter;
+  insertConditions(false, PhiInserter);
+  insertConditions(true, PhiInserter);
+  PhiInserter.RewriteAndOptimizeAllUses(DT);
+
   setPhiValues();
   simplifyConditions();
   simplifyAffectedPhis();

``




https://github.com/llvm/llvm-project/pull/135181
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Ádám Kallai via llvm-branch-commits



@@ -180,13 +178,16 @@ void DataAggregator::start() {
 
   if (opts::ArmSPE) {
 if (!opts::BasicAggregation) {
-  errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with "
-"BasicAggregation.\n";
-  exit(1);
+  // pid    from_ip  to_ip  predicted?
+  // 12345  0x123/0x456/P/-/-/8/RET/-

kaadam wrote:

Updated

https://github.com/llvm/llvm-project/pull/129231
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -87,6 +87,8 @@ class ValueMap {
   using ValueMapCVH = ValueMapCallbackVH;
   using MapT = DenseMap>;
   using MDMapT = DenseMap;
+  /// Map {(InlinedAt, old atom number) -> new atom number}.
+  using DMAtomT = DenseMap, uint64_t>;

jmorse wrote:

Consider using SmallDenseMap simply to reduce the initial allocations in the 
non-debug-info codepath?

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

```suggestion
  /// don't use this flag. It's used when remapping is known to be un-necessary
  /// to save some compile-time.
```

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -117,9 +118,21 @@ struct ClonedCodeInfo {
 /// If you would like to collect additional information about the cloned
 /// function, you can specify a ClonedCodeInfo object with the optional fifth
 /// parameter.
+///
+/// Set \p MapAtoms to false to skip mapping source atoms for later remapping.

jmorse wrote:

IMO "source-location atoms" to make it even clearer that this is a debugging 
feature.

Also IMO it's better to discuss when this flag is necessary instead of when 
it's not necessary, as it'll enlighten the reader what it's for. AFAIUI, 
something like "Must be true when you duplicate a code path and a source line 
is intended to appear twice in the generated instructions. Can be set to false 
if you are transplanting code from one place to another".
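
To make the duplicated-code-path case concrete, here is a toy sketch of the `{(InlinedAt, old atom number) -> new atom number}` mapping. It is an assumption-laden model and not the patch's implementation: `InlinedAtId`, `AtomRemapper`, `mapAtom`, and `remap` are hypothetical names, and a plain `std::map` stands in for the DenseMap.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical stand-in for an inlined-at location; only its identity matters.
using InlinedAtId = int;
// Map {(InlinedAt, old atom number) -> new atom number}.
using AtomMap = std::map<std::pair<InlinedAtId, uint64_t>, uint64_t>;

struct AtomRemapper {
  AtomMap Map;
  uint64_t NextAtom; // next unused atom number (assumed to be tracked by the caller)

  // While duplicating code: reserve a fresh atom number the first time a
  // given (InlinedAt, OldAtom) pair is seen.
  void mapAtom(InlinedAtId IA, uint64_t OldAtom) {
    auto Key = std::make_pair(IA, OldAtom);
    if (Map.find(Key) == Map.end())
      Map[Key] = NextAtom++;
  }

  // While remapping the copy: return the new atom number, or the original
  // one if nothing was recorded (the "transplanted code" case).
  uint64_t remap(InlinedAtId IA, uint64_t OldAtom) const {
    auto It = Map.find({IA, OldAtom});
    return It == Map.end() ? OldAtom : It->second;
  }
};
```

In this model, skipping `mapAtom` leaves the copy sharing atom numbers with the original, which is fine only when the original goes away.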

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

IMO suggesting that it's not related to correctness is misleading, because the 
presence/absence of the flag can lead to correctness issues. Better to just 
avoid saying that and say something else as suggested.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


https://github.com/jmorse edited 
https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self.  Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned

jmorse wrote:

IMO needs "source location atom" instead of just "atom" to ensure the random 
reader knows it's about debug-info.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits


https://github.com/jmorse approved this pull request.

LGTM, code and tests are good. As ever all my comments are about comments and 
maintainability!

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)

2025-04-10 Thread Jeremy Morse via llvm-branch-commits



@@ -284,6 +291,9 @@ inline void RemapInstruction(Instruction *I, 
ValueToValueMapTy &VM,
   .remapInstruction(*I);
 }
 
+/// Remap source atom. Called by RemapInstruction.

jmorse wrote:

IMO too terse; needs some purpose and context.

https://github.com/llvm/llvm-project/pull/133479
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] PHITransAddr: Avoid looking at constant use lists (PR #134689)

2025-04-10 Thread Nikita Popov via llvm-branch-commits


https://github.com/nikic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/134689
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits



@@ -1761,6 +1761,9 @@ void RelocationBaseSection::computeRels() {
 llvm::sort(nonRelative, irelative, [&](auto &a, auto &b) {
   return std::tie(a.r_sym, a.r_offset) < std::tie(b.r_sym, b.r_offset);
 });
+llvm::sort(irelative, relocs.end(), [&](auto &a, auto &b) {

smithp35 wrote:

Could be worth updating the comment on line 1753, which doesn't mention 
irelative after non-irelative. If the r_offset is not just for readability, it 
will be worth updating that too. 

https://github.com/llvm/llvm-project/pull/133531
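
The ordering discussed in this comment can be sketched in isolation. The following is a minimal, self-contained illustration and not lld's code: the `Rel` struct, the `Kind` enum, and `orderRelocs` are hypothetical names, and the sort keys for the leading relative group and the trailing IRELATIVE group (`r_offset` only) are assumptions, since the quoted hunk does not show them. Only the `std::tie(r_sym, r_offset)` key for the middle group is taken from the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified relocation record; not lld's DynamicReloc.
enum class Kind { Relative, Other, IRelative };

struct Rel {
  Kind kind;
  uint32_t r_sym;
  uint64_t r_offset;
};

// Reorders relocs into [relative | other | irelative] and sorts each group.
inline void orderRelocs(std::vector<Rel> &relocs) {
  // Move relative relocations to the front, keeping the rest behind them.
  auto nonRelative = std::stable_partition(
      relocs.begin(), relocs.end(),
      [](const Rel &r) { return r.kind == Kind::Relative; });
  // Within the rest, move IRELATIVE relocations to the very end.
  auto irelative = std::stable_partition(
      nonRelative, relocs.end(),
      [](const Rel &r) { return r.kind != Kind::IRelative; });

  // Assumption: relative relocations are ordered by offset only.
  std::sort(relocs.begin(), nonRelative, [](const Rel &a, const Rel &b) {
    return a.r_offset < b.r_offset;
  });
  // From the quoted diff: order by (r_sym, r_offset).
  std::sort(nonRelative, irelative, [](const Rel &a, const Rel &b) {
    return std::tie(a.r_sym, a.r_offset) < std::tie(b.r_sym, b.r_offset);
  });
  // Assumption: the trailing IRELATIVE group is ordered by offset only.
  std::sort(irelative, relocs.end(), [](const Rel &a, const Rel &b) {
    return a.r_offset < b.r_offset;
  });
}
```

With this layout, a comment describing the sort would need to mention three groups rather than two, which is the point raised above.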

[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits



@@ -1964,6 +1979,26 @@ void elf::postScanRelocations(Ctx &ctx) {
   for (ELFFileBase *file : ctx.objectFiles)
 for (Symbol *sym : file->getLocalSymbols())
   fn(*sym);
+
+  // Now that we have checked all ifunc symbols for demotion to regular function
+  // symbols, move IRELATIVE relocations to the right place:
+  // - Relocations for non-demoted ifuncs are added to .rela.dyn
+  // - Relocations for demoted ifuncs are turned into RELATIVE relocations
+  //   or static relocations in PDEs

smithp35 wrote:

Could you expand the acronym? I think this means Position Dependent Executable 
(PDE). It isn't used anywhere else in the codebase, and while derivable, it made me 
stop and think of alternatives.

https://github.com/llvm/llvm-project/pull/133531

[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits



@@ -42,6 +42,8 @@ void printTraceSymbol(const Symbol &sym, StringRef name);
 enum {
   NEEDS_GOT = 1 << 0,
   NEEDS_PLT = 1 << 1,
+  // True if this is an ifunc with a direct relocation that cannot be

smithp35 wrote:

Although not new, could be worth expanding on what a direct relocation is in 
the comment. Could be just `direct (non GOT or PLT generating) relocation ...`  

https://github.com/llvm/llvm-project/pull/133531

[llvm-branch-commits] [clang] [clang-tools-extra] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread Matheus Izvekov via llvm-branch-commits



@@ -1357,6 +1357,8 @@ void TextNodeDumper::VisitTemplateExpansionTemplateArgument(
 void TextNodeDumper::VisitExpressionTemplateArgument(
 const TemplateArgument &TA) {
   OS << " expr";
+  if (TA.isCanonicalExpr())
+OS << " canon";

mizvekov wrote:

Done

https://github.com/llvm/llvm-project/pull/135133

[llvm-branch-commits] [llvm] amdgpu_use_ssaupdaterbulk_in_structurizecfg (PR #135181)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin created 
https://github.com/llvm/llvm-project/pull/135181

None

>From c22138d41a4e1d81f3017478bfe9496fc80164f8 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin 
Date: Thu, 10 Apr 2025 11:58:13 +
Subject: [PATCH] amdgpu_use_ssaupdaterbulk_in_structurizecfg

---
 llvm/lib/Transforms/Scalar/StructurizeCFG.cpp | 25 +++
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 00c4fcc76e791..95c68ecd2255b 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/SSAUpdater.h"
+#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
 #include 
 #include 
 
@@ -317,7 +318,7 @@ class StructurizeCFG {
 
   void collectInfos();
 
-  void insertConditions(bool Loops);
+  void insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter);
 
   void simplifyConditions();
 
@@ -600,10 +601,9 @@ void StructurizeCFG::collectInfos() {
 }
 
 /// Insert the missing branch conditions
-void StructurizeCFG::insertConditions(bool Loops) {
+void StructurizeCFG::insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter) {
   BranchVector &Conds = Loops ? LoopConds : Conditions;
   Value *Default = Loops ? BoolTrue : BoolFalse;
-  SSAUpdater PhiInserter;
 
   for (BranchInst *Term : Conds) {
 assert(Term->isConditional());
@@ -619,22 +619,23 @@ void StructurizeCFG::insertConditions(bool Loops) {
   Term->setCondition(PI.Pred);
   CondBranchWeights::setMetadata(*Term, PI.Weights);
 } else {
-  PhiInserter.Initialize(Boolean, "");
-  PhiInserter.AddAvailableValue(Loops ? SuccFalse : Parent, Default);
+  unsigned Variable = PhiInserter.AddVariable("", Boolean);
+  PhiInserter.AddAvailableValue(Variable, Loops ? SuccFalse : Parent,
+Default);
 
   NearestCommonDominator Dominator(DT);
   Dominator.addBlock(Parent);
 
   for (auto [BB, PI] : Preds) {
 assert(BB != Parent);
-PhiInserter.AddAvailableValue(BB, PI.Pred);
+PhiInserter.AddAvailableValue(Variable, BB, PI.Pred);
 Dominator.addAndRememberBlock(BB);
   }
 
   if (!Dominator.resultIsRememberedBlock())
-PhiInserter.AddAvailableValue(Dominator.result(), Default);
+PhiInserter.AddAvailableValue(Variable, Dominator.result(), Default);
 
-  Term->setCondition(PhiInserter.GetValueInMiddleOfBlock(Parent));
+  PhiInserter.AddUse(Variable, &Term->getOperandUse(0));
 }
   }
 }
@@ -1318,8 +1319,12 @@ bool StructurizeCFG::run(Region *R, DominatorTree *DT) {
   orderNodes();
   collectInfos();
   createFlow();
-  insertConditions(false);
-  insertConditions(true);
+
+  SSAUpdaterBulk PhiInserter;
+  insertConditions(false, PhiInserter);
+  insertConditions(true, PhiInserter);
+  PhiInserter.RewriteAndOptimizeAllUses(DT);
+
   setPhiValues();
   simplifyConditions();
   simplifyAffectedPhis();


[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits


https://github.com/smithp35 edited 
https://github.com/llvm/llvm-project/pull/133531

[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-lld-elf

Author: Peter Collingbourne (pcc)


Changes

This lock is unnecessary because we can add the relocations to
shards and let them be sorted later.


---
Full diff: https://github.com/llvm/llvm-project/pull/135123.diff


1 Files Affected:

- (modified) lld/ELF/Relocations.cpp (+2-3) 


```diff
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 81de664fd1c23..277acb26987bc 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -847,9 +847,8 @@ static void addRelativeReloc(Ctx &ctx, InputSectionBase &isec,
   Partition &part = isec.getPartition(ctx);
 
   if (sym.isTagged()) {
-std::lock_guard lock(ctx.relocMutex);
-part.relaDyn->addRelativeReloc(ctx.target->relativeRel, isec, offsetInSec,
-   sym, addend, type, expr);
+part.relaDyn->addRelativeReloc(ctx.target->relativeRel, isec,
+  offsetInSec, sym, addend, type, expr);
 // With MTE globals, we always want to derive the address tag by `ldg`-ing
 // the symbol. When we have a RELATIVE relocation though, we no longer have
 // a reference to the symbol. Because of this, when we have an addend that

```




https://github.com/llvm/llvm-project/pull/135123
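
The idea in the summary above (append to shards, sort later) can be illustrated without any lld types. This is a rough sketch under stated assumptions: `Reloc` and `mergeShards` are hypothetical names, real threads are left out, and each inner vector stands in for the shard that one worker would own exclusively. Because no two workers write to the same vector, no mutex is needed while appending, and the final sort makes the result independent of the order in which workers ran.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical relocation record, reduced to the field we sort on.
struct Reloc {
  uint64_t offset;
};

// Concatenates the per-worker shards and sorts the result once.
// Each shard is assumed to have been filled by exactly one worker.
inline std::vector<Reloc>
mergeShards(const std::vector<std::vector<Reloc>> &shards) {
  std::vector<Reloc> merged;
  for (const std::vector<Reloc> &shard : shards)
    merged.insert(merged.end(), shard.begin(), shard.end());
  std::sort(merged.begin(), merged.end(),
            [](const Reloc &a, const Reloc &b) { return a.offset < b.offset; });
  return merged;
}
```

The trade-off is that the output is only ordered after the merge step, so anything that needs a sorted view has to run after all workers have finished.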

[llvm-branch-commits] [llvm] CodeGen: Trim redundant template argument from defusechain_iterator (PR #135024)

2025-04-10 Thread Matt Arsenault via llvm-branch-commits


arsenm wrote:

### Merge activity

* **Apr 9, 12:21 PM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/135024).


https://github.com/llvm/llvm-project/pull/135024

[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)

2025-04-10 Thread Nikita Popov via llvm-branch-commits



@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
 return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-// We can only preserve inbounds if the original gep is inbounds, the add
-// is nsw, and the add operands are non-negative.
-auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
+  Value *Idx2) {
+  // Preserve "inbounds nuw" if the original gep is "inbounds nuw",
+  // and the add is "nuw".
+  if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
+return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();

nikic wrote:

```suggestion
  if (GEP.hasNoUnsignedWrap() && AddIsNUW)
return GEP.getNoWrapFlags();
```
Would this work to subsume both this case and the only nuw one below?

https://github.com/llvm/llvm-project/pull/135155

[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)

2025-04-10 Thread Nikita Popov via llvm-branch-commits



@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
 return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-// We can only preserve inbounds if the original gep is inbounds, the add
-// is nsw, and the add operands are non-negative.
-auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,

nikic wrote:

Rename this to GetPreservedNoWrapFlags or something.

https://github.com/llvm/llvm-project/pull/135155
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)

2025-04-10 Thread Nikita Popov via llvm-branch-commits



@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
 return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-// We can only preserve inbounds if the original gep is inbounds, the add
-// is nsw, and the add operands are non-negative.
-auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
+  Value *Idx2) {
+  // Preserve "inbounds nuw" if the original gep is "inbounds nuw",
+  // and the add is "nuw".
+  if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
+return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
+  // Preserve "inbounds" if the original gep is "inbounds", the add
+  // is "nsw", and the add operands are non-negative.
   SimplifyQuery Q = SQ.getWithInstruction(&GEP);
-  return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
- isKnownNonNegative(Idx2, Q);
+  if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
+  isKnownNonNegative(Idx2, Q))
+return GEPNoWrapFlags::inBounds();

nikic wrote:

Is it actually still necessary to explicitly handle this case? If we have an 
add nsw with nonneg operands, I think we should infer nuw on both add and gep 
and can then use the new code path?

https://github.com/llvm/llvm-project/pull/135155
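
The arithmetic behind the question above (an `add nsw` with non-negative operands also being `nuw`) can be checked exhaustively at a small bit width. The sketch below is only a sanity check on 8-bit integers, with hypothetical helper names; it says nothing about how InstCombine infers the flags, and it covers the add only, not the GEP.

```cpp
#include <cassert>
#include <cstdint>

// True if a + b does not overflow as a signed 8-bit add ("nsw").
inline bool addIsNSW(int8_t a, int8_t b) {
  int sum = int(a) + int(b);
  return sum >= INT8_MIN && sum <= INT8_MAX;
}

// True if a + b does not wrap as an unsigned 8-bit add ("nuw").
inline bool addIsNUW(int8_t a, int8_t b) {
  unsigned sum = unsigned(uint8_t(a)) + unsigned(uint8_t(b));
  return sum <= UINT8_MAX;
}

// Exhaustively checks that nsw with two non-negative operands implies nuw.
inline bool nswNonNegImpliesNUW() {
  for (int a = 0; a <= INT8_MAX; ++a)
    for (int b = 0; b <= INT8_MAX; ++b)
      if (addIsNSW(int8_t(a), int8_t(b)) && !addIsNUW(int8_t(a), int8_t(b)))
        return false;
  return true;
}
```

The non-negativity condition matters: `-1 + 1` is fine as a signed add but wraps as an unsigned one, so nsw alone does not give nuw.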

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)

2025-04-10 Thread Fabian Ritter via llvm-branch-commits


https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/132353

>From 11282b1d43e87a092a6d21cc23e6962b65554eb3 Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Fri, 21 Mar 2025 03:33:02 -0400
Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds

For flat memory instructions where the address is supplied as a base address
register with an immediate offset, the memory aperture test ignores the
immediate offset. Currently, ISel does not respect that, which leads to
miscompilations where valid input programs crash when the address computation
relies on the immediate offset to get the base address in the proper memory
aperture. Global or scratch instructions are not affected.

This patch only selects flat instructions with immediate offsets from address
computations with the inbounds flag: If the address computation does not leave
the bounds of the allocated object, it cannot leave the bounds of the memory
aperture and is therefore safe to handle with an immediate offset.

It also adds the inbounds flag to DAG nodes resulting from transformations:
- Address computations resulting from getObjectPtrOffset. As far as I can tell,
  this function is only used to compute addresses within accessed memory ranges,
  e.g., for loads and stores that are split during legalization.
- Reassociated inbounds adds. If both involved operations are inbounds, then so
  are operations after the transformation.
- Address computations in the SelectionDAG lowering of the memcpy/move/set
  intrinsics. Base and result of the address arithmetic there are accessed, so
  the operation must be inbounds.

It might make sense to separate these changes into their own PR, but I don't
see a way to test them without adding a use of the inbounds SDAG flag.

Affected tests:
- CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded,
  added new positive tests where we still do fold them.
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't
  seem integral to this test, so the test is not changed to make offset folding
  still happen.
- CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base the addresses
  for memory accesses on the potentially OOB addresses used for prefetching;
  that might be a separate issue to look into.
- Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that
  offsets in the memset DAG lowering are still folded properly.

A similar patch for GlobalISel will follow.

Fixes SWDEV-516125.
---
 llvm/include/llvm/CodeGen/SelectionDAG.h  |  12 +-
 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |   9 +-
 .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp |  12 +-
 llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 ---
 llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll   | 374 +-
 .../test/CodeGen/AMDGPU/loop-prefetch-data.ll |  17 +-
 .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++
 .../InferAddressSpaces/AMDGPU/flat_atomic.ll  |   6 +-
 8 files changed, 717 insertions(+), 94 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h
index 15a2370e5d8b8..aa3668d3e9aae 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAG.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAG.h
@@ -1069,7 +1069,8 @@ class SelectionDAG {
  SDValue EVL);
 
   /// Returns sum of the base pointer and offset.
-  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default.
+  /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds by
+  /// default.
   SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL,
const SDNodeFlags Flags = SDNodeFlags());
   SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL,
@@ -1077,15 +1078,18 @@ class SelectionDAG {
 
   /// Create an add instruction with appropriate flags when used for
   /// addressing some offset of an object. i.e. if a load is split into multiple
-  /// components, create an add nuw from the base pointer to the offset.
+  /// components, create an add nuw inbounds from the base pointer to the
+  /// offset.
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) {
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) {
 // The object itself can't wrap around the address space, so it shouldn't be
 // possible for the adds of the offsets to the split parts to overflow.
-return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap);
+return getMemBasePlusOffset(
+Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds);
   }
 
   /// Return a new CALLSEQ_START node, that starts new call fram
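
The commit message of this patch explains that the aperture test ignores the immediate offset. A toy model may help show why folding an offset is then only safe when the base itself is in bounds. Everything below is invented for illustration: the two-aperture address map, the constants, and the function names are assumptions and do not describe real AMDGPU hardware or ISel code.

```cpp
#include <cassert>
#include <cstdint>

// Toy address map, invented for this sketch: scratch occupies
// [ScratchBase, ScratchBase + ScratchSize), everything else is global.
enum class Aperture { Global, Scratch };
constexpr uint64_t ScratchBase = 0x1000;
constexpr uint64_t ScratchSize = 0x1000;

inline Aperture classify(uint64_t Addr) {
  return (Addr >= ScratchBase && Addr < ScratchBase + ScratchSize)
             ? Aperture::Scratch
             : Aperture::Global;
}

// Models an access whose aperture test looks at the base register only.
inline Aperture apertureWithFoldedOffset(uint64_t Base, uint64_t Imm) {
  (void)Imm; // the immediate is deliberately ignored by the test
  return classify(Base);
}

// Aperture the program intended: the one containing the full address.
inline Aperture intendedAperture(uint64_t Base, uint64_t Imm) {
  return classify(Base + Imm);
}

// Folding is safe in this model only if both views agree.
inline bool foldIsSafe(uint64_t Base, uint64_t Imm) {
  return apertureWithFoldedOffset(Base, Imm) == intendedAperture(Base, Imm);
}
```

In this model, a base of `0x1008` with offset `4` stays inside the scratch range and folds safely, while a base of `0x0FF8` with offset `16` only reaches scratch through the offset, so the base-only test picks the wrong aperture.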

[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)

2025-04-10 Thread Fabian Ritter via llvm-branch-commits


https://github.com/ritter-x2a updated 
https://github.com/llvm/llvm-project/pull/131862

>From 75e41ae17d5daae609c6f25025c730e9bb3924bc Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Mon, 17 Mar 2025 06:51:16 -0400
Subject: [PATCH] [SDAG] Introduce inbounds flag for pointer arithmetic

This patch introduces an inbounds SDNodeFlag, to show that a pointer
addition SDNode implements an inbounds getelementptr operation (i.e.,
the pointer operand is in bounds wrt. the allocated object it is based
on, and the arithmetic does not change that). The flag is set in the DAG
construction when lowering inbounds GEPs.

Inbounds information is useful in the ISel when selecting memory
instructions that perform address computations whose intermediate steps
must be in the same memory region as the final result. A follow-up patch
will start using it for AMDGPU's flat memory instructions, where the
immediate offset must not affect the memory aperture of the address.

A similar patch for gMIR and GlobalISel will follow.

For SWDEV-516125.
---
 llvm/include/llvm/CodeGen/SelectionDAGNodes.h| 9 +++--
 llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp| 3 +++
 llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp | 3 +++
 .../CodeGen/X86/merge-store-partially-alias-loads.ll | 2 +-
 4 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
index 2283f99202e2f..13ac65f5d731c 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h
@@ -415,12 +415,15 @@ struct SDNodeFlags {
 Unpredictable = 1 << 13,
 // Compare instructions which may carry the samesign flag.
 SameSign = 1 << 14,
+// Pointer arithmetic instructions that remain in bounds, e.g., implementing
+// an inbounds GEP.
+InBounds = 1 << 15,
 
 // NOTE: Please update LargestValue in LLVM_DECLARE_ENUM_AS_BITMASK below
 // the class definition when adding new flags.
 
 PoisonGeneratingFlags = NoUnsignedWrap | NoSignedWrap | Exact | Disjoint |
-NonNeg | NoNaNs | NoInfs | SameSign,
+NonNeg | NoNaNs | NoInfs | SameSign | InBounds,
 FastMathFlags = NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal |
 AllowContract | ApproximateFuncs | AllowReassociation,
   };
@@ -455,6 +458,7 @@ struct SDNodeFlags {
   void setAllowReassociation(bool b) { setFlag<AllowReassociation>(b); }
   void setNoFPExcept(bool b) { setFlag<NoFPExcept>(b); }
   void setUnpredictable(bool b) { setFlag<Unpredictable>(b); }
+  void setInBounds(bool b) { setFlag<InBounds>(b); }
 
   // These are accessors for each flag.
   bool hasNoUnsignedWrap() const { return Flags & NoUnsignedWrap; }
@@ -472,6 +476,7 @@ struct SDNodeFlags {
   bool hasAllowReassociation() const { return Flags & AllowReassociation; }
   bool hasNoFPExcept() const { return Flags & NoFPExcept; }
   bool hasUnpredictable() const { return Flags & Unpredictable; }
+  bool hasInBounds() const { return Flags & InBounds; }
 
   bool operator==(const SDNodeFlags &Other) const {
 return Flags == Other.Flags;
@@ -481,7 +486,7 @@ struct SDNodeFlags {
 };
 
 LLVM_DECLARE_ENUM_AS_BITMASK(decltype(SDNodeFlags::None),
- SDNodeFlags::SameSign);
+ SDNodeFlags::InBounds);
 
 inline SDNodeFlags operator|(SDNodeFlags LHS, SDNodeFlags RHS) {
   LHS |= RHS;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 89793c30f3710..32973be608937 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -4283,6 +4283,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 if (NW.hasNoUnsignedWrap() ||
 (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap()))
   Flags |= SDNodeFlags::NoUnsignedWrap;
+Flags.setInBounds(NW.isInBounds());
 
 N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N,
 DAG.getConstant(Offset, dl, N.getValueType()), Flags);
@@ -4326,6 +4327,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
 if (NW.hasNoUnsignedWrap() ||
 (Offs.isNonNegative() && NW.hasNoUnsignedSignedWrap()))
   Flags.setNoUnsignedWrap(true);
+Flags.setInBounds(NW.isInBounds());
 
 OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType());
 
@@ -4388,6 +4390,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) {
   // pointer index type (add nuw).
   SDNodeFlags AddFlags;
   AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap());
+  AddFlags.setInBounds(NW.isInBounds());
 
   N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags);
 }
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
ind
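
The flag handling touched by this patch follows a common bitmask pattern: one bit per flag, a templated setter, and a combined mask for the poison-generating subset. The sketch below mirrors only that pattern. `NodeFlags` is a hypothetical stand-in, the bit position of `NoUnsignedWrap` is a placeholder, and `dropPoisonGeneratingFlags` is a name made up for this example; only `SameSign = 1 << 14` and `InBounds = 1 << 15` are taken from the diff.

```cpp
#include <cassert>
#include <cstdint>

// Reduced, hypothetical stand-in for a node-flag set.
struct NodeFlags {
  enum : uint32_t {
    None = 0,
    NoUnsignedWrap = 1 << 0, // placeholder bit position
    SameSign = 1 << 14,      // as in the quoted diff
    InBounds = 1 << 15,      // as in the quoted diff
    // Flags whose violation yields poison; the new flag joins the mask.
    PoisonGeneratingFlags = NoUnsignedWrap | SameSign | InBounds,
  };

  uint32_t Flags = None;

  // Sets or clears exactly one flag bit, leaving the others untouched.
  template <uint32_t Flag> void setFlag(bool B) {
    Flags = (Flags & ~Flag) | (B ? Flag : 0);
  }

  void setNoUnsignedWrap(bool B) { setFlag<NoUnsignedWrap>(B); }
  void setInBounds(bool B) { setFlag<InBounds>(B); }

  bool hasNoUnsignedWrap() const { return Flags & NoUnsignedWrap; }
  bool hasInBounds() const { return Flags & InBounds; }

  // Hypothetical helper: clears every poison-generating flag at once.
  void dropPoisonGeneratingFlags() { Flags &= ~PoisonGeneratingFlags; }
};
```

Adding the new bit to the combined mask is what keeps existing code that drops poison-generating flags correct without that code having to know about `InBounds`.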

[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)

2025-04-10 Thread Nikolas Klauser via llvm-branch-commits


https://github.com/philnik777 approved this pull request.

LGTM, assuming the diff that landed is the same as the one I see. I'm really not a fan of 
complicating things unnecessarily though.


https://github.com/llvm/llvm-project/pull/132291

[llvm-branch-commits] [llvm] [ctxprof] Flatten indirect call info in pre-thinlink compilation (PR #134766)

2025-04-10 Thread Mircea Trofin via llvm-branch-commits


https://github.com/mtrofin updated 
https://github.com/llvm/llvm-project/pull/134766

>From 97908a0b652420ce82f3fe965f8eb12002e74a85 Mon Sep 17 00:00:00 2001
From: Mircea Trofin 
Date: Mon, 7 Apr 2025 18:22:05 -0700
Subject: [PATCH] [ctxprof] Flatten indirect call info in pre-thinlink
 compilation

---
 llvm/include/llvm/Analysis/CtxProfAnalysis.h  |  5 ++
 llvm/lib/Analysis/CtxProfAnalysis.cpp | 14 +
 .../Instrumentation/PGOCtxProfFlattening.cpp  | 57 +++
 .../flatten-insert-icp-mdprof.ll  | 50 
 4 files changed, 126 insertions(+)
 create mode 100644 llvm/test/Analysis/CtxProfAnalysis/flatten-insert-icp-mdprof.ll

diff --git a/llvm/include/llvm/Analysis/CtxProfAnalysis.h b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
index 023b5a9bdb848..6f1c3696ca78c 100644
--- a/llvm/include/llvm/Analysis/CtxProfAnalysis.h
+++ b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
@@ -21,6 +21,10 @@ namespace llvm {
 
 class CtxProfAnalysis;
 
+using FlatIndirectTargets = DenseMap;
+using CtxProfFlatIndirectCallProfile =
+DenseMap>;
+
 /// The instrumented contextual profile, produced by the CtxProfAnalysis.
 class PGOContextualProfile {
   friend class CtxProfAnalysis;
@@ -101,6 +105,7 @@ class PGOContextualProfile {
   void visit(ConstVisitor, const Function *F = nullptr) const;
 
   const CtxProfFlatProfile flatten() const;
+  const CtxProfFlatIndirectCallProfile flattenVirtCalls() const;
 
   bool invalidate(Module &, const PreservedAnalyses &PA,
   ModuleAnalysisManager::Invalidator &) {
diff --git a/llvm/lib/Analysis/CtxProfAnalysis.cpp b/llvm/lib/Analysis/CtxProfAnalysis.cpp
index 4042c87369462..304a77014f407 100644
--- a/llvm/lib/Analysis/CtxProfAnalysis.cpp
+++ b/llvm/lib/Analysis/CtxProfAnalysis.cpp
@@ -334,6 +334,20 @@ const CtxProfFlatProfile PGOContextualProfile::flatten() const {
   return Flat;
 }
 
+const CtxProfFlatIndirectCallProfile
+PGOContextualProfile::flattenVirtCalls() const {
+  CtxProfFlatIndirectCallProfile Ret;
+  preorderVisit(
+  Profiles.Contexts, [&](const PGOCtxProfContext &Ctx) {
+auto &Targets = Ret[Ctx.guid()];
+for (const auto &[ID, SubctxSet] : Ctx.callsites())
+  for (const auto &Subctx : SubctxSet)
+Targets[ID][Subctx.first] += Subctx.second.getEntrycount();
+  });
+  return Ret;
+}
+
 void CtxProfAnalysis::collectIndirectCallPromotionList(
 CallBase &IC, Result &Profile,
 SetVector> &Candidates) {
diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
index ffe0f385047c3..9b44d61726fa1 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
@@ -36,9 +36,12 @@
 #include "llvm/Transforms/Scalar/DCE.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include 
+#include 
 
 using namespace llvm;
 
+#define DEBUG_TYPE "ctx_prof_flatten"
+
 namespace {
 
 class ProfileAnnotator final {
@@ -414,6 +417,58 @@ void removeInstrumentation(Function &F) {
 I.eraseFromParent();
 }
 
+void annotateIndirectCall(
+Module &M, CallBase &CB,
+const DenseMap &FlatProf,
+const InstrProfCallsite &Ins) {
+  auto Idx = Ins.getIndex()->getZExtValue();
+  auto FIt = FlatProf.find(Idx);
+  if (FIt == FlatProf.end())
+return;
+  const auto &Targets = FIt->second;
+  SmallVector Data;
+  uint64_t Sum = 0;
+  for (auto &[Guid, Count] : Targets) {
+Data.push_back({/*.Value=*/Guid, /*.Count=*/Count});
+Sum += Count;
+  }
+  struct InstrProfValueDataGTComparer {
+bool operator()(const InstrProfValueData &A, const InstrProfValueData &B) {
+  return A.Count > B.Count;
+}
+  };
+  llvm::sort(Data, InstrProfValueDataGTComparer());
+  llvm::annotateValueSite(M, CB, Data, Sum,
+  InstrProfValueKind::IPVK_IndirectCallTarget,
+  Data.size());
+  LLVM_DEBUG(dbgs() << "[ctxprof] flat indirect call prof: " << CB
+<< CB.getMetadata(LLVMContext::MD_prof) << "\n");
+}
+
+// We normally return a "Changed" bool, but the calling pass' run assumes
+// something will change - some profile will be added - so this won't add much
+// by returning false when applicable.
+void annotateIndCalls(Module &M, const CtxProfAnalysis::Result &CtxProf) {
+  const auto FlatIndCalls = CtxProf.flattenVirtCalls();
+  for (auto &F : M) {
+if (F.isDeclaration())
+  continue;
+auto FlatProfIter = FlatIndCalls.find(AssignGUIDPass::getGUID(F));
+if (FlatProfIter == FlatIndCalls.end())
+  continue;
+const auto &FlatProf = FlatProfIter->second;
+for (auto &BB : F) {
+  for (auto &I : BB) {
+auto *CB = dyn_cast(&I);
+if (!CB || !CB->isIndirectCall())
+  continue;
+if (auto *Ins = CtxProfAnalysis::getCallsiteInstrumentation(*CB))
+  annotateIndirectCall(M, *CB, FlatProf, *Ins);
+  }

[llvm-branch-commits] [llvm] SCEVExpander: Don't look at uses of constants (PR #134691)

2025-04-10 Thread Matt Arsenault via llvm-branch-commits


https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/134691

This could be more relaxed, and look for uses of globals in
the same function but no tests apparently depend on that.

>From f543f056aa7e16b1f793d018e0b9c022b006f477 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 7 Apr 2025 21:56:00 +0700
Subject: [PATCH] SCEVExpander: Don't look at uses of constants

This could be more relaxed, and look for uses of globals in
the same function but no tests apparently depend on that.
---
 .../Utils/ScalarEvolutionExpander.cpp | 29 ++-
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index 41bf202230e22..e25ec6c3b2a58 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -111,20 +111,23 @@ Value *SCEVExpander::ReuseOrCreateCast(Value *V, Type *Ty,
 
   Value *Ret = nullptr;
 
-  // Check to see if there is already a cast!
-  for (User *U : V->users()) {
-if (U->getType() != Ty)
-  continue;
-CastInst *CI = dyn_cast(U);
-if (!CI || CI->getOpcode() != Op)
-  continue;
+  if (!isa(V)) {
+// Check to see if there is already a cast!
+for (User *U : V->users()) {
+  if (U->getType() != Ty)
+continue;
+  CastInst *CI = dyn_cast(U);
+  if (!CI || CI->getOpcode() != Op)
+continue;
 
-// Found a suitable cast that is at IP or comes before IP. Use it. Note that
-// the cast must also properly dominate the Builder's insertion point.
-if (IP->getParent() == CI->getParent() && &*BIP != CI &&
-(&*IP == CI || CI->comesBefore(&*IP))) {
-  Ret = CI;
-  break;
+  // Found a suitable cast that is at IP or comes before IP. Use it. Note
+  // that the cast must also properly dominate the Builder's insertion
+  // point.
+  if (IP->getParent() == CI->getParent() && &*BIP != CI &&
+  (&*IP == CI || CI->comesBefore(&*IP))) {
+Ret = CI;
+break;
+  }
 }
   }
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] PHITransAddr: Avoid looking at constant use lists (PR #134689)

2025-04-10 Thread Matt Arsenault via llvm-branch-commits


arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite:
> https://app.graphite.dev/github/pr/llvm/llvm-project/134689?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#134692** https://app.graphite.dev/github/pr/llvm/llvm-project/134692?utm_source=stack-comment-icon
* **#134691** https://app.graphite.dev/github/pr/llvm/llvm-project/134691?utm_source=stack-comment-icon
* **#134690** https://app.graphite.dev/github/pr/llvm/llvm-project/134690?utm_source=stack-comment-icon
* **#134689** https://app.graphite.dev/github/pr/llvm/llvm-project/134689?utm_source=stack-comment-icon 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/134689?utm_source=stack-comment-view-in-graphite)
* **#134688** https://app.graphite.dev/github/pr/llvm/llvm-project/134688?utm_source=stack-comment-icon
* **#134275** https://app.graphite.dev/github/pr/llvm/llvm-project/134275?utm_source=stack-comment-icon
* **#134274** https://app.graphite.dev/github/pr/llvm/llvm-project/134274?utm_source=stack-comment-icon
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment).
Learn more about stacking: https://stacking.dev/?utm_source=stack-comment


https://github.com/llvm/llvm-project/pull/134689
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [compiler-rt] release/20.x: [Sanitizers][Darwin][Test] XFAIL malloc_zone.cpp (PR #133832)

2025-04-10 Thread Mariusz Borsa via llvm-branch-commits


https://github.com/wrotki updated 
https://github.com/llvm/llvm-project/pull/133832

>From ca129ea5996c2f2b99868bccd2246690a65b6c9e Mon Sep 17 00:00:00 2001
From: Mariusz Borsa 
Date: Mon, 31 Mar 2025 17:06:41 -0700
Subject: [PATCH] [Sanitizers][Darwin][Test] XFAIL malloc_zone.cpp

The malloc_zone.cpp test currently fails on Darwin hosts, in SanitizerCommon 
tests with lsan enabled.

Need to XFAIL this test to buy time to investigate this failure. Also
we're trying to bring the number of tests failing on Darwin bots to 0, to
get a clearer signal of any new failures.

rdar://145873843

Co-authored-by: Mariusz Borsa 
(cherry picked from commit 02837acaaf2cfdfcbf77e4a7f6629575edb6ffb4)
---
 .../test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp  | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp b/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
index fd6ef03629438..5aa087fb4ca12 100644
--- a/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
+++ b/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
@@ -17,6 +17,8 @@
 // UBSan does not install a malloc zone.
 // XFAIL: ubsan
 //
+// Currently fails on darwin/lsan
+// XFAIL: darwin && lsan
 
 #include 
 #include 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [ctxprof] Use `isInSpecializedModule` as criteria for using contextual profile (PR #134468)

2025-04-10 Thread Mircea Trofin via llvm-branch-commits


https://github.com/mtrofin ready_for_review 
https://github.com/llvm/llvm-project/pull/134468
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread Sander de Smalen via llvm-branch-commits



@@ -5039,10 +5039,26 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef VFs,
 // even in the scalar case.
 RegUsage[ClassID] += 1;
   } else {
+ElementCount VF = VFs[J];
+// The output from scaled phis and scaled reductions actually has
+// fewer lanes than the VF.
+if (isa(R)) {
+  auto *ReductionR = dyn_cast(R);
+  auto *PartialReductionR = ReductionR ? nullptr : dyn_cast(R);
+  unsigned ScaleFactor = ReductionR ? ReductionR->getVFScaleFactor() : PartialReductionR->getVFScaleFactor();
+  VF = VF.divideCoefficientBy(ScaleFactor);
+}

sdesmalen-arm wrote:

Maybe create a utility function that returns the scaling factor of a recipe, 
returning `1` for any recipe other than 
`VPPartialReductionRecipe`/`VPReductionPHIRecipe`.

Also, please run clang-format on your code.
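
A minimal sketch of the helper suggested above, written as a small standalone Python model rather than LLVM C++. The class names, the `vf_scale_factor` field, and both function names are hypothetical stand-ins for the recipe classes mentioned in the comment; the real helper's name and signature are not given in this thread.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    """Hypothetical stand-in for a generic recipe with no VF scaling."""


@dataclass
class ReductionPHIRecipe(Recipe):
    """Hypothetical stand-in for a scaled reduction phi."""
    vf_scale_factor: int = 1


@dataclass
class PartialReductionRecipe(Recipe):
    """Hypothetical stand-in for a partial (scaled) reduction."""
    vf_scale_factor: int = 1


def get_vf_scale_factor(recipe):
    """Return the recipe's scale factor, or 1 for every other recipe kind."""
    if isinstance(recipe, (ReductionPHIRecipe, PartialReductionRecipe)):
        return recipe.vf_scale_factor
    return 1


def effective_lanes(recipe, vf):
    """Lanes actually produced by the recipe for a given vectorization factor."""
    return vf // get_vf_scale_factor(recipe)


print(effective_lanes(Recipe(), 16))
print(effective_lanes(PartialReductionRecipe(vf_scale_factor=4), 16))
```

With a single helper of this shape, the call site no longer needs the `isa`/`dyn_cast` chain: it divides the VF by whatever the helper returns, and unscaled recipes fall through with a factor of 1.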

https://github.com/llvm/llvm-project/pull/133090
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)

2025-04-10 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/135191
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: None (llvmbot)


Changes

Backport 08e080ee98832c2aec6f379b04f486bea18730cc

Requested by: @RKSimon

---
Full diff: https://github.com/llvm/llvm-project/pull/135191.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86FixupVectorConstants.cpp (+7-4) 
- (added) llvm/test/CodeGen/X86/pr134607.ll (+20) 


```diff
diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 453898e132ca4..9dc392d6e9626 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -333,6 +333,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
  MachineInstr &MI) {
   unsigned Opc = MI.getOpcode();
   MachineConstantPool *CP = MI.getParent()->getParent()->getConstantPool();
+  bool HasSSE2 = ST->hasSSE2();
   bool HasSSE41 = ST->hasSSE41();
   bool HasAVX2 = ST->hasAVX2();
   bool HasDQI = ST->hasDQI();
@@ -394,11 +395,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
   case X86::MOVAPDrm:
   case X86::MOVAPSrm:
   case X86::MOVUPDrm:
-  case X86::MOVUPSrm:
+  case X86::MOVUPSrm: {
 // TODO: SSE3 MOVDDUP Handling
-return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
-  {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
- 128, 1);
+FixupEntry Fixups[] = {
+{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
+{HasSSE2 ? X86::MOVSDrm : 0, 1, 64, rebuildZeroUpperCst}};
+return FixupConstant(Fixups, 128, 1);
+  }
   case X86::VMOVAPDrm:
   case X86::VMOVAPSrm:
   case X86::VMOVUPDrm:
diff --git a/llvm/test/CodeGen/X86/pr134607.ll b/llvm/test/CodeGen/X86/pr134607.ll
new file mode 100644
index 0..5e824c22e5a22
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr134607.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -mattr=+sse -O3 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE1
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE2
+
+define void @store_v2f32_constant(ptr %v) {
+; X86-LABEL: store_v2f32_constant:
+; X86:   # %bb.0:
+; X86-NEXT:movl 4(%esp), %eax
+; X86-NEXT:movaps {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+
+; X64-SSE1-LABEL: store_v2f32_constant:
+; X64-SSE1:   # %bb.0:
+; X64-SSE1-NEXT:movaps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+
+; X64-SSE2-LABEL: store_v2f32_constant:
+; X64-SSE2:   # %bb.0:
+; X64-SSE2-NEXT:movsd {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+  store <2 x float> , ptr %v, align 4
+  ret void
+}

```




https://github.com/llvm/llvm-project/pull/135191
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)

2025-04-10 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132291

>From 89ce369ab9b49b8c23a87ad0a888002dd85c094c Mon Sep 17 00:00:00 2001
From: Peter Collingbourne 
Date: Thu, 20 Mar 2025 15:12:39 -0700
Subject: [PATCH 1/2] Format

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 630b90c9d77a6..927f8958f4b43 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,8 @@
 # we exit.
 has_run_tests = False
 
-has_execute_mi = 'execute_mi' in gdb.__dict__
+has_execute_mi = "execute_mi" in gdb.__dict__
+
 
 class CheckResult(gdb.Command):
 def __init__(self):

>From da2f682a8f1a1af58fbe85f760e1844c808b8093 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne 
Date: Tue, 8 Apr 2025 13:21:06 -0700
Subject: [PATCH 2/2] Use getattr instead

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 927f8958f4b43..da09092b690c4 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,7 @@
 # we exit.
 has_run_tests = False
 
-has_execute_mi = "execute_mi" in gdb.__dict__
+has_execute_mi = getattr(gdb, "execute_mi", None) is not None
 
 
 class CheckResult(gdb.Command):
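
The two detection styles in the patches above can disagree. The sketch below shows two such cases using stand-in module objects, since the real `gdb` module only exists inside a gdb process. The module names and helper functions are hypothetical, and nothing here claims to be the reason the patch switched to `getattr`.

```python
import types


def has_feature_via_dict(module, name):
    """Membership test on the module's own namespace dictionary."""
    return name in module.__dict__


def has_feature_via_getattr(module, name):
    """Normal attribute lookup; a missing or None attribute counts as absent."""
    return getattr(module, name, None) is not None


# Stand-in where the name exists but is bound to None.
stub_module = types.ModuleType("stub_module")
stub_module.execute_mi = None

# Stand-in that serves the attribute lazily through a module-level
# __getattr__ (PEP 562), so the name never appears in __dict__.
lazy_module = types.ModuleType("lazy_module")


def _lazy_getattr(name):
    if name == "execute_mi":
        return lambda *args: {}
    raise AttributeError(name)


lazy_module.__getattr__ = _lazy_getattr

for mod in (stub_module, lazy_module):
    print(mod.__name__,
          has_feature_via_dict(mod, "execute_mi"),
          has_feature_via_getattr(mod, "execute_mi"))
```

The `getattr` form asks the question the test actually cares about (can the attribute be obtained and used?) rather than how the module happens to store it.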

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)

2025-04-10 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/135123


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] CodeGen: Trim redundant template argument from defusechain_iterator (PR #135024)

2025-04-10 Thread Quentin Colombet via llvm-branch-commits


https://github.com/qcolombet approved this pull request.


https://github.com/llvm/llvm-project/pull/135024
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] [llvm] release/20.x: [fatlto] Add coroutine passes when using FatLTO with ThinLTO (#134434) (PR #134711)

2025-04-10 Thread via llvm-branch-commits


https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/134711
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [compiler-rt] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)

2025-04-10 Thread Peter Collingbourne via llvm-branch-commits


https://github.com/pcc edited https://github.com/llvm/llvm-project/pull/133530
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)

2025-04-10 Thread Karthika Devi C via llvm-branch-commits



@@ -16,105 +16,50 @@
 
//===--===//
 
 #include "polly/CodePreparation.h"
-#include "polly/LinkAllPasses.h"
 #include "polly/Support/ScopHelper.h"
 #include "llvm/Analysis/DominanceFrontier.h"
 #include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/RegionInfo.h"
 #include "llvm/Analysis/ScalarEvolution.h"
-#include "llvm/InitializePasses.h"
 
 using namespace llvm;
 using namespace polly;
 
-namespace {
-
-/// Prepare the IR for the scop detection.
-///
-class CodePreparation final : public FunctionPass {
-  CodePreparation(const CodePreparation &) = delete;
-  const CodePreparation &operator=(const CodePreparation &) = delete;
-
-  LoopInfo *LI;
-  ScalarEvolution *SE;
-
-  void clear();
-
-public:
-  static char ID;
-
-  explicit CodePreparation() : FunctionPass(ID) {}
-  ~CodePreparation();
-
-  /// @name FunctionPass interface.
-  //@{
-  void getAnalysisUsage(AnalysisUsage &AU) const override;
-  void releaseMemory() override;
-  bool runOnFunction(Function &F) override;
-  void print(raw_ostream &OS, const Module *) const override;
-  //@}
-};
-} // namespace
-
-PreservedAnalyses CodePreparationPass::run(Function &F,
-   FunctionAnalysisManager &FAM) {
-
+static bool runCodePreprationImpl(Function &F, DominatorTree *DT, LoopInfo *LI,
+  RegionInfo *RI) {
   // Find first non-alloca instruction. Every basic block has a non-alloca
   // instruction, as every well formed basic block has a terminator.
   auto &EntryBlock = F.getEntryBlock();
   BasicBlock::iterator I = EntryBlock.begin();
   while (isa(I))
 ++I;
 
-  auto &DT = FAM.getResult(F);
-  auto &LI = FAM.getResult(F);
+  // Abort if not necessary to split
+  if (I->isTerminator() && isa(I) &&
+  cast(I)->isUnconditional())
+return false;
 
   // splitBlock updates DT, LI and RI.
-  splitEntryBlockForAlloca(&EntryBlock, &DT, &LI, nullptr);

kartcq wrote:

Can we please move the CodePreparation pass changes to a separate commit?
This will make these changes more trackable.

https://github.com/llvm/llvm-project/pull/125442
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits



@@ -2,6 +2,7 @@
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVE1
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVED
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -vectorizer-maximize-bandwidth -S < %s | FileCheck %s --check-prefixes=CHECK-MAXBW
+; RUN: opt -passes=loop-vectorize -debug-only=loop-vectorize --disable-output -S < %s 2>&1 | FileCheck %s --check-prefix=CHECK-REGS

david-arm wrote:

Still missing a `REQUIRES: asserts`

https://github.com/llvm/llvm-project/pull/133090
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)

2025-04-10 Thread Valery Pykhtin via llvm-branch-commits


https://github.com/vpykhtin edited 
https://github.com/llvm/llvm-project/pull/135180
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] LICM: Avoid looking at use list of constant data (PR #134690)

2025-04-10 Thread via llvm-branch-commits


github-actions[bot] wrote:




:warning: undef deprecator found issues in your code. :warning:



You can test this locally with the following command:


```bash
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/lib/Transforms/Scalar/LICM.cpp llvm/test/CodeGen/AMDGPU/swdev380865.ll llvm/test/CodeGen/PowerPC/pr43527.ll llvm/test/CodeGen/PowerPC/pr48519.ll llvm/test/CodeGen/PowerPC/sms-grp-order.ll llvm/test/Transforms/LICM/pr50367.ll llvm/test/Transforms/LICM/pr59324.ll
```




The following files introduce new uses of undef:
 - llvm/test/CodeGen/PowerPC/sms-grp-order.ll

[Undef](https://llvm.org/docs/LangRef.html#undefined-values) is now deprecated 
and should only be used in the rare cases where no replacement is possible. For 
example, a load of uninitialized memory yields `undef`. You should use `poison` 
values for placeholders instead.

In tests, avoid using `undef` and having tests that trigger undefined behavior. 
If you need an operand with some unimportant value, you can add a new argument 
to the function and use that instead.

For example, this is considered a bad practice:
```llvm
define void @fn() {
  ...
  br i1 undef, ...
}
```

Please use the following instead:
```llvm
define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}
```

Please refer to the [Undefined Behavior 
Manual](https://llvm.org/docs/UndefinedBehavior.html) for more information.
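
The pickaxe pattern quoted in the command above can be tried outside git. The following is a rough sketch using Python's `re` module, which should behave the same as git's POSIX extended regex for this particular pattern. The helper name and the sample lines are made up for illustration, and git's `-S` additionally compares occurrence counts between the two revisions, which this sketch does not model.

```python
import re

# Same pattern as in the git command: "undef" delimited by characters that
# cannot be part of an identifier (and not preceded by '#'), or a call to
# UndefValue::get.
UNDEF_PATTERN = re.compile(
    r"([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)"
)


def mentions_undef(line):
    """Return True if a single line would match the checker's pattern."""
    return UNDEF_PATTERN.search(line) is not None


samples = [
    "  br i1 undef, label %a, label %b",      # IR operand
    "Value *V = UndefValue::get(Ty);",        # C++ API use
    "#undef DEBUG_TYPE",                      # preprocessor directive
    "define void @undef_fn(i1 %cond) {",      # identifier containing 'undef'
    "  br i1 %cond, label %a, label %b",      # suggested replacement
]

for line in samples:
    print(mentions_undef(line), line)
```

The character classes on either side are what keep `#undef` directives and identifiers such as `@undef_fn` from being reported.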



https://github.com/llvm/llvm-project/pull/134690
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-10 Thread Kai Nacke via llvm-branch-commits



@@ -239,6 +298,63 @@ class GOFFWriter {
 GOFFWriter::GOFFWriter(raw_pwrite_stream &OS, MCAssembler &Asm)
 : OS(OS), Asm(Asm) {}
 
+void GOFFWriter::defineSectionSymbols(const MCSectionGOFF &Section) {
+  if (Section.isSD()) {
+GOFFSymbol SD(Section.getName(), Section.getId(),
+  Section.getSDAttributes());
+writeSymbol(SD);
+  }
+
+  if (Section.isED()) {
+GOFFSymbol ED(Section.getName(), Section.getId(),
+  Section.getParent()->getId(), Section.getEDAttributes());
+if (Section.requiresLength())
+  ED.SectionLength = Asm.getSectionAddressSize(Section);
+writeSymbol(ED);
+  }
+
+  if (Section.isPR()) {
+GOFFSymbol PR(Section.getName(), Section.getId(),
+  Section.getParent()->getId(), Section.getPRAttributes());
+PR.SectionLength = Asm.getSectionAddressSize(Section);
+if (Section.requiresNonZeroLength()) {

redstar wrote:

> That is a simple solution, too. I am not sure if this works with the HLASM 
> output.

Ok, I contradict myself. Setting the ADA to null gets around the binder error, 
but other parts assume that there is always an ADA, e.g. when calling an 
external function.

https://github.com/llvm/llvm-project/pull/133799
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] AArch64: Relax x16/x17 constraint on AUT in certain cases. (PR #132857)

2025-04-10 Thread Anatoly Trosinenko via llvm-branch-commits



@@ -191,16 +201,27 @@ define void @test_tailcall_omit_mov_x16_x16(ptr %objptr) #0 {
 define i32 @test_call_omit_extra_moves(ptr %objptr) #0 {
 ; CHECK-LABEL: test_call_omit_extra_moves:
 ; DARWIN-NEXT:   stp x29, x30, [sp, #-16]!
-; ELF-NEXT:  str x30, [sp, #-16]!
-; CHECK-NEXT:ldr x16, [x0]
-; CHECK-NEXT:mov x17, x0
-; CHECK-NEXT:movkx17, #6503, lsl #48
-; CHECK-NEXT:autda   x16, x17
-; CHECK-NEXT:ldr x8, [x16]
-; CHECK-NEXT:movkx16, #34646, lsl #48
-; CHECK-NEXT:blraa   x8, x16
-; CHECK-NEXT:mov w0, #42
+; DARWIN-NEXT:   ldr x16, [x0]
+; DARWIN-NEXT:   mov x17, x0
+; DARWIN-NEXT:   movkx17, #6503, lsl #48
+; DARWIN-NEXT:   autda   x16, x17
+; DARWIN-NEXT:   ldr x8, [x16]
+; DARWIN-NEXT:   movkx16, #34646, lsl #48
+; DARWIN-NEXT:   blraa   x8, x16
+; DARWIN-NEXT:   mov w0, #42
 ; DARWIN-NEXT:   ldp x29, x30, [sp], #16
+; ELF-NEXT:  str x30, [sp, #-16]!
+; ELF-NEXT:  ldr x8, [x0]
+; ELF-NEXT:  mov x9, x0
+; ELF-NEXT:  movkx9, #6503, lsl #48
+; ELF-NEXT:  autda   x8, x9
+; ELF-NEXT:  ldr x9, [x8]
+; FIXME: Get rid of the x16/x17 constraint on non-Darwin so we can eliminate
+; this mov.
+; ELF-NEXT:  mov x17, x8
+; ELF-NEXT:  movkx17, #34646, lsl #48
+; ELF-NEXT:  blraa   x9, x17
+; ELF-NEXT:  mov w0, #42
 ; ELF-NEXT:  ldr x30, [sp], #16
 ; CHECK-NEXT:ret

atrosinenko wrote:

Sorry, didn't notice that one of the instructions is a call.

https://github.com/llvm/llvm-project/pull/132857
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits


https://github.com/smithp35 commented:

How does this work in the non-PIE (PDE) case when we take the address of an 
ifunc and pass it to a function in a shared library, which then compares the 
argument with the address of the ifunc that it has taken itself in a global?

For example:
shared lib
```
typedef void Fptr(void);
extern void ifn(void);

// take address of ifunc ifn defined in application
Fptr* ifp = &ifn;

// compare address of ifn we have calculated in ifp vs
// address calculated by application, passed in fp1.
int compare(Fptr* fp1) {
  return fp1 == ifp;
}
```
App
```
typedef void Fptr(void);
extern int compare(Fptr* fp1);
int val = 0;
static void impl(void) { val = 42; }
static void *resolver(void) { return impl; }
__attribute__((ifunc("resolver"))) void *ifn();

extern Fptr* fp;

int main(void) {
  return compare(fp);
}
// separate file so compiler is unaware ifn is an ifunc.
typedef void Fptr(void);
extern void ifn(void);
Fptr* fp = &ifn;
```

Right now in the application lld produces an iPLT entry for `ifn`, with `fp` 
pointing to the iPLT entry. The dynamic symbol table contains the address of 
the iPLT entry with type STT_FUNC. The address taken in the shared library and 
the argument compare equal.

As I understand it, this patch will change `fp` to point directly to the result 
of the ifunc resolver. So unless we also change the value put into the dynamic 
symbol table we'll stop comparing equal.

I don't think there's a STT_FUNC symbol we can put in the dynamic symbol table 
that holds the result of the ifunc resolver. GNU ld puts the address of the 
resolver function with a STT_GNU_IFUNC symbol type in the dynamic symbol table. 
If that causes the dynamic loader to call the resolver and replace the value 
with the result then that would work. I haven't had time to check what glibc 
does though.

I'll put some more general comments below. Didn't want to make this one too 
long.

https://github.com/llvm/llvm-project/pull/133531
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)

2025-04-10 Thread Peter Smith via llvm-branch-commits

smithp35 wrote:

I have some small reservations about using ifunc resolvers like this. Mostly in
that we are using a mechanism invented for a different purpose, and relying on
some specific linker behaviour to make this case work.

This is similar to comments made in the Discourse post
https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/9
but repeating them here as this is closest to the implementation.

As I understand it, this has a more limited and more specific use case than
ifuncs. Traditional ifuncs can be address taken or called, possibly in
multiple ways, so it makes sense to use a symbol type STT_GNU_IFUNC rather than
special relocation directives. The initializers for structure field protection
are compiler generated, cannot be legally called or address taken from user
code, and only have one relocation type R_*_ABS64 (or 32 on a 32-bit platform).
With the addition of a single relocation, something like R_*_ADDRINIT64 which
would target a STT_FUNC resolver symbol, we can isolate the structure field
initialization use case from an actual ifunc.
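
To make the distinction concrete, here is a toy model of the two relocation
behaviours. It is only a sketch under stated assumptions: it is not lld or
glibc code, `AddrInit64` stands in for the "something like R_*_ADDRINIT64"
idea above, and every type and function name is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy model only: none of these names exist in lld, glibc, or any psABI.
using Fn = int (*)(int);
using Resolver = Fn (*)();

enum class RelocKind {
  Abs64,      // store the target symbol's address as-is
  AddrInit64, // hypothetical: call the resolver and store what it returns
};

struct Reloc {
  RelocKind kind;
  Fn *slot;          // the structure field being initialized
  Fn symbol;         // used by Abs64, otherwise nullptr
  Resolver resolver; // used by AddrInit64, otherwise nullptr
};

int plainFn(int x) { return x + 1; }
int fastImpl(int x) { return x * 2; }

// An ordinary function standing in for an STT_FUNC resolver symbol.
Fn pickImpl() { return fastImpl; }

// Stand-in for the loader's relocation loop.
void applyRelocs(const std::vector<Reloc> &relocs) {
  for (const Reloc &r : relocs) {
    switch (r.kind) {
    case RelocKind::Abs64:
      *r.slot = r.symbol;
      break;
    case RelocKind::AddrInit64:
      *r.slot = r.resolver();
      break;
    }
  }
}

struct Fields {
  Fn viaAbs;
  Fn viaInit;
};

// Builds one field of each kind and applies the relocations.
Fields initFields() {
  Fields f{nullptr, nullptr};
  std::vector<Reloc> relocs = {
      {RelocKind::Abs64, &f.viaAbs, plainFn, nullptr},
      {RelocKind::AddrInit64, &f.viaInit, nullptr, pickImpl},
  };
  applyRelocs(relocs);
  return f;
}
```

In this model the choice between "store the address" and "call the resolver"
is made by the relocation kind, and the resolver is an ordinary function
symbol. No STT_GNU_IFUNC symbol is involved.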

I guess it all comes down to whether structure field initialization needs, or
benefits from being distinguished from an ifunc. Ifuncs seem to be quite easy
to get wrong so being able to isolate this case has some attraction to me at
least. It also handles the structure field that points to an ifunc relatively
gracefully.

As you pointed out in your response, this does mean adding 2 relocations to
every psABI that supports structure field protection rather than just one.
Although I'd expect that the alternative, having relocations that either
"directly call" the ifunc resolver or take the address of the function, might
require new relocations too?

https://github.com/llvm/llvm-project/pull/133531

[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)

2025-04-10 Thread via llvm-branch-commits


llvmbot wrote:




@llvm/pr-subscribers-debuginfo

Author: Matheus Izvekov (mizvekov)


Changes

This patch extends the canonicalization printing policy to cover expressions 
and template names, and wires that up to the template argument printer, 
covering expressions.

This is helpful for debugging, or if these template arguments somehow end up in 
diagnostics, as without this patch they can print as completely unrelated 
expressions, which can be quite confusing.

This is because expressions are not uniqued, unlike types, and when a template 
specialization containing an expression is the first to be canonicalized, the 
expression ends up appearing in the canonical type of subsequent equivalent 
specializations.
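
The "first one canonicalized wins" effect described above can be illustrated
with a toy model. This is only a sketch: `Expr`, `CanonicalTable`, and the
profile strings are hypothetical and do not correspond to Clang's actual data
structures.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only; these names are illustrative and are not Clang APIs.
struct Expr {
  std::string spelling; // what the user wrote, e.g. "sizeof(U)"
  std::string profile;  // structural identity, e.g. "sizeof(param-0-0)"
};

class CanonicalTable {
  // Maps a structural profile to the spelling of the first expression
  // that was canonicalized with that profile.
  std::map<std::string, std::string> firstSpelling;

public:
  // Returns the stored spelling for this profile. emplace() does not
  // overwrite an existing entry, so the first expression seen wins.
  const std::string &canonicalize(const Expr &e) {
    return firstSpelling.emplace(e.profile, e.spelling).first->second;
  }
};
```

In this model, canonicalizing `{"sizeof(T)", "sizeof(param-0-0)"}` and then
`{"sizeof(U)", "sizeof(param-0-0)"}` yields `sizeof(T)` both times, so the
second specialization prints an expression the user never wrote in that
context.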

Fixes https://github.com/llvm/llvm-project/issues/92292

---

Patch is 48.39 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/135133.diff


12 Files Affected:

- (modified) clang/include/clang/AST/PrettyPrinter.h (+3-3) 
- (modified) clang/lib/AST/DeclPrinter.cpp (+2-2) 
- (modified) clang/lib/AST/JSONNodeDumper.cpp (+2) 
- (modified) clang/lib/AST/StmtPrinter.cpp (+5-1) 
- (modified) clang/lib/AST/TemplateBase.cpp (+5-2) 
- (modified) clang/lib/AST/TemplateName.cpp (+8-2) 
- (modified) clang/lib/AST/TextNodeDumper.cpp (+2) 
- (modified) clang/lib/AST/TypePrinter.cpp (+4-5) 
- (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-1) 
- (modified) clang/lib/Sema/SemaTemplate.cpp (+1-1) 
- (modified) clang/test/AST/ast-dump-templates.cpp (+1022) 
- (modified) clang/unittests/AST/TypePrinterTest.cpp (+1-1) 


``diff
diff --git a/clang/include/clang/AST/PrettyPrinter.h b/clang/include/clang/AST/PrettyPrinter.h
index 91818776b770c..5a98ae1987b16 100644
--- a/clang/include/clang/AST/PrettyPrinter.h
+++ b/clang/include/clang/AST/PrettyPrinter.h
@@ -76,7 +76,7 @@ struct PrintingPolicy {
 MSWChar(LO.MicrosoftExt && !LO.WChar), IncludeNewlines(true),
 MSVCFormatting(false), ConstantsAsWritten(false),
 SuppressImplicitBase(false), FullyQualifiedName(false),
-PrintCanonicalTypes(false), PrintInjectedClassNameWithArguments(true),
+PrintAsCanonical(false), PrintInjectedClassNameWithArguments(true),
 UsePreferredNames(true), AlwaysIncludeTypeForTemplateArgument(false),
 CleanUglifiedParameters(false), EntireContentsOfLargeArray(true),
 UseEnumerators(true), UseHLSLTypes(LO.HLSL) {}
@@ -310,9 +310,9 @@ struct PrintingPolicy {
   LLVM_PREFERRED_TYPE(bool)
   unsigned FullyQualifiedName : 1;
 
-  /// Whether to print types as written or canonically.
+  /// Whether to print entities as written or canonically.
   LLVM_PREFERRED_TYPE(bool)
-  unsigned PrintCanonicalTypes : 1;
+  unsigned PrintAsCanonical : 1;
 
   /// Whether to print an InjectedClassNameType with template arguments or as
   /// written. When a template argument is unnamed, printing it results in
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 28098b242d494..22da5bf251ecd 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -735,7 +735,7 @@ void DeclPrinter::VisitFunctionDecl(FunctionDecl *D) {
 llvm::raw_string_ostream POut(Proto);
 DeclPrinter TArgPrinter(POut, SubPolicy, Context, Indentation);
 const auto *TArgAsWritten = D->getTemplateSpecializationArgsAsWritten();
-if (TArgAsWritten && !Policy.PrintCanonicalTypes)
+if (TArgAsWritten && !Policy.PrintAsCanonical)
   TArgPrinter.printTemplateArguments(TArgAsWritten->arguments(), nullptr);
 else if (const TemplateArgumentList *TArgs =
  D->getTemplateSpecializationArgs())
@@ -1124,7 +1124,7 @@ void DeclPrinter::VisitCXXRecordDecl(CXXRecordDecl *D) {
   S->getSpecializedTemplate()->getTemplateParameters();
   const ASTTemplateArgumentListInfo *TArgAsWritten =
   S->getTemplateArgsAsWritten();
-  if (TArgAsWritten && !Policy.PrintCanonicalTypes)
+  if (TArgAsWritten && !Policy.PrintAsCanonical)
 printTemplateArguments(TArgAsWritten->arguments(), TParams);
   else
 printTemplateArguments(S->getTemplateArgs().asArray(), TParams);
diff --git a/clang/lib/AST/JSONNodeDumper.cpp b/clang/lib/AST/JSONNodeDumper.cpp
index 3420c1f343cf5..725db93b558f6 100644
--- a/clang/lib/AST/JSONNodeDumper.cpp
+++ b/clang/lib/AST/JSONNodeDumper.cpp
@@ -1724,6 +1724,8 @@ void JSONNodeDumper::VisitTemplateExpansionTemplateArgument(
 void JSONNodeDumper::VisitExpressionTemplateArgument(
 const TemplateArgument &TA) {
   JOS.attribute("isExpr", true);
+  if (TA.isCanonicalExpr())
+JOS.attribute("isCanon", true);
 }
 void JSONNodeDumper::VisitPackTemplateArgument(const TemplateArgument &TA) {
   JOS.attribute("isPack", true);
diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp
index dbe2432d5c799..aae10fd3bd885 100644
--- a/clang/lib/AST/StmtPrinter.cpp
+++ b/clang/lib/AST/StmtPrinter.cpp
@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDe

[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)

2025-04-10 Thread Ulrich Weigand via llvm-branch-commits



@@ -16,6 +16,9 @@ namespace llvm {
 class GOFFObjectWriter;
 
 class MCGOFFStreamer : public MCObjectStreamer {
+  std::string RootSDName;
+  std::string ADAPRName;

uweigand wrote:

These are no longer used, I think.

https://github.com/llvm/llvm-project/pull/133799

[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)

2025-04-10 Thread Paschalis Mpeis via llvm-branch-commits



@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job
 RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe
 RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR
 
-CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation.
+CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events

paschalis-mpeis wrote:

I realized I didn't include proper context in my previous comment about the 
**'fragility'**:

The reason for this fragility is the version of `perf` being used. Since 
`perf2bolt` is a wrapper over `perf`, older kernel versions may lack `brstack` 
support. In those cases `perf2bolt` would eventually return an error.

So here we intentionally ignore whether `perf2bolt` fails, and instead we only 
check that its original intent was to parse the SPE data, e.g.:
> PERF2BOLT: spawning perf job to read SPE brstack events

This should avoid flakiness in tests.

https://github.com/llvm/llvm-project/pull/129231

[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits



@@ -46,6 +46,11 @@ define i1 @select_exit_cond(ptr %start, ptr %end, i64 %N) {
 ; CHECK-NEXT:[[STEP_ADD_5:%.*]] = add <2 x i64> [[STEP_ADD_4]], splat (i64 2)
 ; CHECK-NEXT:[[STEP_ADD_6:%.*]] = add <2 x i64> [[STEP_ADD_5]], splat (i64 2)
 ; CHECK-NEXT:[[STEP_ADD_7:%.*]] = add <2 x i64> [[STEP_ADD_6]], splat (i64 2)
+; CHECK-NEXT:[[STEP_ADD_8:%.*]] = add <2 x i64> [[STEP_ADD_7]], splat (i64 2)

david-arm wrote:

I'm a bit surprised these are the only CHECK lines that have changed.

https://github.com/llvm/llvm-project/pull/133090

[llvm-branch-commits] [mlir] [mlir][LLVM] Delete `getFixedVectorType` and `getScalableVectorType` (PR #135051)

2025-04-10 Thread Matthias Springer via llvm-branch-commits


https://github.com/matthias-springer created 
https://github.com/llvm/llvm-project/pull/135051

The LLVM dialect no longer has its own vector types. It uses `mlir::VectorType` 
everywhere. Remove `LLVM::getFixedVectorType/getScalableVectorType` and use 
`VectorType::get` instead. This commit addresses a 
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
 on the PR that deleted the LLVM vector types.

Depends on #134981.



From 8b0377dedb64a3992ef9cf88144a13df797cd52d Mon Sep 17 00:00:00 2001
From: Matthias Springer 
Date: Wed, 9 Apr 2025 19:05:11 +0200
Subject: [PATCH] [mlir][LLVM] Delete `getFixedVectorType` and
 `getScalableVectorType`

---
 mlir/docs/Dialects/LLVM.md|  4 ---
 mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h  |  8 -
 .../Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp| 33 +--
 mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp  | 12 ---
 mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp| 23 -
 mlir/lib/Dialect/NVGPU/Utils/MMAUtils.cpp | 24 ++
 mlir/lib/Target/LLVMIR/TypeFromLLVM.cpp   |  9 ++---
 7 files changed, 45 insertions(+), 68 deletions(-)

diff --git a/mlir/docs/Dialects/LLVM.md b/mlir/docs/Dialects/LLVM.md
index 468f69c419071..4b5d518ca4eab 100644
--- a/mlir/docs/Dialects/LLVM.md
+++ b/mlir/docs/Dialects/LLVM.md
@@ -336,10 +336,6 @@ compatible with the LLVM dialect:
 vector type compatible with the LLVM dialect;
 -   `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number
 of elements in any vector type compatible with the LLVM dialect;
--   `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type
-with the given element type and size; the resulting type is either a
-built-in or an LLVM dialect vector type depending on which one supports the
-given element type.
 
  Examples of Compatible Vector Types
 
diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h b/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
index a2a76c49a2bda..17561f79d135a 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
@@ -126,14 +126,6 @@ Type getVectorType(Type elementType, unsigned numElements,
 /// and length.
 Type getVectorType(Type elementType, const llvm::ElementCount &numElements);
 
-/// Creates an LLVM dialect-compatible type with the given element type and
-/// length.
-Type getFixedVectorType(Type elementType, unsigned numElements);
-
-/// Creates an LLVM dialect-compatible type with the given element type and
-/// length.
-Type getScalableVectorType(Type elementType, unsigned numElements);
-
 /// Returns the size of the given primitive LLVM dialect-compatible type
 /// (including vectors) in bits, for example, the size of i16 is 16 and
 /// the size of vector<4xi16> is 64. Returns 0 for non-primitive
diff --git a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
index 51507c6507b69..e144a8063ae31 100644
--- a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
+++ b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
@@ -61,13 +61,13 @@ static Value truncToI32(ImplicitLocOpBuilder &b, Value value) {
 static Type inferIntrinsicResultType(Type vectorResultType) {
   MLIRContext *ctx = vectorResultType.getContext();
   auto a = cast(vectorResultType);
-  auto f16x2Ty = LLVM::getFixedVectorType(Float16Type::get(ctx), 2);
+  auto f16x2Ty = VectorType::get(2, Float16Type::get(ctx));
   auto i32Ty = IntegerType::get(ctx, 32);
-  auto i32x2Ty = LLVM::getFixedVectorType(i32Ty, 2);
+  auto i32x2Ty = VectorType::get(2, i32Ty);
   Type f64Ty = Float64Type::get(ctx);
-  Type f64x2Ty = LLVM::getFixedVectorType(f64Ty, 2);
+  Type f64x2Ty = VectorType::get(2, f64Ty);
   Type f32Ty = Float32Type::get(ctx);
-  Type f32x2Ty = LLVM::getFixedVectorType(f32Ty, 2);
+  Type f32x2Ty = VectorType::get(2, f32Ty);
   if (a.getElementType() == f16x2Ty) {
 return LLVM::LLVMStructType::getLiteral(
 ctx, SmallVector(a.getNumElements(), f16x2Ty));
@@ -85,7 +85,7 @@ static Type inferIntrinsicResultType(Type vectorResultType) {
 ctx,
 SmallVector(static_cast(a.getNumElements()) * 2, f32Ty));
   }
-  if (a.getElementType() == LLVM::getFixedVectorType(f32Ty, 1)) {
+  if (a.getElementType() == VectorType::get(f32Ty, {1})) {
 return LLVM::LLVMStructType::getLiteral(
 ctx, SmallVector(static_cast(a.getNumElements()), f32Ty));
   }
@@ -106,11 +106,11 @@ static Value convertIntrinsicResult(Location loc, Type intrinsicResultType,
   Type i32Ty = rewriter.getI32Type();
   Type f32Ty = rewriter.getF32Type();
   Type f64Ty = rewriter.getF64Type();
-  Type f16x2Ty = LLVM::getFixedVectorType(rewriter.getF16Type(), 2);
-  Type i32x2Ty = LLVM::getFixedVectorType(i32Ty, 2);
-  Type f64x2Ty = LLVM::getFixedVectorType(f64Ty, 2);
-  Type f32x2Ty = LLVM::getFixedVectorType(f32Ty, 2);
-  Type f32x1Ty = LLVM::getFixedVectorType(f32Ty, 1);
+  Type f16x2Ty =

[llvm-branch-commits] [clang-tools-extra] [clang-tidy] `matchesAnyListedTypeName` support non canonical types (PR #134869)

2025-04-10 Thread Congcong Cai via llvm-branch-commits


https://github.com/HerrCai0907 ready_for_review 
https://github.com/llvm/llvm-project/pull/134869
