[llvm-branch-commits] [llvm] [InstCombine] Handle "add like" in ADD+GEP->GEP+GEP rewrites (PR #135156)
nikic wrote: See also https://github.com/llvm/llvm-project/pull/76981. https://github.com/llvm/llvm-project/pull/135156 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff HEAD~1 HEAD --extensions cpp -- llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
```

View the diff from clang-format here.

```diff
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index b5e085be9..09b6f4880 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3091,8 +3091,7 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
           DL.getIndexSizeInBits(PtrOp->getType()->getPointerAddressSpace());
       APInt BasePtrOffset(IdxWidth, 0);
       Value *UnderlyingPtrOp =
-          PtrOp->stripAndAccumulateInBoundsConstantOffsets(DL,
-                                                           BasePtrOffset);
+          PtrOp->stripAndAccumulateInBoundsConstantOffsets(DL, BasePtrOffset);
       bool CanBeNull, CanBeFreed;
       uint64_t DerefBytes = UnderlyingPtrOp->getPointerDereferenceableBytes(
           DL, CanBeNull, CanBeFreed);
@@ -3120,7 +3119,8 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
   // These rewrites is trying to preserve inbounds/nuw attributes. So we want to
   // do this after having tried to derive "nuw" above.
   if (GEP.getNumIndices() == 1) {
-    auto GetPreservedNoWrapFlags = [&](bool AddIsNUW, Value *Idx1, Value *Idx2) {
+    auto GetPreservedNoWrapFlags = [&](bool AddIsNUW, Value *Idx1,
+                                       Value *Idx2) {
       // Preserve "inbounds nuw" if the original gep is "inbounds nuw", and the
       // add is "nuw". Preserve "nuw" if the original gep is "nuw", and the add
       // is "nuw".
@@ -3160,8 +3160,8 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
       // as:
       //   %newptr = getelementptr i32, ptr %ptr, i32 %idx1
       //   %newgep = getelementptr i32, ptr %newptr, i32 idx2
-      bool NUW = match(GEP.getOperand(1), m_NNegZExt(m_NUWAddLike(m_Value(),
-                                                                  m_Value())));
+      bool NUW = match(GEP.getOperand(1),
+                       m_NNegZExt(m_NUWAddLike(m_Value(), m_Value())));
       GEPNoWrapFlags NWFlags = GetPreservedNoWrapFlags(NUW, Idx1, C);
       auto *NewPtr = Builder.CreateGEP(
           GEP.getSourceElementType(), GEP.getPointerOperand(),
```

https://github.com/llvm/llvm-project/pull/135155
[llvm-branch-commits] [clang] release/20.x: [clang-format] Keep the space between `not` and a unary operator (#135035) (PR #135118)
https://github.com/HazardyKnusperkeks approved this pull request.

https://github.com/llvm/llvm-project/pull/135118
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -2759,6 +2762,29 @@ MCSection *TargetLoweringObjectFileXCOFF::getSectionForLSDA(
 //===--===//
 TargetLoweringObjectFileGOFF::TargetLoweringObjectFileGOFF() = default;
 
+void TargetLoweringObjectFileGOFF::getModuleMetadata(Module &M) {
+  // Construct the default names for the root SD and the ADA PR symbol.
+  StringRef FileName = sys::path::stem(M.getSourceFileName());
+  if (FileName.size() > 1 && FileName.starts_with('<') &&
+      FileName.ends_with('>'))
+    FileName = FileName.substr(1, FileName.size() - 2);
+  DefaultRootSDName = Twine(FileName).concat("#C").str();

redstar wrote:

Using a name and setting the binding scope to "section scope" is similar to using " " as name and leaving the binding scope unspecified. The XLC and Open XL compilers provide a command line option to change this name (XLC: `-qcsect`, `-qnocsect`; Open XL: `-mcsect`, `-mnocsect`). The Open XL compiler defaults to the variant coded here but that can be changed to having a name with binding scope unspecified (`-mcsect=a`) or set to space (`-mnocsect`). Using `-mcsect=a` results in exactly the problem you describe. The compiler option will be added later to clang, along with the required code here.

The front end (aka clang) provides this value in the `source_filename` property. All strings in LLVM/clang are in UTF-8 so there is no other choice. The same problem arises for symbols derived from function names etc.

https://github.com/llvm/llvm-project/pull/133799
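The name derivation in the quoted hunk can be tried outside LLVM with a small sketch. `stemOf` and `defaultRootSDName` are hypothetical names introduced here for illustration; `stemOf` only approximates `llvm::sys::path::stem` for simple POSIX-style paths and is not the LLVM implementation.

```cpp
#include <string>

// Hypothetical stand-in for llvm::sys::path::stem(): drops any directory
// prefix and the final extension. Assumes '/' as the path separator.
std::string stemOf(const std::string &Path) {
  std::string::size_type Slash = Path.find_last_of('/');
  std::string Base =
      (Slash == std::string::npos) ? Path : Path.substr(Slash + 1);
  std::string::size_type Dot = Base.find_last_of('.');
  // Leave names without an extension (or starting with a dot) untouched.
  if (Dot == std::string::npos || Dot == 0)
    return Base;
  return Base.substr(0, Dot);
}

// Mirrors the quoted logic: take the stem of the source file name, strip a
// surrounding "<...>" pair (e.g. "<stdin>"), then append "#C".
std::string defaultRootSDName(const std::string &SourceFileName) {
  std::string Name = stemOf(SourceFileName);
  if (Name.size() > 1 && Name.front() == '<' && Name.back() == '>')
    Name = Name.substr(1, Name.size() - 2);
  return Name + "#C";
}
```

Under these assumptions, `defaultRootSDName("dir/test.c")` yields `test#C` and `defaultRootSDName("<stdin>")` yields `stdin#C`.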
[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)
https://github.com/bjope updated https://github.com/llvm/llvm-project/pull/135155

From 0abeb7b7eea0e15e15c98a8e4f8501fde81d4811 Mon Sep 17 00:00:00 2001
From: Bjorn Pettersson
Date: Tue, 11 Mar 2025 16:27:43 +0100
Subject: [PATCH 1/3] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP

Given that we have an "add nuw" and a "getelementptr inbounds nuw" like this:

    %idx = add nuw i64 %idx1, %idx2
    %gep = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx

Then we can preserve the "inbounds nuw" flag when transforming that into two getelementptr instructions:

    %gep1 = getelementptr inbounds nuw i32, ptr %ptr, i64 %idx1
    %gep = getelementptr inbounds nuw i32, ptr %gep1, i64 %idx2

Similarly for just having "nuw" instead of "inbounds nuw" on the getelementptr.

Proof: https://alive2.llvm.org/ce/z/4uhfDq
---
 .../InstCombine/InstructionCombining.cpp | 43 +++
 llvm/test/Transforms/InstCombine/array.ll | 10 ++---
 2 files changed, 30 insertions(+), 23 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 856e02c9f1ddb..19a818f4baa30 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
     return nullptr;
 
   if (GEP.getNumIndices() == 1) {
-    // We can only preserve inbounds if the original gep is inbounds, the add
-    // is nsw, and the add operands are non-negative.
-    auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) {
+    auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1,
+                                      Value *Idx2) {
+      // Preserve "inbounds nuw" if the original gep is "inbounds nuw",
+      // and the add is "nuw".
+      if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW)
+        return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap();
+      // Preserve "inbounds" if the original gep is "inbounds", the add
+      // is "nsw", and the add operands are non-negative.
       SimplifyQuery Q = SQ.getWithInstruction(&GEP);
-      return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
-             isKnownNonNegative(Idx2, Q);
+      if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) &&
+          isKnownNonNegative(Idx2, Q))
+        return GEPNoWrapFlags::inBounds();
+      // Preserve "nuw" if the original gep is "nuw", and the add is "nuw".
+      if (GEP.hasNoUnsignedWrap() && AddIsNUW)
+        return GEPNoWrapFlags::noUnsignedWrap();
+      return GEPNoWrapFlags::none();
     };
 
     // Try to replace ADD + GEP with GEP + GEP.
@@ -3104,15 +3114,15 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
       // as:
       //   %newptr = getelementptr i32, ptr %ptr, i64 %idx1
       //   %newgep = getelementptr i32, ptr %newptr, i64 %idx2
-      bool IsInBounds = CanPreserveInBounds(
-          cast(GEP.getOperand(1))->hasNoSignedWrap(),
-          Idx1, Idx2);
+      bool NSW = match(GEP.getOperand(1), m_NSWAddLike(m_Value(), m_Value()));
+      bool NUW = match(GEP.getOperand(1), m_NUWAddLike(m_Value(), m_Value()));
+      GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(NSW, NUW, Idx1, Idx2);
       auto *NewPtr =
           Builder.CreateGEP(GEP.getSourceElementType(), GEP.getPointerOperand(),
-                            Idx1, "", IsInBounds);
-      return replaceInstUsesWith(
-          GEP, Builder.CreateGEP(GEP.getSourceElementType(), NewPtr, Idx2, "",
-                                 IsInBounds));
+                            Idx1, "", NWFlags);
+      return replaceInstUsesWith(GEP,
+                                 Builder.CreateGEP(GEP.getSourceElementType(),
+                                                   NewPtr, Idx2, "", NWFlags));
     }
     ConstantInt *C;
     if (match(GEP.getOperand(1), m_OneUse(m_SExtLike(m_OneUse(m_NSWAdd(
@@ -3123,17 +3133,16 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) {
       // as:
       //   %newptr = getelementptr i32, ptr %ptr, i32 %idx1
       //   %newgep = getelementptr i32, ptr %newptr, i32 idx2
-      bool IsInBounds = CanPreserveInBounds(
-          /*IsNSW=*/true, Idx1, C);
+      GEPNoWrapFlags NWFlags = CanPreserveNoWrapFlags(
+          /*IsNSW=*/true, /*IsNUW=*/false, Idx1, C);
       auto *NewPtr = Builder.CreateGEP(
           GEP.getSourceElementType(), GEP.getPointerOperand(),
-          Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "",
-          IsInBounds);
+          Builder.CreateSExt(Idx1, GEP.getOperand(1)->getType()), "", NWFlags);
       return replaceInstUsesWith(
           GEP,
           Builder.CreateGEP(GEP.getSourceElementType(), NewPtr,
                             Builder.CreateSExt(C, GEP.getOperand(1)
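The decision order in the `CanPreserveNoWrapFlags` lambda from this patch can be sketched as a standalone function. This is an illustration only: the `NoWrap` enum, `GepInfo` struct and `preservedNoWrapFlags` are hypothetical stand-ins for `GEPNoWrapFlags` and the `GetElementPtrInst` queries, and the two non-negativity booleans replace the `isKnownNonNegative` calls.

```cpp
// Hypothetical, simplified stand-in for llvm::GEPNoWrapFlags.
enum NoWrap : unsigned { None = 0, InBounds = 1, NUW = 2 };

// Hypothetical summary of the flags carried by the original GEP.
struct GepInfo {
  bool IsInBounds;
  bool HasNUW;
};

// Returns the flags that both new GEPs may carry, checked in the same
// order as the patch: strongest combination first.
unsigned preservedNoWrapFlags(GepInfo Gep, bool AddIsNSW, bool AddIsNUW,
                              bool Idx1NonNeg, bool Idx2NonNeg) {
  // "inbounds nuw" survives if the GEP is "inbounds nuw" and the add is "nuw".
  if (Gep.IsInBounds && Gep.HasNUW && AddIsNUW)
    return InBounds | NUW;
  // "inbounds" survives if the add is "nsw" and both operands are known
  // to be non-negative.
  if (Gep.IsInBounds && AddIsNSW && Idx1NonNeg && Idx2NonNeg)
    return InBounds;
  // "nuw" alone survives if the GEP is "nuw" and the add is "nuw".
  if (Gep.HasNUW && AddIsNUW)
    return NUW;
  return None;
}
```

The ordering matters: checking the "inbounds nuw" case first avoids degrading to plain "inbounds" when the add is both `nsw` and `nuw` with non-negative operands.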
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
https://github.com/jmorse commented:

Are there any expected interactions between atom-group-numbers and loading bitcode? i.e., if we serialise the literal atom-group-number to the output and then read it back in again, then it might conflict with atom-group-numbers seen in other functions in other bitcode files. It doesn't appear that they get re-numbered in the textual IR parsing patch for example.

Possibly part of the design here is to simply not care, if it's only about internal consistency within a Function (does that hold after inlining too?). Apologies if this is all explained in a later patch. The answers to that should ultimately be documented somewhere; I imagine that's in the patch stack or coming later.

https://github.com/llvm/llvm-project/pull/133478
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133478
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
@@ -1366,6 +1367,43 @@ TEST_F(DILocationTest, discriminatorSpecialCases) {
   EXPECT_EQ(std::nullopt, L4->cloneByMultiplyingDuplicationFactor(0x1000));
 }
 
+TEST_F(DILocationTest, KeyInstructions) {
+  Context.pImpl->NextAtomGroup = 1;
+
+  EXPECT_EQ(Context.pImpl->NextAtomGroup, 1u);
+  DILocation *A1 = DILocation::get(Context, 1, 0, getSubprogram(), nullptr, false, 1, 2);
+  // The group is only applied to the DILocation if the build has opted into
+  // the additional DILocation fields needed for the feature.

jmorse wrote:

Style nit: I feel "the build has opted into" is a bit too abstract, and is like the code referring to itself in the third person. "if we have been built with..." feels a lot cleaner IMHO, YMMV.

https://github.com/llvm/llvm-project/pull/133478
[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)
llvmbot wrote:

@RKSimon What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/135191
[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/135191

Backport 08e080ee98832c2aec6f379b04f486bea18730cc

Requested by: @RKSimon

>From bcb6ae86466e917786310b3133a2df6a776923fa Mon Sep 17 00:00:00 2001
From: Stefan Schmidt
Date: Wed, 9 Apr 2025 11:19:26 +0200
Subject: [PATCH] [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547)

This fixes a regression I traced back to https://github.com/llvm/llvm-project/commit/8b43c1be23119c1024bed0a8ce392bc73727e2e2 / https://github.com/llvm/llvm-project/pull/79000

The regression caused an SSE2 instruction, `movsd`, to be emitted as a replacement for an SSE instruction, `movaps` despite the target potentially not supporting this instruction, such as when building with clang using `-march=pentium3`.

Fixes #134607

(cherry picked from commit 08e080ee98832c2aec6f379b04f486bea18730cc)
---
 .../Target/X86/X86FixupVectorConstants.cpp| 11 ++
 llvm/test/CodeGen/X86/pr134607.ll | 20 +++
 2 files changed, 27 insertions(+), 4 deletions(-)
 create mode 100644 llvm/test/CodeGen/X86/pr134607.ll

diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 453898e132ca4..9dc392d6e9626 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -333,6 +333,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
                                                      MachineInstr &MI) {
   unsigned Opc = MI.getOpcode();
   MachineConstantPool *CP = MI.getParent()->getParent()->getConstantPool();
+  bool HasSSE2 = ST->hasSSE2();
   bool HasSSE41 = ST->hasSSE41();
   bool HasAVX2 = ST->hasAVX2();
   bool HasDQI = ST->hasDQI();
@@ -394,11 +395,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
   case X86::MOVAPDrm:
   case X86::MOVAPSrm:
   case X86::MOVUPDrm:
-  case X86::MOVUPSrm:
+  case X86::MOVUPSrm: {
     // TODO: SSE3 MOVDDUP Handling
-    return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
-                          {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
-                         128, 1);
+    FixupEntry Fixups[] = {
+        {X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
+        {HasSSE2 ? X86::MOVSDrm : 0, 1, 64, rebuildZeroUpperCst}};
+    return FixupConstant(Fixups, 128, 1);
+  }
   case X86::VMOVAPDrm:
   case X86::VMOVAPSrm:
   case X86::VMOVUPDrm:
diff --git a/llvm/test/CodeGen/X86/pr134607.ll b/llvm/test/CodeGen/X86/pr134607.ll
new file mode 100644
index 0..5e824c22e5a22
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr134607.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -mattr=+sse -O3 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE1
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE2
+
+define void @store_v2f32_constant(ptr %v) {
+; X86-LABEL: store_v2f32_constant:
+; X86: # %bb.0:
+; X86-NEXT:    movl 4(%esp), %eax
+; X86-NEXT:    movaps {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+
+; X64-SSE1-LABEL: store_v2f32_constant:
+; X64-SSE1: # %bb.0:
+; X64-SSE1-NEXT:    movaps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+
+; X64-SSE2-LABEL: store_v2f32_constant:
+; X64-SSE2: # %bb.0:
+; X64-SSE2-NEXT:    movsd {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+  store <2 x float> , ptr %v, align 4
+  ret void
+}
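The fix gates one table entry on a subtarget feature by substituting opcode 0. A minimal sketch of that pattern follows, assuming the consumer skips entries whose opcode is 0, as the patch's use of `0` suggests. The opcode constants, the reduced `FixupEntry` struct, and both function names are hypothetical and not taken from the X86 backend.

```cpp
#include <vector>

// Hypothetical opcode ids; 0 marks an entry as unavailable.
constexpr unsigned kNoOpcode = 0;
constexpr unsigned kMOVSSrm = 1;
constexpr unsigned kMOVSDrm = 2;

// Reduced, hypothetical version of a fixup table entry.
struct FixupEntry {
  unsigned Op;          // replacement load opcode, or 0 if disabled
  unsigned NumCstElts;  // number of constant elements loaded
  unsigned MemBitWidth; // width of the narrower load in bits
};

// Builds the candidate table for a 128-bit SSE load; the 64-bit MOVSD
// replacement is only offered when SSE2 is available.
std::vector<FixupEntry> buildLoadFixups(bool HasSSE2) {
  return {{kMOVSSrm, 1, 32}, {HasSSE2 ? kMOVSDrm : kNoOpcode, 1, 64}};
}

// Returns the opcodes a fixup loop would be allowed to try, skipping
// disabled (zero-opcode) entries.
std::vector<unsigned> candidateOpcodes(const std::vector<FixupEntry> &Fixups) {
  std::vector<unsigned> Ops;
  for (const FixupEntry &F : Fixups)
    if (F.Op != kNoOpcode)
      Ops.push_back(F.Op);
  return Ops;
}
```

Keeping the disabled entry in the table, rather than building tables of different lengths, leaves the table shape fixed and puts the feature check next to the opcode it guards.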
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
@@ -335,6 +335,14 @@ class LLVMContext {
   StringRef getDefaultTargetFeatures();
   void setDefaultTargetFeatures(StringRef Features);
 
+  /// Key Instructions: update the highest number atom group emitted for any
+  /// function.
+  void updateAtomGroupWaterline(uint64_t G);
+
+  /// Key Instructions: get the next free atom group number and increment
+  /// the global tracker.
+  uint64_t incNextAtomGroup();
+

jmorse wrote:

IMO in isolation it's not clear that this is to do with debugging information and source locations; could we shoe-horn `DILocation` into the comments to make it clear what it affects? (Thinking purely about someone stumbling on this and not immediately knowing whether it's relevant to what they're studying)

https://github.com/llvm/llvm-project/pull/133478
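One way to read the two declarations above is as a monotonic counter with a high-water mark. The sketch below is an assumption about the semantics, not the patch's implementation: `AtomGroupTracker` and its method names are hypothetical, the starting value of 1 is borrowed from the unit test quoted elsewhere in this thread, and "raise to `G + 1`" is a guess at what updating the waterline means.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical tracker for DILocation atom group numbers.
class AtomGroupTracker {
  // Next unused group number (assumed to start at 1).
  std::uint64_t NextAtomGroup = 1;

public:
  // Record that group G is already in use, e.g. after reading IR that
  // carries literal group numbers. Never lowers the counter.
  void updateWaterline(std::uint64_t G) {
    NextAtomGroup = std::max(NextAtomGroup, G + 1);
  }

  // Hand out the next free group number and advance the counter.
  std::uint64_t incNext() { return NextAtomGroup++; }
};
```

With this reading, numbers handed out after `updateWaterline(7)` start at 8, so freshly created groups cannot collide with groups already seen in the same context.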
[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)
github-actions[bot] wrote:

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo. Please turn off [Keep my email addresses private](https://github.com/settings/emails) setting in your account. See [LLVM Discourse](https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it) for more information.

https://github.com/llvm/llvm-project/pull/135191
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/kaadam updated https://github.com/llvm/llvm-project/pull/129231

From 93c958c3f016092c340e897aeabbb470e58b9dbb Mon Sep 17 00:00:00 2001
From: Adam Kallai
Date: Wed, 19 Feb 2025 17:00:47 +0100
Subject: [PATCH 1/2] Add initial support for SPE brstack

Perf will be able to report SPE branch events in a similar way to LBR brstack. Therefore we can utilize the existing LBR parsing process for SPE as well.

Example of the SPE brstack input format:

perf script -i perf.data -F pid,brstack --itrace=bl

  ---
  PID    FROM TO PREDICTED
  ---
  16984  0x72e342e5f4/0x72e36192d0/M/-/-/11/RET/-
  16984  0x72e7b8b3b4/0x72e7b8b3b8/PN/-/-/11/COND/-
  16984  0x72e7b92b48/0x72e7b92b4c/PN/-/-/8/COND/-
  16984  0x72eacc6b7c/0x760cc94b00/P/-/-/9/RET/-
  16984  0x72e3f210fc/0x72e3f21068/P/-/-/4//-
  16984  0x72e39b8c5c/0x72e3627b24/P/-/-/4//-
  16984  0x72e7b89d20/0x72e7b92bbc/P/-/-/4/RET/-

The SPE brstack mispredicted flag might be two characters long: 'PN' or 'MN', where 'N' means the branch was marked as NOT-TAKEN. This flag applies only to conditional instructions (conditional branch or compare-and-branch) and indicates that the instruction failed its condition code check.

Perf with 'brstack' support for SPE is available here:
```
https://github.com/Leo-Yan/linux/tree/perf_arm_spe_branch_flags_v2
```

Example of usage with SPE perf data:
```bash
perf2bolt -p perf.data -o perf.fdata --spe BINARY
```

Capture standard SPE branch events with perf:
```bash
perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
```

A unit test is also added to check the parsing of the 'SPE brstack format'.
---
 bolt/lib/Profile/DataAggregator.cpp | 60 ++--
 .../test/perf2bolt/AArch64/perf2bolt-spe.test | 2 +-
 bolt/unittests/Profile/PerfSpeEvents.cpp | 71 +++
 3 files changed, 109 insertions(+), 24 deletions(-)

diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index cce9fdbef99bd..4af3a493b8be6 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -49,12 +49,10 @@ static cl::opt
     cl::desc("aggregate basic samples (without LBR info)"),
     cl::cat(AggregatorCategory));
 
-cl::opt ArmSPE(
-    "spe",
-    cl::desc(
-        "Enable Arm SPE mode. Used in conjuction with no-lbr mode, ie `--spe "
-        "--nl`"),
-    cl::cat(AggregatorCategory));
+cl::opt ArmSPE("spe",
+               cl::desc("Enable Arm SPE mode. Can combine with `--nl` "
+                        "to use in no-lbr mode"),
+               cl::cat(AggregatorCategory));
 
 static cl::opt
     ITraceAggregation("itrace",
@@ -180,13 +178,16 @@ void DataAggregator::start() {
 
   if (opts::ArmSPE) {
     if (!opts::BasicAggregation) {
-      errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with "
-                "BasicAggregation.\n";
-      exit(1);
+      // pid    from_ip  to_ip  predicted?
+      // 12345  0x123/0x456/P/-/-/8/RET/-
+      launchPerfProcess("SPE branch events", MainEventsPPI,
+                        "script -F pid,brstack --itrace=bl",
+                        /*Wait = */ false);
+    } else {
+      launchPerfProcess("SPE brstack events", MainEventsPPI,
+                        "script -F pid,event,ip,addr --itrace=i1i",
+                        /*Wait = */ false);
     }
-    launchPerfProcess("branch events with SPE", MainEventsPPI,
-                      "script -F pid,event,ip,addr --itrace=i1i",
-                      /*Wait = */ false);
   } else if (opts::BasicAggregation) {
     launchPerfProcess("events without LBR", MainEventsPPI,
                       "script -F pid,event,ip",
@@ -527,8 +528,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) {
     }
     exit(0);
   }
-
-  if (((!opts::BasicAggregation && !opts::ArmSPE) && parseBranchEvents()) ||
+  if ((!opts::BasicAggregation && parseBranchEvents()) ||
       (opts::BasicAggregation && opts::ArmSPE && parseSpeAsBasicEvents()) ||
       (opts::BasicAggregation && parseBasicEvents()))
     errs() << "PERF2BOLT: failed to parse samples\n";
@@ -1034,7 +1034,11 @@ ErrorOr DataAggregator::parseLBREntry() {
   if (std::error_code EC = MispredStrRes.getError())
     return EC;
   StringRef MispredStr = MispredStrRes.get();
-  if (MispredStr.size() != 1 ||
+  // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'.
+  bool ProperStrSize = (MispredStr.size() == 2 && opts::ArmSPE)
+                           ? (MispredStr[1] == 'N')
+                           : (MispredStr.size() == 1);
+  if (!ProperStrSize ||
       (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-')) {
     reportError("expected single char for mispred bit");
     Diag << "Found: " << MispredStr << "\n";
@@ -1565,9 +1569,11 @@ uint64_t DataAggregator::parseLBRSample(const PerfBranchSample &Sample,
 }
 
 std::error_code DataAggregator::parseBranchEvents() {
-  outs() << "PERF2BOLT
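The relaxed check on the mispredicted flag in `parseLBREntry` can be isolated into a small predicate. This is a sketch that mirrors the condition in the patch; `isValidMispredFlag` is a hypothetical name, and the `ArmSPE` parameter stands in for `opts::ArmSPE`.

```cpp
#include <string>

// Accepts "P", "M" or "-" in any mode, and additionally "PN", "MN" or "-N"
// when SPE mode is enabled; the second character, if present, must be 'N'.
bool isValidMispredFlag(const std::string &Flag, bool ArmSPE) {
  bool ProperSize = (Flag.size() == 2 && ArmSPE) ? (Flag[1] == 'N')
                                                 : (Flag.size() == 1);
  if (!ProperSize)
    return false;
  return Flag[0] == 'P' || Flag[0] == 'M' || Flag[0] == '-';
}
```

Note that a two-character flag is rejected outright when SPE mode is off, so existing LBR inputs are validated exactly as before.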
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -113,6 +153,37 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranches) {
   EXPECT_TRUE(checkEvents(1234, 10, {"branches-spe:"}));
 }
 
+TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstack) {
+  // Check perf input with SPE branch events as brstack format.
+  // Example collection command:
+  // ```
+  // perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY
+  // ```
+  // How Bolt extracts the branch events:
+  // ```
+  // perf script -F pid,brstack --itrace=bl
+  // ```
+
+  opts::ArmSPE = true;
+  opts::ReadPerfEvents = " 1234 0xa001/0xa002/PN/-/-/10/COND/-\n"
+                         " 1234 0xb001/0xb002/P/-/-/4/RET/-\n"
+                         " 1234 0xc001/0xc002/P/-/-/13/-/-\n"
+                         " 1234 0xd001/0xd002/M/-/-/7/RET/-\n"
+                         " 1234 0xe001/0xe002/P/-/-/14/RET/-\n"
+                         " 1234 0xf001/0xf002/MN/-/-/8/COND/-\n";
+
+  LBREntry Entry1 = {0xa001, 0xa002, false};
+  LBREntry Entry2 = {0xb001, 0xb002, false};
+  LBREntry Entry3 = {0xc001, 0xc002, false};
+  LBREntry Entry4 = {0xd001, 0xd002, true};
+  LBREntry Entry5 = {0xe001, 0xe002, false};
+  LBREntry Entry6 = {0xf001, 0xf002, true};
+  std::vector> ExpectedSamples = {
+      {{Entry1}}, {{Entry2}}, {{Entry3}}, {{Entry4}}, {{Entry5}}, {{Entry6}},
+  };

kaadam wrote:

Simplified, thanks for the hint.

https://github.com/llvm/llvm-project/pull/129231
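The mapping the test expects, from a brstack token such as `0xa001/0xa002/PN/-/-/10/COND/-` to a (from, to, mispredicted) triple, can be sketched without BOLT. `BranchEntry` and `parseBrstackEntry` are hypothetical names; the sketch only splits on `/` and reads the first three fields, and it is not how `DataAggregator` tokenizes its input.

```cpp
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical counterpart of the LBREntry triple used in the test.
struct BranchEntry {
  std::uint64_t From;
  std::uint64_t To;
  bool Mispred;
};

// Parses a hexadecimal address (with or without "0x"); fails on any
// trailing garbage or an empty field.
static bool parseHex(const std::string &S, std::uint64_t &Out) {
  if (S.empty())
    return false;
  char *End = nullptr;
  Out = std::strtoull(S.c_str(), &End, 16);
  return End != S.c_str() && *End == '\0';
}

// Splits "FROM/TO/FLAG/..." on '/' and fills Out. A flag starting with 'M'
// counts as mispredicted; 'P', 'PN' and '-' do not. Fields after the flag
// are ignored in this sketch.
bool parseBrstackEntry(const std::string &Text, BranchEntry &Out) {
  std::vector<std::string> Fields;
  std::stringstream SS(Text);
  std::string Field;
  while (std::getline(SS, Field, '/'))
    Fields.push_back(Field);
  if (Fields.size() < 3 || Fields[2].empty())
    return false;
  if (!parseHex(Fields[0], Out.From) || !parseHex(Fields[1], Out.To))
    return false;
  Out.Mispred = Fields[2][0] == 'M';
  return true;
}
```

Applied to the six sample lines in the test, this reading gives `false` for the `PN` and `P` entries and `true` for the `M` and `MN` entries, matching `Entry1` through `Entry6`.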
[llvm-branch-commits] [llvm] 6c36439 - Revert "Remember LLVM_ENABLE_LIBCXX setting in installed configuration (#134990)"
Author: Michael Kruse
Date: 2025-04-10T11:54:53+02:00
New Revision: 6c3643905210c717831a18d4af1ae921a1ad9f74

URL: https://github.com/llvm/llvm-project/commit/6c3643905210c717831a18d4af1ae921a1ad9f74
DIFF: https://github.com/llvm/llvm-project/commit/6c3643905210c717831a18d4af1ae921a1ad9f74.diff

LOG: Revert "Remember LLVM_ENABLE_LIBCXX setting in installed configuration (#134990)"

This reverts commit 785e7f06ddb1ba36aa679d23436726dcf61f8afb.

Added:

Modified:
    llvm/cmake/modules/LLVMConfig.cmake.in

Removed:

diff --git a/llvm/cmake/modules/LLVMConfig.cmake.in b/llvm/cmake/modules/LLVMConfig.cmake.in
index 1c34073f6b910..5ccc66b8039bf 100644
--- a/llvm/cmake/modules/LLVMConfig.cmake.in
+++ b/llvm/cmake/modules/LLVMConfig.cmake.in
@@ -55,8 +55,6 @@ endif()
 
 set(LLVM_ENABLE_RTTI @LLVM_ENABLE_RTTI@)
 
-set(LLVM_ENABLE_LIBCXX @LLVM_ENABLE_LIBCXX@)
-
 set(LLVM_ENABLE_LIBEDIT @HAVE_LIBEDIT@)
 if(LLVM_ENABLE_LIBEDIT)
   find_package(LibEdit)
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job
 RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe
 RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR
-CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation.
+CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events

kaadam wrote:

Thanks for clarifying this. Updated this test.

https://github.com/llvm/llvm-project/pull/129231
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/kaadam updated https://github.com/llvm/llvm-project/pull/129231 From 93c958c3f016092c340e897aeabbb470e58b9dbb Mon Sep 17 00:00:00 2001 From: Adam Kallai Date: Wed, 19 Feb 2025 17:00:47 +0100 Subject: [PATCH 1/3] Add initial support for SPE brstack Perf will be able to report SPE branch events as similar as it does with LBR brstack. Therefore we can utilize the existing LBR parsing process for SPE as well. Example of the SPE brstack input format: perf script -i perf.data -F pid,brstack --itrace=bl --- PIDFROM TO PREDICTED --- 16984 0x72e342e5f4/0x72e36192d0/M/-/-/11/RET/- 16984 0x72e7b8b3b4/0x72e7b8b3b8/PN/-/-/11/COND/- 16984 0x72e7b92b48/0x72e7b92b4c/PN/-/-/8/COND/- 16984 0x72eacc6b7c/0x760cc94b00/P/-/-/9/RET/- 16984 0x72e3f210fc/0x72e3f21068/P/-/-/4//- 16984 0x72e39b8c5c/0x72e3627b24/P/-/-/4//- 16984 0x72e7b89d20/0x72e7b92bbc/P/-/-/4/RET/- SPE brstack mispredicted flag might be two characters long: 'PN' or 'MN'. Where 'N' means the branch was marked as NOT-TAKEN. This event is only related to conditional instruction (conditional branch or compare-and-branch), it tells that failed its condition code check. Perf with 'brstack' support for SPE is available here: ``` https://github.com/Leo-Yan/linux/tree/perf_arm_spe_branch_flags_v2 ``` Example of useage with SPE perf data: ```bash perf2bolt -p perf.data -o perf.fdata --spe BINARY ``` Capture standard SPE branch events with perf: ```bash perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY ``` An unittest is also added to check parsing process of 'SPE brstack format'. 
--- bolt/lib/Profile/DataAggregator.cpp | 60 ++-- .../test/perf2bolt/AArch64/perf2bolt-spe.test | 2 +- bolt/unittests/Profile/PerfSpeEvents.cpp | 71 +++ 3 files changed, 109 insertions(+), 24 deletions(-) diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp index cce9fdbef99bd..4af3a493b8be6 100644 --- a/bolt/lib/Profile/DataAggregator.cpp +++ b/bolt/lib/Profile/DataAggregator.cpp @@ -49,12 +49,10 @@ static cl::opt cl::desc("aggregate basic samples (without LBR info)"), cl::cat(AggregatorCategory)); -cl::opt ArmSPE( -"spe", -cl::desc( -"Enable Arm SPE mode. Used in conjuction with no-lbr mode, ie `--spe " -"--nl`"), -cl::cat(AggregatorCategory)); +cl::opt ArmSPE("spe", + cl::desc("Enable Arm SPE mode. Can combine with `--nl` " + "to use in no-lbr mode"), + cl::cat(AggregatorCategory)); static cl::opt ITraceAggregation("itrace", @@ -180,13 +178,16 @@ void DataAggregator::start() { if (opts::ArmSPE) { if (!opts::BasicAggregation) { - errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with " -"BasicAggregation.\n"; - exit(1); + // pidfrom_ip to_ippredicted? 
+ // 12345 0x123/0x456/P/-/-/8/RET/- + launchPerfProcess("SPE branch events", MainEventsPPI, +"script -F pid,brstack --itrace=bl", +/*Wait = */ false); +} else { + launchPerfProcess("SPE brstack events", MainEventsPPI, +"script -F pid,event,ip,addr --itrace=i1i", +/*Wait = */ false); } -launchPerfProcess("branch events with SPE", MainEventsPPI, - "script -F pid,event,ip,addr --itrace=i1i", - /*Wait = */ false); } else if (opts::BasicAggregation) { launchPerfProcess("events without LBR", MainEventsPPI, "script -F pid,event,ip", @@ -527,8 +528,7 @@ Error DataAggregator::preprocessProfile(BinaryContext &BC) { } exit(0); } - - if (((!opts::BasicAggregation && !opts::ArmSPE) && parseBranchEvents()) || + if ((!opts::BasicAggregation && parseBranchEvents()) || (opts::BasicAggregation && opts::ArmSPE && parseSpeAsBasicEvents()) || (opts::BasicAggregation && parseBasicEvents())) errs() << "PERF2BOLT: failed to parse samples\n"; @@ -1034,7 +1034,11 @@ ErrorOr DataAggregator::parseLBREntry() { if (std::error_code EC = MispredStrRes.getError()) return EC; StringRef MispredStr = MispredStrRes.get(); - if (MispredStr.size() != 1 || + // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'. + bool ProperStrSize = (MispredStr.size() == 2 && opts::ArmSPE) + ? (MispredStr[1] == 'N') + : (MispredStr.size() == 1); + if (!ProperStrSize || (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-')) { reportError("expected single char for mispred bit"); Diag << "Found: " << MispredStr << "\n"; @@ -1565,9 +1569,11 @@ uint64_t DataAggregator::parseLBRSample(const PerfBranchSample &Sample, } std::error_code DataAggregator::parseBranchEvents() { - outs() << "PERF2BOLT
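The mispredict-field check added to parseLBREntry in the patch above can be tried in isolation. What follows is a minimal standalone sketch, assuming std::string in place of StringRef; the helper name isValidMispredField and the SpeMode parameter (standing in for opts::ArmSPE) are hypothetical and not part of BOLT.

```cpp
#include <string>

// Hypothetical helper mirroring the check in the patch: returns true when
// the mispredict field of a brstack entry is well formed.
// SpeMode stands in for opts::ArmSPE.
bool isValidMispredField(const std::string &Field, bool SpeMode) {
  // In SPE mode a two-character field is accepted if its second character
  // is 'N' (not taken); otherwise the field must be exactly one character.
  bool ProperSize = (Field.size() == 2 && SpeMode) ? (Field[1] == 'N')
                                                   : (Field.size() == 1);
  if (!ProperSize)
    return false;
  // The first character is the prediction bit: 'P', 'M', or '-'.
  return Field[0] == 'P' || Field[0] == 'M' || Field[0] == '-';
}
```

Under this rule 'PN' and 'MN' are accepted only in SPE mode, while single-character fields are accepted in both modes.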
[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)
https://github.com/vpykhtin ready_for_review https://github.com/llvm/llvm-project/pull/135181
[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/131862 >From 4b88628633b065f3d8cc24d4f3bd4e3274fcc75a Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Mon, 17 Mar 2025 06:51:16 -0400 Subject: [PATCH] [SDAG] Introduce inbounds flag for pointer arithmetic This patch introduces an inbounds SDNodeFlag, to show that a pointer addition SDNode implements an inbounds getelementptr operation (i.e., the pointer operand is in bounds wrt. the allocated object it is based on, and the arithmetic does not change that). The flag is set in the DAG construction when lowering inbounds GEPs. Inbounds information is useful in the ISel when selecting memory instructions that perform address computations whose intermediate steps must be in the same memory region as the final result. A follow-up patch will start using it for AMDGPU's flat memory instructions, where the immediate offset must not affect the memory aperture of the address. A similar patch for gMIR and GlobalISel will follow. For SWDEV-516125. --- llvm/include/llvm/CodeGen/SelectionDAGNodes.h| 9 +++-- llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp| 3 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp | 3 +++ .../CodeGen/X86/merge-store-partially-alias-loads.ll | 2 +- 4 files changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h index 2283f99202e2f..13ac65f5d731c 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h +++ b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h @@ -415,12 +415,15 @@ struct SDNodeFlags { Unpredictable = 1 << 13, // Compare instructions which may carry the samesign flag. SameSign = 1 << 14, +// Pointer arithmetic instructions that remain in bounds, e.g., implementing +// an inbounds GEP. +InBounds = 1 << 15, // NOTE: Please update LargestValue in LLVM_DECLARE_ENUM_AS_BITMASK below // the class definition when adding new flags. 
PoisonGeneratingFlags = NoUnsignedWrap | NoSignedWrap | Exact | Disjoint | -NonNeg | NoNaNs | NoInfs | SameSign, +NonNeg | NoNaNs | NoInfs | SameSign | InBounds, FastMathFlags = NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal | AllowContract | ApproximateFuncs | AllowReassociation, }; @@ -455,6 +458,7 @@ struct SDNodeFlags { void setAllowReassociation(bool b) { setFlag(b); } void setNoFPExcept(bool b) { setFlag(b); } void setUnpredictable(bool b) { setFlag(b); } + void setInBounds(bool b) { setFlag(b); } // These are accessors for each flag. bool hasNoUnsignedWrap() const { return Flags & NoUnsignedWrap; } @@ -472,6 +476,7 @@ struct SDNodeFlags { bool hasAllowReassociation() const { return Flags & AllowReassociation; } bool hasNoFPExcept() const { return Flags & NoFPExcept; } bool hasUnpredictable() const { return Flags & Unpredictable; } + bool hasInBounds() const { return Flags & InBounds; } bool operator==(const SDNodeFlags &Other) const { return Flags == Other.Flags; @@ -481,7 +486,7 @@ struct SDNodeFlags { }; LLVM_DECLARE_ENUM_AS_BITMASK(decltype(SDNodeFlags::None), - SDNodeFlags::SameSign); + SDNodeFlags::InBounds); inline SDNodeFlags operator|(SDNodeFlags LHS, SDNodeFlags RHS) { LHS |= RHS; diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 89793c30f3710..32973be608937 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4283,6 +4283,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { if (NW.hasNoUnsignedWrap() || (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap())) Flags |= SDNodeFlags::NoUnsignedWrap; +Flags.setInBounds(NW.isInBounds()); N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, DAG.getConstant(Offset, dl, N.getValueType()), Flags); @@ -4326,6 +4327,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { if (NW.hasNoUnsignedWrap() || (Offs.isNonNegative() && 
NW.hasNoUnsignedSignedWrap())) Flags.setNoUnsignedWrap(true); +Flags.setInBounds(NW.isInBounds()); OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType()); @@ -4388,6 +4390,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // pointer index type (add nuw). SDNodeFlags AddFlags; AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap()); + AddFlags.setInBounds(NW.isInBounds()); N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp ind
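How the new bit fits into the flag set can be seen in a reduced model. The sketch below is a toy stand-in for SDNodeFlags, not the LLVM class: the SameSign and InBounds values follow the patch, while the NoUnsignedWrap value, the Bits member, and the two poison-flag helpers are illustrative assumptions.

```cpp
#include <cstdint>

// Toy stand-in for SDNodeFlags with only three of the flags.
struct NodeFlags {
  enum : uint32_t {
    None = 0,
    NoUnsignedWrap = 1 << 0, // assumed value, for illustration only
    SameSign = 1 << 14,      // as in the patch
    InBounds = 1 << 15,      // new flag from the patch
    // InBounds joins the mask of flags whose violation yields poison.
    PoisonGeneratingFlags = NoUnsignedWrap | SameSign | InBounds,
  };

  uint32_t Bits = None;

  void setInBounds(bool B) {
    Bits = B ? (Bits | InBounds) : (Bits & ~uint32_t(InBounds));
  }
  bool hasInBounds() const { return Bits & InBounds; }

  // Hypothetical helpers showing what membership in the mask implies.
  bool hasPoisonGeneratingFlags() const { return Bits & PoisonGeneratingFlags; }
  void dropPoisonGeneratingFlags() { Bits &= ~uint32_t(PoisonGeneratingFlags); }
};
```

Because InBounds is part of PoisonGeneratingFlags in this model, any code that drops poison-generating flags also clears it, which matches the intent of adding it to that mask in the patch.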
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/132353 >From b3a2dc9d2642a79cc3251db2623464075f206e12 Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Fri, 21 Mar 2025 03:33:02 -0400 Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds For flat memory instructions where the address is supplied as a base address register with an immediate offset, the memory aperture test ignores the immediate offset. Currently, ISel does not respect that, which leads to miscompilations where valid input programs crash when the address computation relies on the immediate offset to get the base address in the proper memory aperture. Global or scratch instructions are not affected. This patch only selects flat instructions with immediate offsets from address computations with the inbounds flag: If the address computation does not leave the bounds of the allocated object, it cannot leave the bounds of the memory aperture and is therefore safe to handle with an immediate offset. It also adds the inbounds flag to DAG nodes resulting from transformations: - Address computations resulting from getObjectPtrOffset. As far as I can tell, this function is only used to compute addresses within accessed memory ranges, e.g., for loads and stores that are split during legalization. - Reassociated inbounds adds. If both involved operations are inbounds, then so are operations after the transformation. - Address computations in the SelectionDAG lowering of the memcpy/move/set intrinsics. Base and result of the address arithmetic there are accessed, so the operation must be inbounds. It might make sense to separate these changes into their own PR, but I don't see a way to test them without adding a use of the inbounds SDAG flag. Affected tests: - CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded, added new positive tests where we still do fold them. 
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't seem integral to this test, so the test is not changed to make offset folding still happen. - CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base addresses on the potentially OOB addresses used for prefetching for memory accesses, that might be a separate issue to look into. - Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that offsets in the memset DAG lowering are still folded properly. A similar patch for GlobalISel will follow. Fixes SWDEV-516125. --- llvm/include/llvm/CodeGen/SelectionDAG.h | 12 +- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 9 +- .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp | 12 +- llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 --- llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll | 374 +- .../test/CodeGen/AMDGPU/loop-prefetch-data.ll | 17 +- .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++ .../InferAddressSpaces/AMDGPU/flat_atomic.ll | 6 +- 8 files changed, 717 insertions(+), 94 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h index 15a2370e5d8b8..aa3668d3e9aae 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAG.h +++ b/llvm/include/llvm/CodeGen/SelectionDAG.h @@ -1069,7 +1069,8 @@ class SelectionDAG { SDValue EVL); /// Returns sum of the base pointer and offset. - /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default. + /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds by + /// default. SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL, const SDNodeFlags Flags = SDNodeFlags()); SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL, @@ -1077,15 +1078,18 @@ class SelectionDAG { /// Create an add instruction with appropriate flags when used for /// addressing some offset of an object. i.e. 
if a load is split into multiple - /// components, create an add nuw from the base pointer to the offset. + /// components, create an add nuw inbounds from the base pointer to the + /// offset. SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) { -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) { // The object itself can't wrap around the address space, so it shouldn't be // possible for the adds of the offsets to the split parts to overflow. -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } /// Return a new CALLSEQ_START node, that starts new call fram
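The miscompilation described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: the Aperture struct, the address values, and all function names are hypothetical and do not reflect the actual AMDGPU aperture logic or ISel code.

```cpp
#include <cstdint>

// Toy address range standing in for a memory aperture: [Begin, End).
struct Aperture {
  uint64_t Begin;
  uint64_t End;
};

bool contains(const Aperture &A, uint64_t Addr) {
  return Addr >= A.Begin && Addr < A.End;
}

// Models the behavior described in the commit message: the aperture test
// looks only at the base register and ignores the immediate offset.
bool hardwareSeesAperture(const Aperture &A, uint64_t Base, uint64_t Imm) {
  (void)Imm; // deliberately ignored
  return contains(A, Base);
}

// What the source program means: the full address Base + Imm is accessed.
bool programTargetsAperture(const Aperture &A, uint64_t Base, uint64_t Imm) {
  return contains(A, Base + Imm);
}

// Folding Imm into the instruction is harmless only when both views agree.
bool foldPreservesAperture(const Aperture &A, uint64_t Base, uint64_t Imm) {
  return hardwareSeesAperture(A, Base, Imm) ==
         programTargetsAperture(A, Base, Imm);
}
```

In this model, a base just below the aperture combined with an offset that moves the address into it makes the two views disagree. An inbounds address computation rules that case out, since base and result lie within the same allocated object.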
[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)
https://github.com/vpykhtin edited https://github.com/llvm/llvm-project/pull/135181
[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)
https://github.com/MaskRay approved this pull request. https://github.com/llvm/llvm-project/pull/135123
[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
@@ -1357,6 +1357,8 @@ void TextNodeDumper::VisitTemplateExpansionTemplateArgument( void TextNodeDumper::VisitExpressionTemplateArgument( const TemplateArgument &TA) { OS << " expr"; + if (TA.isCanonicalExpr()) +OS << " canon"; erichkeane wrote: Same hope here on the full word. https://github.com/llvm/llvm-project/pull/135133
[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDeclRefExpr(DeclRefExpr *Node) { Qualifier->print(OS, Policy); if (Node->hasTemplateKeyword()) OS << "template "; + + bool ForceAnonymous = + Policy.PrintAsCanonical && VD->getKind() == Decl::NonTypeTemplateParm; erichkeane wrote: Can you explain what is going on here? This is a little subtle. https://github.com/llvm/llvm-project/pull/135133
[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
@@ -1724,6 +1724,8 @@ void JSONNodeDumper::VisitTemplateExpansionTemplateArgument( void JSONNodeDumper::VisitExpressionTemplateArgument( const TemplateArgument &TA) { JOS.attribute("isExpr", true); + if (TA.isCanonicalExpr()) +JOS.attribute("isCanon", true); erichkeane wrote: Any reason to not just do `isCanonical` instead? `Canon` is already a word and sounds nonsensical. https://github.com/llvm/llvm-project/pull/135133
[llvm-branch-commits] [llvm] [KeyInstr] Add Atom Group waterline to LLVMContext (PR #133478)
OCHyams wrote: > Possibly part of the design here is to simply not care, if it's only about > internal consistency within a Function (does that hold after inlining too). > Apologies if this is all explained in a later patch. It is indeed the goal not to care; an instruction is only considered to be from the same source atom as another instruction (implied: in the same function) if they've got the same `atomGroup` and `inlinedAt` fields. (we only ever examine the groups within the context of a function, i.e., we don't ever try to ask "are these instructions in different functions from the same source atom"). > The answers to that should ultimately be documented somewhere; I imagine > that's in the patch stack or coming later. Your imagination gives me too much credit. It's documented in some comments scattered through the stack, but there's not yet a documentation patch. https://github.com/llvm/llvm-project/pull/133478
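The identity rule stated in the reply above (same `atomGroup` and same `inlinedAt`, within one function) can be written down as a small sketch. The InstrLoc struct and sameSourceAtom function are hypothetical stand-ins for illustration, not LLVM's debug-location types.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two fields of an instruction's location
// that matter for source-atom identity.
struct InstrLoc {
  uint64_t AtomGroup;    // stand-in for the atomGroup field
  const void *InlinedAt; // stand-in for inlinedAt; null when not inlined
};

// Two instructions in the same function belong to the same source atom
// only if both fields match. Cross-function comparisons are never asked.
bool sameSourceAtom(const InstrLoc &A, const InstrLoc &B) {
  return A.AtomGroup == B.AtomGroup && A.InlinedAt == B.InlinedAt;
}
```

Under this rule, two inlined copies of the same callee keep the same group number but differ in their inlinedAt value, so they are treated as distinct atoms.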
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
github-actions[bot] wrote: :warning: C/C++ code formatter, clang-format found issues in your code. :warning: You can test this locally with the following command: ``bash git-clang-format --diff 47a986de762147c4f27a20ff9b1d75f9f5a50bdc aec7a556fed56c72184963d21d6893e586d6a7e2 --extensions cpp -- bolt/lib/Profile/DataAggregator.cpp bolt/unittests/Profile/PerfSpeEvents.cpp `` View the diff from clang-format here. ``diff diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp index 4273eda865..bcb3b2c8ef 100644 --- a/bolt/lib/Profile/DataAggregator.cpp +++ b/bolt/lib/Profile/DataAggregator.cpp @@ -1035,19 +1035,20 @@ ErrorOr DataAggregator::parseLBREntry() { return EC; StringRef MispredStr = MispredStrRes.get(); // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'. - bool ValidStrSize = opts::ArmSPE ? -MispredStr.size() >= 1 && MispredStr.size() <= 2 : MispredStr.size() == 1; + bool ValidStrSize = opts::ArmSPE + ? MispredStr.size() >= 1 && MispredStr.size() <= 2 + : MispredStr.size() == 1; bool SpeTakenBitErr = - (opts::ArmSPE && MispredStr.size() == 2 && MispredStr[1] != 'N'); + (opts::ArmSPE && MispredStr.size() == 2 && MispredStr[1] != 'N'); bool PredictionBitErr = - !ValidStrSize || - (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-'); + !ValidStrSize || + (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-'); if (SpeTakenBitErr) reportError("expected 'N' as SPE prediction bit for a not-taken branch"); if (PredictionBitErr) reportError("expected 'P', 'M' or '-' char as a prediction bit"); - if (SpeTakenBitErr || PredictionBitErr) { + if (SpeTakenBitErr || PredictionBitErr) { Diag << "Found: " << MispredStr << "\n"; return make_error_code(llvm::errc::io_error); } `` https://github.com/llvm/llvm-project/pull/129231
[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDeclRefExpr(DeclRefExpr *Node) { Qualifier->print(OS, Policy); if (Node->hasTemplateKeyword()) OS << "template "; + + bool ForceAnonymous = + Policy.PrintAsCanonical && VD->getKind() == Decl::NonTypeTemplateParm; mizvekov wrote: Yeah, canonicalization of expressions should erase the identity of any NTTPs referenced therein, which should make them print as 'value-parameter-X-X', as if the NTTP was anonymous, and similarly to how it happens with regards to types and type template parameters. https://github.com/llvm/llvm-project/pull/135133
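The effect of ForceAnonymous described in the reply above can be sketched as a small printing helper. This is a hedged illustration only: printNTTPRef is a hypothetical function, not clang's StmtPrinter, and it assumes the two X placeholders in 'value-parameter-X-X' are the parameter's depth and index, by analogy with how anonymous type template parameters are printed.

```cpp
#include <string>

// Hypothetical helper: Name is the parameter's written name (empty if the
// parameter is anonymous); Depth and Index are assumed to identify its
// position in the template parameter lists.
std::string printNTTPRef(const std::string &Name, unsigned Depth,
                         unsigned Index, bool PrintAsCanonical) {
  // Canonical printing erases the parameter's identity, so the written
  // name is ignored and the positional form is used instead.
  bool ForceAnonymous = PrintAsCanonical;
  if (ForceAnonymous || Name.empty())
    return "value-parameter-" + std::to_string(Depth) + "-" +
           std::to_string(Index);
  return Name;
}
```

With canonical printing enabled, two equivalent expressions that refer to differently named parameters at the same position print identically, which is the property the patch is after.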
[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)
https://github.com/kbeyls edited https://github.com/llvm/llvm-project/pull/133461
[llvm-branch-commits] [clang] [clang-tools-extra] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
https://github.com/mizvekov updated https://github.com/llvm/llvm-project/pull/135133

>From e8ab5ff779bc00ff6a239f0acea8182c69cb7bcc Mon Sep 17 00:00:00 2001
From: Matheus Izvekov
Date: Thu, 10 Apr 2025 02:52:36 -0300
Subject: [PATCH] [clang] implement printing of canonical template arguments of
 expression kind

This patch extends the canonicalization printing policy to cover expressions
and template names, and wires that up to the template argument printer,
covering expressions.

This is helpful for debugging, or if these template arguments somehow end up
in diagnostics, as without this patch they can print as completely unrelated
expressions, which can be quite confusing.

This is because expressions are not uniqued, unlike types, and when a template
specialization containing an expression is the first to be canonicalized, the
expression ends up appearing in the canonical type of subsequent equivalent
specializations.

Fixes https://github.com/llvm/llvm-project/issues/92292
---
 .../StaticAccessedThroughInstanceCheck.cpp    |    2 +-
 .../clang-tidy/utils/Matchers.cpp             |    2 +-
 clang/include/clang/AST/PrettyPrinter.h       |    6 +-
 clang/lib/AST/DeclPrinter.cpp                 |    4 +-
 clang/lib/AST/JSONNodeDumper.cpp              |    2 +
 clang/lib/AST/StmtPrinter.cpp                 |    6 +-
 clang/lib/AST/TemplateBase.cpp                |    7 +-
 clang/lib/AST/TemplateName.cpp                |   10 +-
 clang/lib/AST/TextNodeDumper.cpp              |    2 +
 clang/lib/AST/TypePrinter.cpp                 |    9 +-
 clang/lib/CodeGen/CGDebugInfo.cpp             |    2 +-
 clang/lib/Sema/SemaTemplate.cpp               |    2 +-
 clang/test/AST/ast-dump-templates.cpp         | 1022 +
 clang/unittests/AST/TypePrinterTest.cpp       |    2 +-
 14 files changed, 1058 insertions(+), 20 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp b/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
index 08adc7134cfea..fffb136e5a332 100644
--- a/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
+++ b/clang-tools-extra/clang-tidy/readability/StaticAccessedThroughInstanceCheck.cpp
@@ -69,7 +69,7 @@ void StaticAccessedThroughInstanceCheck::check(
   PrintingPolicyWithSuppressedTag.SuppressTagKeyword = true;
   PrintingPolicyWithSuppressedTag.SuppressUnwrittenScope = true;

-  PrintingPolicyWithSuppressedTag.PrintCanonicalTypes =
+  PrintingPolicyWithSuppressedTag.PrintAsCanonical =
       !BaseExpr->getType()->isTypedefNameType();

   std::string BaseTypeName =
diff --git a/clang-tools-extra/clang-tidy/utils/Matchers.cpp b/clang-tools-extra/clang-tidy/utils/Matchers.cpp
index 7e89cae1c3316..0721667fd0c41 100644
--- a/clang-tools-extra/clang-tidy/utils/Matchers.cpp
+++ b/clang-tools-extra/clang-tidy/utils/Matchers.cpp
@@ -32,7 +32,7 @@ bool MatchesAnyListedTypeNameMatcher::matches(

   PrintingPolicy PrintingPolicyWithSuppressedTag(
       Finder->getASTContext().getLangOpts());
-  PrintingPolicyWithSuppressedTag.PrintCanonicalTypes = true;
+  PrintingPolicyWithSuppressedTag.PrintAsCanonical = true;
   PrintingPolicyWithSuppressedTag.SuppressElaboration = true;
   PrintingPolicyWithSuppressedTag.SuppressScope = false;
   PrintingPolicyWithSuppressedTag.SuppressTagKeyword = true;
diff --git a/clang/include/clang/AST/PrettyPrinter.h b/clang/include/clang/AST/PrettyPrinter.h
index 91818776b770c..5a98ae1987b16 100644
--- a/clang/include/clang/AST/PrettyPrinter.h
+++ b/clang/include/clang/AST/PrettyPrinter.h
@@ -76,7 +76,7 @@ struct PrintingPolicy {
         MSWChar(LO.MicrosoftExt && !LO.WChar), IncludeNewlines(true),
         MSVCFormatting(false), ConstantsAsWritten(false),
         SuppressImplicitBase(false), FullyQualifiedName(false),
-        PrintCanonicalTypes(false), PrintInjectedClassNameWithArguments(true),
+        PrintAsCanonical(false), PrintInjectedClassNameWithArguments(true),
         UsePreferredNames(true), AlwaysIncludeTypeForTemplateArgument(false),
         CleanUglifiedParameters(false), EntireContentsOfLargeArray(true),
         UseEnumerators(true), UseHLSLTypes(LO.HLSL) {}
@@ -310,9 +310,9 @@ struct PrintingPolicy {
   LLVM_PREFERRED_TYPE(bool)
   unsigned FullyQualifiedName : 1;

-  /// Whether to print types as written or canonically.
+  /// Whether to print entities as written or canonically.
   LLVM_PREFERRED_TYPE(bool)
-  unsigned PrintCanonicalTypes : 1;
+  unsigned PrintAsCanonical : 1;

   /// Whether to print an InjectedClassNameType with template arguments or as
   /// written. When a template argument is unnamed, printing it results in
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 28098b242d494..22da5bf251ecd 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -735,7 +735,7 @@ void DeclPrinter::VisitFunctionDecl(FunctionDecl *D) {
       llvm::raw_string_ostream POut
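The commit message above says that expressions, unlike types, are not uniqued, so one spelling of an expression can end up standing in for every equivalent specialization. A minimal sketch of the language-level situation this arises from, using hypothetical names (`Box`, `Holder`, `sameSpecialization`) that do not appear in the patch: two separately written but equivalent dependent expressions must name the same specialization. It does not exercise Clang's printer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical names for illustration only; none of this is from the patch.
template <std::size_t N> struct Box {};

template <class T> struct Holder {
  // Two separately written, equivalent dependent expressions. Each spelling
  // is its own expression in the source, yet both aliases must denote the
  // same specialization of Box.
  using First = Box<sizeof(T) + 1>;
  using Second = Box<sizeof(T) + 1>;
};

// True when both spellings name the same type after instantiation.
template <class T> constexpr bool sameSpecialization() {
  return std::is_same<typename Holder<T>::First,
                      typename Holder<T>::Second>::value;
}
```

Because both aliases share one canonical type, a printer that shows the canonical form has to pick some expression to display, which is the situation the patch description is concerned with.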
[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)
https://github.com/vpykhtin created https://github.com/llvm/llvm-project/pull/135180

None

>From 367db01dcf1d8f6305e86e624306f4aefc0b1f95 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin
Date: Thu, 10 Apr 2025 11:56:57 +
Subject: [PATCH] ssaupdaterbulk_add_phi_optimization

---
 .../llvm/Transforms/Utils/SSAUpdaterBulk.h    |  5 +-
 llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp  | 38 ++-
 .../Transforms/Utils/SSAUpdaterBulkTest.cpp   | 67 +++
 3 files changed, 108 insertions(+), 2 deletions(-)

diff --git a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
index b2cf29608f58b..2fb241b0d8e26 100644
--- a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
+++ b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
@@ -13,7 +13,6 @@
 #ifndef LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 #define LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H

-#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/PredIteratorCache.h"

@@ -77,6 +76,10 @@ class SSAUpdaterBulk {
   /// vector.
   void RewriteAllUses(DominatorTree *DT,
                       SmallVectorImpl *InsertedPHIs = nullptr);
+
+  /// Rewrite all uses and simplify the inserted PHI nodes.
+  /// Use this method to preserve behavior when replacing SSAUpdater.
+  void RewriteAndOptimizeAllUses(DominatorTree *DT);
 };

 } // end namespace llvm
diff --git a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
index d7bf791a23edf..437fd0c1dca91 100644
--- a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
+++ b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
@@ -11,13 +11,14 @@
 //===--===//

 #include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
+#include "llvm/Analysis/InstructionSimplify.h"
 #include "llvm/Analysis/IteratedDominanceFrontier.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/Instructions.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Transforms/Utils/Local.h"

 using namespace llvm;

@@ -222,3 +223,38 @@ void SSAUpdaterBulk::RewriteAllUses(DominatorTree *DT,
     }
   }
 }
+
+// Perform a single pass of simplification over the worklist of PHIs.
+static void SimplifyPass(MutableArrayRef Worklist) {
+  if (Worklist.empty())
+    return;
+
+  const DataLayout &DL = Worklist.front()->getParent()->getDataLayout();
+  for (PHINode *&PHI : Worklist) {
+    if (Value *Simplified = simplifyInstruction(PHI, DL)) {
+      PHI->replaceAllUsesWith(Simplified);
+      PHI->eraseFromParent();
+      PHI = nullptr; // Mark as removed.
+    }
+  }
+}
+
+static void DeduplicatePass(ArrayRef Worklist) {
+  SmallDenseMap BBs;
+  for (PHINode *PHI : Worklist) {
+    if (PHI)
+      ++BBs[PHI->getParent()];
+  }
+
+  for (auto [BB, NumNewPHIs] : BBs) {
+    auto FirstExistedPN = std::next(BB->phis().begin(), NumNewPHIs);
+    EliminateNewDuplicatePHINodes(BB, FirstExistedPN);
+  }
+}
+
+void SSAUpdaterBulk::RewriteAndOptimizeAllUses(DominatorTree *DT) {
+  SmallVector PHIs;
+  RewriteAllUses(DT, &PHIs);
+  SimplifyPass(PHIs);
+  DeduplicatePass(PHIs);
+}
\ No newline at end of file
diff --git a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
index 841f44cf6bfed..6f2e63dcd9f90 100644
--- a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
+++ b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
@@ -308,3 +308,70 @@ TEST(SSAUpdaterBulk, TwoBBLoop) {
   EXPECT_EQ(Phi->getIncomingValueForBlock(Entry), ConstantInt::get(I32Ty, 0));
   EXPECT_EQ(Phi->getIncomingValueForBlock(Loop), I);
 }
+
+TEST(SSAUpdaterBulk, SimplifyPHIs) {
+  const char *IR = R"(
+      define void @main(i32 %val, i1 %cond) {
+      entry:
+          br i1 %cond, label %left, label %right
+      left:
+          %add = add i32 %val, 1
+          br label %exit
+      right:
+          %sub = sub i32 %val, 1
+          br label %exit
+      exit:
+          %phi = phi i32 [ %sub, %right ], [ %add, %left ]
+          %cmp = icmp slt i32 0, 42
+          ret void
+      }
+  )";
+
+  llvm::LLVMContext Context;
+  llvm::SMDiagnostic Err;
+  std::unique_ptr M = llvm::parseAssemblyString(IR, Err, Context);
+  ASSERT_NE(M, nullptr) << "Failed to parse IR: " << Err.getMessage();
+
+  Function *F = M->getFunction("main");
+  auto *Entry = &F->getEntryBlock();
+  auto *Left = Entry->getTerminator()->getSuccessor(0);
+  auto *Right = Entry->getTerminator()->getSuccessor(1);
+  auto *Exit = Left->getSingleSuccessor();
+  auto *Val = &*F->arg_begin();
+  auto *Phi = &Exit->front();
+  auto *Cmp = &*std::next(Exit->begin());
+  auto *Add = &Left->front();
+  auto *Sub = &Right->front();
+
+  SSAUpdaterBulk Updater;
+  Type *I32Ty = Type::getInt32Ty(Context);
+
+  // Use %val directly instead of creating a phi.
+  unsigned ValVar = Updater.AddVariabl
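The `SimplifyPass` in the patch walks a worklist once and nulls out every PHI that simplifies to an existing value. A minimal standalone sketch of that shape, assuming a toy model in which a PHI is redundant when all of its incoming values (ignoring references to itself) are the same; `ToyPhi`, `simplifyToyPhi` and `simplifyPass` are hypothetical stand-ins and not LLVM APIs:

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins, not LLVM types: a value is an integer id, and a PHI is its
// own id plus the ids of its incoming values.
struct ToyPhi {
  int Id;
  std::vector<int> Incoming;
};

// Sets Out to the single value the PHI is equivalent to and returns true if
// every incoming value (ignoring self-references) is the same.
bool simplifyToyPhi(const ToyPhi &Phi, int &Out) {
  bool HaveCommon = false;
  for (int V : Phi.Incoming) {
    if (V == Phi.Id)
      continue; // A self-reference adds no new information.
    if (HaveCommon && Out != V)
      return false; // Two different incoming values: keep the PHI.
    Out = V;
    HaveCommon = true;
  }
  return HaveCommon;
}

// Single pass over the worklist. Simplified entries are replaced by nullptr
// so later passes can skip them; returns how many were removed.
int simplifyPass(std::vector<ToyPhi *> &Worklist) {
  int Removed = 0;
  for (ToyPhi *&Phi : Worklist) {
    int Replacement = 0;
    if (Phi && simplifyToyPhi(*Phi, Replacement)) {
      Phi = nullptr; // Mark as removed.
      ++Removed;
    }
  }
  return Removed;
}
```

Leaving null entries in place, rather than erasing them from the vector, keeps the pass a single linear walk and lets a later pass over the same worklist skip removed entries with one pointer check.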
[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)
vpykhtin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135180?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135181**
* **#135180** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135180?utm_source=stack-comment-view-in-graphite)
* **#135179**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135180 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -88,6 +89,45 @@ struct PerfSpeEventsTestHelper : public testing::Test {
     return SampleSize == DA.BasicSamples.size();
   }
+
+  /// Compare LBREntries
+  bool checkLBREntry(const LBREntry &Lhs, const LBREntry &Rhs) {
+    return Lhs.From == Rhs.From && Lhs.To == Rhs.To &&
+           Lhs.Mispred == Rhs.Mispred;
+  }
+
+  /// Parse and check SPE brstack as LBR
+  void parseAndCheckBrstackEvents(
+      uint64_t PID,
+      const std::vector> &ExpectedSamples) {
+    int NumSamples = 0;
+
+    DataAggregator DA("");
+    DA.ParsingBuf = opts::ReadPerfEvents;
+    DA.BC = BC.get();
+    DataAggregator::MMapInfo MMap;
+    DA.BinaryMMapInfo.insert(std::make_pair(PID, MMap));
+
+    // Process buffer.
+    while (DA.hasData()) {

kaadam wrote:

I kept the original approach, since I haven't found a good way to create a simple mock ELF binary that tests the BranchLBRs functionality properly. It may be better to add a new test that checks BranchLBRs in a different manner.

https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] amdgpu_use_ssaupdaterbulk_in_structurizecfg (PR #135181)
vpykhtin wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135181?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#135181** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/135181?utm_source=stack-comment-view-in-graphite)
* **#135180**
* **#135179**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/135181 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] ssaupdaterbulk_add_phi_optimization (PR #135180)
https://github.com/vpykhtin edited https://github.com/llvm/llvm-project/pull/135180 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)
https://github.com/vpykhtin edited https://github.com/llvm/llvm-project/pull/135180 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)
https://github.com/vpykhtin edited https://github.com/llvm/llvm-project/pull/135181 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)
https://github.com/vpykhtin ready_for_review https://github.com/llvm/llvm-project/pull/135180 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)
llvmbot wrote:

@llvm/pr-subscribers-llvm-transforms

Author: Valery Pykhtin (vpykhtin)

Changes

This is a replacement PR for https://github.com/llvm/llvm-project/pull/132004, stacked version.

---
Full diff: https://github.com/llvm/llvm-project/pull/135180.diff

3 Files Affected:

- (modified) llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h (+4-1)
- (modified) llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp (+37-1)
- (modified) llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp (+67)

```diff
diff --git a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
index b2cf29608f58b..2fb241b0d8e26 100644
--- a/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
+++ b/llvm/include/llvm/Transforms/Utils/SSAUpdaterBulk.h
@@ -13,7 +13,6 @@
 #ifndef LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H
 #define LLVM_TRANSFORMS_UTILS_SSAUPDATERBULK_H

-#include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/IR/PredIteratorCache.h"

@@ -77,6 +76,10 @@ class SSAUpdaterBulk {
   /// vector.
   void RewriteAllUses(DominatorTree *DT,
                       SmallVectorImpl *InsertedPHIs = nullptr);
+
+  /// Rewrite all uses and simplify the inserted PHI nodes.
+  /// Use this method to preserve behavior when replacing SSAUpdater.
+  void RewriteAndOptimizeAllUses(DominatorTree *DT);
 };

 } // end namespace llvm
diff --git a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
index d7bf791a23edf..437fd0c1dca91 100644
--- a/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
+++ b/llvm/lib/Transforms/Utils/SSAUpdaterBulk.cpp
@@ -11,13 +11,14 @@
 //===--===//

 #include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
+#include "llvm/Analysis/InstructionSimplify.h"
 #include "llvm/Analysis/IteratedDominanceFrontier.h"
 #include "llvm/IR/BasicBlock.h"
 #include "llvm/IR/Dominators.h"
 #include "llvm/IR/IRBuilder.h"
-#include "llvm/IR/Instructions.h"
 #include "llvm/IR/Use.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Transforms/Utils/Local.h"

 using namespace llvm;

@@ -222,3 +223,38 @@ void SSAUpdaterBulk::RewriteAllUses(DominatorTree *DT,
     }
   }
 }
+
+// Perform a single pass of simplification over the worklist of PHIs.
+static void SimplifyPass(MutableArrayRef Worklist) {
+  if (Worklist.empty())
+    return;
+
+  const DataLayout &DL = Worklist.front()->getParent()->getDataLayout();
+  for (PHINode *&PHI : Worklist) {
+    if (Value *Simplified = simplifyInstruction(PHI, DL)) {
+      PHI->replaceAllUsesWith(Simplified);
+      PHI->eraseFromParent();
+      PHI = nullptr; // Mark as removed.
+    }
+  }
+}
+
+static void DeduplicatePass(ArrayRef Worklist) {
+  SmallDenseMap BBs;
+  for (PHINode *PHI : Worklist) {
+    if (PHI)
+      ++BBs[PHI->getParent()];
+  }
+
+  for (auto [BB, NumNewPHIs] : BBs) {
+    auto FirstExistedPN = std::next(BB->phis().begin(), NumNewPHIs);
+    EliminateNewDuplicatePHINodes(BB, FirstExistedPN);
+  }
+}
+
+void SSAUpdaterBulk::RewriteAndOptimizeAllUses(DominatorTree *DT) {
+  SmallVector PHIs;
+  RewriteAllUses(DT, &PHIs);
+  SimplifyPass(PHIs);
+  DeduplicatePass(PHIs);
+}
\ No newline at end of file
diff --git a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
index 841f44cf6bfed..6f2e63dcd9f90 100644
--- a/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
+++ b/llvm/unittests/Transforms/Utils/SSAUpdaterBulkTest.cpp
@@ -308,3 +308,70 @@ TEST(SSAUpdaterBulk, TwoBBLoop) {
   EXPECT_EQ(Phi->getIncomingValueForBlock(Entry), ConstantInt::get(I32Ty, 0));
   EXPECT_EQ(Phi->getIncomingValueForBlock(Loop), I);
 }
+
+TEST(SSAUpdaterBulk, SimplifyPHIs) {
+  const char *IR = R"(
+      define void @main(i32 %val, i1 %cond) {
+      entry:
+          br i1 %cond, label %left, label %right
+      left:
+          %add = add i32 %val, 1
+          br label %exit
+      right:
+          %sub = sub i32 %val, 1
+          br label %exit
+      exit:
+          %phi = phi i32 [ %sub, %right ], [ %add, %left ]
+          %cmp = icmp slt i32 0, 42
+          ret void
+      }
+  )";
+
+  llvm::LLVMContext Context;
+  llvm::SMDiagnostic Err;
+  std::unique_ptr M = llvm::parseAssemblyString(IR, Err, Context);
+  ASSERT_NE(M, nullptr) << "Failed to parse IR: " << Err.getMessage();
+
+  Function *F = M->getFunction("main");
+  auto *Entry = &F->getEntryBlock();
+  auto *Left = Entry->getTerminator()->getSuccessor(0);
+  auto *Right = Entry->getTerminator()->getSuccessor(1);
+  auto *Exit = Left->getSingleSuccessor();
+  auto *Val = &*F->arg_begin();
+  auto *Phi = &Exit->front();
+  auto *Cmp = &*std::next(Exit->begin());
+  auto *Add = &Left->front();
+  auto *Sub = &Right->front();
+
+  SSAUpdaterBulk Updater;
+  Type *I32Ty = Type::getInt32Ty(Context);
+
+  // Use %val directly instead of creating a phi.
+  unsigned ValVar = Updater.AddVariable("V
```
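The second step in the patch, `DeduplicatePass`, groups the surviving new PHIs by basic block and removes duplicates within each block. A rough standalone sketch of that idea, assuming (this is an assumption, not something the diff states) that two PHIs in the same block with identical incoming lists are interchangeable; `ToyBlockPhi` and `deduplicateToyPhis` are hypothetical names and the code does not model `EliminateNewDuplicatePHINodes` itself:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Toy stand-in, not an LLVM type: a PHI is identified by the block it lives
// in and the ordered list of its incoming value ids.
struct ToyBlockPhi {
  int Block;
  std::vector<int> Incoming;
};

// Keeps the first PHI for each (block, incoming list) pair and drops later
// duplicates. Returns the number of PHIs removed.
int deduplicateToyPhis(std::vector<ToyBlockPhi> &Phis) {
  std::set<std::pair<int, std::vector<int>>> Seen;
  std::vector<ToyBlockPhi> Kept;
  for (const ToyBlockPhi &P : Phis) {
    // insert() reports through .second whether the key was new.
    if (Seen.insert(std::make_pair(P.Block, P.Incoming)).second)
      Kept.push_back(P);
  }
  int Removed = static_cast<int>(Phis.size() - Kept.size());
  Phis = std::move(Kept);
  return Removed;
}
```

Keying on the block as well as the incoming list matters: identical incoming lists in different blocks are not duplicates of each other.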
[llvm-branch-commits] [llvm] [AMDGPU] Improve StructurizeCFG pass performance by using SSAUpdaterBulk. (PR #135181)
llvmbot wrote:

@llvm/pr-subscribers-llvm-transforms

Author: Valery Pykhtin (vpykhtin)

Changes

This is a replacement PR for https://github.com/llvm/llvm-project/pull/130611, stacked version.

---
Full diff: https://github.com/llvm/llvm-project/pull/135181.diff

1 Files Affected:

- (modified) llvm/lib/Transforms/Scalar/StructurizeCFG.cpp (+15-10)

```diff
diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 00c4fcc76e791..95c68ecd2255b 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/SSAUpdater.h"
+#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
 #include
 #include

@@ -317,7 +318,7 @@ class StructurizeCFG {

   void collectInfos();

-  void insertConditions(bool Loops);
+  void insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter);

   void simplifyConditions();

@@ -600,10 +601,9 @@ void StructurizeCFG::collectInfos() {
 }

 /// Insert the missing branch conditions
-void StructurizeCFG::insertConditions(bool Loops) {
+void StructurizeCFG::insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter) {
   BranchVector &Conds = Loops ? LoopConds : Conditions;
   Value *Default = Loops ? BoolTrue : BoolFalse;
-  SSAUpdater PhiInserter;

   for (BranchInst *Term : Conds) {
     assert(Term->isConditional());
@@ -619,22 +619,23 @@ void StructurizeCFG::insertConditions(bool Loops) {
         Term->setCondition(PI.Pred);
         CondBranchWeights::setMetadata(*Term, PI.Weights);
       } else {
-        PhiInserter.Initialize(Boolean, "");
-        PhiInserter.AddAvailableValue(Loops ? SuccFalse : Parent, Default);
+        unsigned Variable = PhiInserter.AddVariable("", Boolean);
+        PhiInserter.AddAvailableValue(Variable, Loops ? SuccFalse : Parent,
+                                      Default);

         NearestCommonDominator Dominator(DT);
         Dominator.addBlock(Parent);

         for (auto [BB, PI] : Preds) {
           assert(BB != Parent);
-          PhiInserter.AddAvailableValue(BB, PI.Pred);
+          PhiInserter.AddAvailableValue(Variable, BB, PI.Pred);
           Dominator.addAndRememberBlock(BB);
         }

         if (!Dominator.resultIsRememberedBlock())
-          PhiInserter.AddAvailableValue(Dominator.result(), Default);
+          PhiInserter.AddAvailableValue(Variable, Dominator.result(), Default);

-        Term->setCondition(PhiInserter.GetValueInMiddleOfBlock(Parent));
+        PhiInserter.AddUse(Variable, &Term->getOperandUse(0));
       }
     }
   }
@@ -1318,8 +1319,12 @@ bool StructurizeCFG::run(Region *R, DominatorTree *DT) {
   orderNodes();
   collectInfos();
   createFlow();
-  insertConditions(false);
-  insertConditions(true);
+
+  SSAUpdaterBulk PhiInserter;
+  insertConditions(false, PhiInserter);
+  insertConditions(true, PhiInserter);
+  PhiInserter.RewriteAndOptimizeAllUses(DT);
+
   setPhiValues();
   simplifyConditions();
   simplifyAffectedPhis();
```

https://github.com/llvm/llvm-project/pull/135181 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -180,13 +178,16 @@ void DataAggregator::start() {
   if (opts::ArmSPE) {
     if (!opts::BasicAggregation) {
-      errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with "
-                "BasicAggregation.\n";
-      exit(1);
+      // pidfrom_ip to_ippredicted?
+      // 12345 0x123/0x456/P/-/-/8/RET/-

kaadam wrote:

Updated

https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -87,6 +87,8 @@ class ValueMap {
   using ValueMapCVH = ValueMapCallbackVH;
   using MapT = DenseMap>;
   using MDMapT = DenseMap;
+  /// Map {(InlinedAt, old atom number) -> new atom number}.
+  using DMAtomT = DenseMap, uint64_t>;

jmorse wrote:

Consider using SmallDenseMap simply to reduce the initial allocations in the non-debug-info codepath?

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self. Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

```suggestion
  /// don't use this flag. It's used when remapping is known to be un-necessary
  /// to save some compile-time.
```

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -117,9 +118,21 @@ struct ClonedCodeInfo {
 /// If you would like to collect additional information about the cloned
 /// function, you can specify a ClonedCodeInfo object with the optional fifth
 /// parameter.
+///
+/// Set \p MapAtoms to false to skip mapping source atoms for later remapping.

jmorse wrote:

IMO this should say "source-location atoms" to make it even clearer that this is a debugging feature. Also IMO it's better to discuss when this flag is necessary instead of when it's not necessary, as that tells the reader what it's for. AFAIUI, something like "Must be true when you duplicate a code path and a source line is intended to appear twice in the generated instructions. Can be set to false if you are transplanting code from one place to another".

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self. Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned
+  /// instructions being remapped are inserted into a new function, or an
+  /// existing function where the inlined-at fields are updated. If in doubt,
+  /// don't use this flag. It's used for compiler performance reasons rather
+  /// than correctness.

jmorse wrote:

IMO suggesting that it's not related to correctness is misleading, because the presence/absence of the flag can lead to correctness issues. Better to just avoid saying that and say something else as suggested.

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
https://github.com/jmorse edited https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -105,6 +105,13 @@ enum RemapFlags {
   /// Any global values not in value map are mapped to null instead of mapping
   /// to self. Illegal if RF_IgnoreMissingLocals is also set.
   RF_NullMapMissingGlobalValues = 8,
+
+  /// Do not remap atom instances. Only safe if to do this if the cloned

jmorse wrote:

IMO needs "source location atom" instead of just "atom" to ensure the random reader knows it's about debug-info.

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
https://github.com/jmorse approved this pull request. LGTM, code and tests are good. As ever all my comments are about comments and maintainability! https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][KeyInstr] Add Atom Group (re)mapping (PR #133479)
@@ -284,6 +291,9 @@ inline void RemapInstruction(Instruction *I, ValueToValueMapTy &VM,
       .remapInstruction(*I);
 }

+/// Remap source atom. Called by RemapInstruction.

jmorse wrote:

IMO too terse; needs some purpose and context.

https://github.com/llvm/llvm-project/pull/133479 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PHITransAddr: Avoid looking at constant use lists (PR #134689)
https://github.com/nikic approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/134689 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
@@ -1761,6 +1761,9 @@ void RelocationBaseSection::computeRels() {
     llvm::sort(nonRelative, irelative, [&](auto &a, auto &b) {
       return std::tie(a.r_sym, a.r_offset) < std::tie(b.r_sym, b.r_offset);
     });
+    llvm::sort(irelative, relocs.end(), [&](auto &a, auto &b) {

smithp35 wrote:

It could be worth updating the comment on line 1753, which doesn't mention that irelative comes after non-irelative. If the r_offset is not just for readability, it will be worth updating that too.

https://github.com/llvm/llvm-project/pull/133531 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
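The quoted hunk sorts one sub-range of the relocation list by `(r_sym, r_offset)` and then adds a second sort over the IRELATIVE tail. A minimal standalone sketch of that pattern with the standard library: `ToyReloc` and `sortGroups` are hypothetical, the field names merely echo the diff, and using the same key for the second group is an assumption, since the comparator of the added sort is cut off in the quote.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical record for illustration; this is not lld's relocation type.
struct ToyReloc {
  bool IsIRelative;
  std::uint32_t r_sym;
  std::uint64_t r_offset;
};

// Moves IRELATIVE entries behind all other entries, then sorts each group
// independently by (r_sym, r_offset).
void sortGroups(std::vector<ToyReloc> &Relocs) {
  auto ByKey = [](const ToyReloc &A, const ToyReloc &B) {
    // std::tie builds tuples of references, compared lexicographically.
    return std::tie(A.r_sym, A.r_offset) < std::tie(B.r_sym, B.r_offset);
  };
  auto IRelative = std::stable_partition(
      Relocs.begin(), Relocs.end(),
      [](const ToyReloc &R) { return !R.IsIRelative; });
  std::sort(Relocs.begin(), IRelative, ByKey);
  std::sort(IRelative, Relocs.end(), ByKey);
}
```

Sorting the two sub-ranges separately keeps the group boundary intact while making the order inside each group independent of insertion order.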
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
@@ -1964,6 +1979,26 @@ void elf::postScanRelocations(Ctx &ctx) {
   for (ELFFileBase *file : ctx.objectFiles)
     for (Symbol *sym : file->getLocalSymbols())
       fn(*sym);
+
+  // Now that we have checked all ifunc symbols for demotion to regular function
+  // symbols, move IRELATIVE relocations to the right place:
+  // - Relocations for non-demoted ifuncs are added to .rela.dyn
+  // - Relocations for demoted ifuncs are turned into RELATIVE relocations
+  //   or static relocations in PDEs

smithp35 wrote:

Could you expand the acronym? I think this means Position Dependent Executable (PDE). It isn't used anywhere else in the codebase and, while derivable, it made me stop and think of alternatives.

https://github.com/llvm/llvm-project/pull/133531 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
@@ -42,6 +42,8 @@ void printTraceSymbol(const Symbol &sym, StringRef name);
 enum {
   NEEDS_GOT = 1 << 0,
   NEEDS_PLT = 1 << 1,
+  // True if this is an ifunc with a direct relocation that cannot be

smithp35 wrote:

Although not new, could be worth expanding on what a direct relocation is in the comment. Could be just `direct (non GOT or PLT generating) relocation ...`

https://github.com/llvm/llvm-project/pull/133531 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang-tools-extra] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
@@ -1357,6 +1357,8 @@ void TextNodeDumper::VisitTemplateExpansionTemplateArgument(
 void TextNodeDumper::VisitExpressionTemplateArgument(
     const TemplateArgument &TA) {
   OS << " expr";
+  if (TA.isCanonicalExpr())
+    OS << " canon";

mizvekov wrote:

Done

https://github.com/llvm/llvm-project/pull/135133 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] amdgpu_use_ssaupdaterbulk_in_structurizecfg (PR #135181)
https://github.com/vpykhtin created https://github.com/llvm/llvm-project/pull/135181

None

>From c22138d41a4e1d81f3017478bfe9496fc80164f8 Mon Sep 17 00:00:00 2001
From: Valery Pykhtin
Date: Thu, 10 Apr 2025 11:58:13 +
Subject: [PATCH] amdgpu_use_ssaupdaterbulk_in_structurizecfg

---
 llvm/lib/Transforms/Scalar/StructurizeCFG.cpp | 25 +++
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
index 00c4fcc76e791..95c68ecd2255b 100644
--- a/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
+++ b/llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
 #include "llvm/Transforms/Utils/Local.h"
 #include "llvm/Transforms/Utils/SSAUpdater.h"
+#include "llvm/Transforms/Utils/SSAUpdaterBulk.h"
 #include
 #include

@@ -317,7 +318,7 @@ class StructurizeCFG {

   void collectInfos();

-  void insertConditions(bool Loops);
+  void insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter);

   void simplifyConditions();

@@ -600,10 +601,9 @@ void StructurizeCFG::collectInfos() {
 }

 /// Insert the missing branch conditions
-void StructurizeCFG::insertConditions(bool Loops) {
+void StructurizeCFG::insertConditions(bool Loops, SSAUpdaterBulk &PhiInserter) {
   BranchVector &Conds = Loops ? LoopConds : Conditions;
   Value *Default = Loops ? BoolTrue : BoolFalse;
-  SSAUpdater PhiInserter;

   for (BranchInst *Term : Conds) {
     assert(Term->isConditional());
@@ -619,22 +619,23 @@ void StructurizeCFG::insertConditions(bool Loops) {
         Term->setCondition(PI.Pred);
         CondBranchWeights::setMetadata(*Term, PI.Weights);
       } else {
-        PhiInserter.Initialize(Boolean, "");
-        PhiInserter.AddAvailableValue(Loops ? SuccFalse : Parent, Default);
+        unsigned Variable = PhiInserter.AddVariable("", Boolean);
+        PhiInserter.AddAvailableValue(Variable, Loops ? SuccFalse : Parent,
+                                      Default);

         NearestCommonDominator Dominator(DT);
         Dominator.addBlock(Parent);

         for (auto [BB, PI] : Preds) {
           assert(BB != Parent);
-          PhiInserter.AddAvailableValue(BB, PI.Pred);
+          PhiInserter.AddAvailableValue(Variable, BB, PI.Pred);
           Dominator.addAndRememberBlock(BB);
         }

         if (!Dominator.resultIsRememberedBlock())
-          PhiInserter.AddAvailableValue(Dominator.result(), Default);
+          PhiInserter.AddAvailableValue(Variable, Dominator.result(), Default);

-        Term->setCondition(PhiInserter.GetValueInMiddleOfBlock(Parent));
+        PhiInserter.AddUse(Variable, &Term->getOperandUse(0));
       }
     }
   }
@@ -1318,8 +1319,12 @@ bool StructurizeCFG::run(Region *R, DominatorTree *DT) {
   orderNodes();
   collectInfos();
   createFlow();
-  insertConditions(false);
-  insertConditions(true);
+
+  SSAUpdaterBulk PhiInserter;
+  insertConditions(false, PhiInserter);
+  insertConditions(true, PhiInserter);
+  PhiInserter.RewriteAndOptimizeAllUses(DT);
+
   setPhiValues();
   simplifyConditions();
   simplifyAffectedPhis();

___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
https://github.com/smithp35 edited https://github.com/llvm/llvm-project/pull/133531 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)
llvmbot wrote:

@llvm/pr-subscribers-lld-elf

Author: Peter Collingbourne (pcc)

Changes

This lock is unnecessary because we can add the relocations to shards and let them be sorted later.

---
Full diff: https://github.com/llvm/llvm-project/pull/135123.diff

1 Files Affected:

- (modified) lld/ELF/Relocations.cpp (+2-3)

```diff
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 81de664fd1c23..277acb26987bc 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -847,9 +847,8 @@ static void addRelativeReloc(Ctx &ctx, InputSectionBase &isec,
   Partition &part = isec.getPartition(ctx);
 
   if (sym.isTagged()) {
-    std::lock_guard lock(ctx.relocMutex);
-    part.relaDyn->addRelativeReloc(ctx.target->relativeRel, isec, offsetInSec,
-                                   sym, addend, type, expr);
+    part.relaDyn->addRelativeReloc(ctx.target->relativeRel, isec,
+                                   offsetInSec, sym, addend, type, expr);
     // With MTE globals, we always want to derive the address tag by `ldg`-ing
     // the symbol. When we have a RELATIVE relocation though, we no longer have
     // a reference to the symbol. Because of this, when we have an addend that
```

https://github.com/llvm/llvm-project/pull/135123
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
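The idea in the PR #135123 description, appending to shards and sorting later instead of taking a lock, can be sketched roughly as follows. This is a toy model and not lld's code: `Reloc`, `ShardedRelocs`, and `mergeSorted` are hypothetical names, and the shard index stands in for whatever per-worker index the real parallel loop provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a dynamic relocation record.
struct Reloc {
  std::uint64_t offset;
  std::int64_t addend;
};

// One vector per worker. Each worker appends only to its own shard, so no
// mutex is needed while relocations are being collected.
class ShardedRelocs {
public:
  explicit ShardedRelocs(std::size_t numShards) : shards(numShards) {}

  void add(std::size_t shard, Reloc r) { shards[shard].push_back(r); }

  // Called once, after all workers are done: concatenate the shards and sort,
  // so the final order does not depend on which worker recorded what.
  std::vector<Reloc> mergeSorted() const {
    std::vector<Reloc> all;
    for (const std::vector<Reloc> &s : shards)
      all.insert(all.end(), s.begin(), s.end());
    std::sort(all.begin(), all.end(), [](const Reloc &a, const Reloc &b) {
      return a.offset < b.offset;
    });
    return all;
  }

private:
  std::vector<std::vector<Reloc>> shards;
};
```

The design trade-off in this sketch is one extra merge-and-sort pass at the end in exchange for contention-free appends during collection.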
[llvm-branch-commits] [llvm] CodeGen: Trim redundant template argument from defusechain_iterator (PR #135024)
arsenm wrote: ### Merge activity * **Apr 9, 12:21 PM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/135024). https://github.com/llvm/llvm-project/pull/135024 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) { return nullptr; if (GEP.getNumIndices() == 1) { -// We can only preserve inbounds if the original gep is inbounds, the add -// is nsw, and the add operands are non-negative. -auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) { +auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1, + Value *Idx2) { + // Preserve "inbounds nuw" if the original gep is "inbounds nuw", + // and the add is "nuw". + if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW) +return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap(); nikic wrote: ```suggestion if (GEP.hasNoUnsignedWrap() && AddIsNUW) return GEP.getNoWrapFlags(); ``` Would this work to subsume both this case and the only nuw one below? https://github.com/llvm/llvm-project/pull/135155 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) { return nullptr; if (GEP.getNumIndices() == 1) { -// We can only preserve inbounds if the original gep is inbounds, the add -// is nsw, and the add operands are non-negative. -auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) { +auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1, nikic wrote: Rename this to GetPreservedNoWrapFlags or something. https://github.com/llvm/llvm-project/pull/135155 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InstCombine] Improve inbounds preservation for ADD+GEP -> GEP+GEP (PR #135155)
@@ -3087,12 +3087,22 @@ Instruction *InstCombinerImpl::visitGetElementPtrInst(GetElementPtrInst &GEP) { return nullptr; if (GEP.getNumIndices() == 1) { -// We can only preserve inbounds if the original gep is inbounds, the add -// is nsw, and the add operands are non-negative. -auto CanPreserveInBounds = [&](bool AddIsNSW, Value *Idx1, Value *Idx2) { +auto CanPreserveNoWrapFlags = [&](bool AddIsNSW, bool AddIsNUW, Value *Idx1, + Value *Idx2) { + // Preserve "inbounds nuw" if the original gep is "inbounds nuw", + // and the add is "nuw". + if (GEP.isInBounds() && GEP.hasNoUnsignedWrap() && AddIsNUW) +return GEPNoWrapFlags::inBounds() | GEPNoWrapFlags::noUnsignedWrap(); + // Preserve "inbounds" if the original gep is "inbounds", the add + // is "nsw", and the add operands are non-negative. SimplifyQuery Q = SQ.getWithInstruction(&GEP); - return GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) && - isKnownNonNegative(Idx2, Q); + if (GEP.isInBounds() && AddIsNSW && isKnownNonNegative(Idx1, Q) && + isKnownNonNegative(Idx2, Q)) +return GEPNoWrapFlags::inBounds(); nikic wrote: Is it actually still necessary to explicitly handle this case? If we have an add nsw with nonneg operands, I think we should infer nuw on both add and gep and can then use the new code path? https://github.com/llvm/llvm-project/pull/135155 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
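The flag-preservation rule discussed in the PR #135155 review comments above can be modeled as a small pure function. This is only a sketch: `GepFlag` is a simplified stand-in for LLVM's `GEPNoWrapFlags` (it omits nusw), the boolean parameters replace the `isKnownNonNegative` queries, and the first branch follows the shape of the suggested `GEP.getNoWrapFlags()` shortcut rather than any landed code. The second branch is kept as in the patch under review, even though the last comment questions whether it is still needed.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for LLVM's GEPNoWrapFlags (the real class also tracks
// nusw, which this toy omits).
enum GepFlag : std::uint8_t { kNone = 0, kInBounds = 1, kNUW = 2 };

// Flags that the two GEPs produced from gep(p, a + b) may keep in this model.
std::uint8_t getPreservedNoWrapFlags(std::uint8_t gepFlags, bool addIsNSW,
                                     bool addIsNUW, bool idx1NonNeg,
                                     bool idx2NonNeg) {
  // gep nuw + add nuw: keep whatever the original GEP carried, which covers
  // both the "inbounds nuw" and the plain "nuw" case.
  if ((gepFlags & kNUW) && addIsNUW)
    return gepFlags;
  // Older rule: inbounds GEP, add nsw, both add operands non-negative.
  if ((gepFlags & kInBounds) && addIsNSW && idx1NonNeg && idx2NonNeg)
    return kInBounds;
  return kNone;
}
```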
[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds (PR #132353)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/132353 >From 11282b1d43e87a092a6d21cc23e6962b65554eb3 Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Fri, 21 Mar 2025 03:33:02 -0400 Subject: [PATCH] [AMDGPU][SDAG] Only fold flat offsets if they are inbounds For flat memory instructions where the address is supplied as a base address register with an immediate offset, the memory aperture test ignores the immediate offset. Currently, ISel does not respect that, which leads to miscompilations where valid input programs crash when the address computation relies on the immediate offset to get the base address in the proper memory aperture. Global or scratch instructions are not affected. This patch only selects flat instructions with immediate offsets from address computations with the inbounds flag: If the address computation does not leave the bounds of the allocated object, it cannot leave the bounds of the memory aperture and is therefore safe to handle with an immediate offset. It also adds the inbounds flag to DAG nodes resulting from transformations: - Address computations resulting from getObjectPtrOffset. As far as I can tell, this function is only used to compute addresses within accessed memory ranges, e.g., for loads and stores that are split during legalization. - Reassociated inbounds adds. If both involved operations are inbounds, then so are operations after the transformation. - Address computations in the SelectionDAG lowering of the memcpy/move/set intrinsics. Base and result of the address arithmetic there are accessed, so the operation must be inbounds. It might make sense to separate these changes into their own PR, but I don't see a way to test them without adding a use of the inbounds SDAG flag. Affected tests: - CodeGen/AMDGPU/fold-gep-offset.ll: Offsets are no longer wrongly folded, added new positive tests where we still do fold them. 
- Transforms/InferAddressSpaces/AMDGPU/flat_atomic.ll: Offset folding doesn't seem integral to this test, so the test is not changed to make offset folding still happen. - CodeGen/AMDGPU/loop-prefetch-data.ll: loop-reduce prefers to base addresses on the potentially OOB addresses used for prefetching for memory accesses, that might be a separate issue to look into. - Added memset tests to CodeGen/AMDGPU/memintrinsic-unroll.ll to make sure that offsets in the memset DAG lowering are still folded properly. A similar patch for GlobalISel will follow. Fixes SWDEV-516125. --- llvm/include/llvm/CodeGen/SelectionDAG.h | 12 +- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 9 +- .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp | 12 +- llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp | 140 --- llvm/test/CodeGen/AMDGPU/fold-gep-offset.ll | 374 +- .../test/CodeGen/AMDGPU/loop-prefetch-data.ll | 17 +- .../CodeGen/AMDGPU/memintrinsic-unroll.ll | 241 +++ .../InferAddressSpaces/AMDGPU/flat_atomic.ll | 6 +- 8 files changed, 717 insertions(+), 94 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAG.h b/llvm/include/llvm/CodeGen/SelectionDAG.h index 15a2370e5d8b8..aa3668d3e9aae 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAG.h +++ b/llvm/include/llvm/CodeGen/SelectionDAG.h @@ -1069,7 +1069,8 @@ class SelectionDAG { SDValue EVL); /// Returns sum of the base pointer and offset. - /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap by default. + /// Unlike getObjectPtrOffset this does not set NoUnsignedWrap and InBounds by + /// default. SDValue getMemBasePlusOffset(SDValue Base, TypeSize Offset, const SDLoc &DL, const SDNodeFlags Flags = SDNodeFlags()); SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL, @@ -1077,15 +1078,18 @@ class SelectionDAG { /// Create an add instruction with appropriate flags when used for /// addressing some offset of an object. i.e. 
if a load is split into multiple - /// components, create an add nuw from the base pointer to the offset. + /// components, create an add nuw inbounds from the base pointer to the + /// offset. SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, TypeSize Offset) { -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } SDValue getObjectPtrOffset(const SDLoc &SL, SDValue Ptr, SDValue Offset) { // The object itself can't wrap around the address space, so it shouldn't be // possible for the adds of the offsets to the split parts to overflow. -return getMemBasePlusOffset(Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap); +return getMemBasePlusOffset( +Ptr, Offset, SL, SDNodeFlags::NoUnsignedWrap | SDNodeFlags::InBounds); } /// Return a new CALLSEQ_START node, that starts new call fram
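The failure mode described in the PR #132353 commit message can be illustrated with a toy model. The aperture bounds, addresses, and function names below are made up for illustration and do not reflect real AMDGPU apertures; the only point is that a test on the base register alone can classify an access differently from a test on the full address.

```cpp
#include <cassert>
#include <cstdint>

enum class Aperture { Global, Scratch };

// Hypothetical layout: [kScratchLo, kScratchHi) is the scratch aperture.
constexpr std::uint64_t kScratchLo = 0x1000;
constexpr std::uint64_t kScratchHi = 0x2000;

constexpr Aperture classify(std::uint64_t addr) {
  return (addr >= kScratchLo && addr < kScratchHi) ? Aperture::Scratch
                                                   : Aperture::Global;
}

// Models an instruction whose aperture test looks at the base register only
// and ignores the immediate offset.
constexpr Aperture apertureSeenByFlatInst(std::uint64_t base,
                                          std::uint64_t /*imm*/) {
  return classify(base);
}

// In this model, folding the offset into the immediate is safe only if the
// base and the full address land in the same aperture.
constexpr bool foldIsSafe(std::uint64_t base, std::uint64_t imm) {
  return apertureSeenByFlatInst(base, imm) == classify(base + imm);
}
```

With `base = 0x0ff8` and `imm = 0x10`, the base is classified as global while the accessed address `0x1008` is scratch, so the fold would be wrong in this model. The commit message argues that an inbounds address computation rules this case out, because it stays within one allocated object and therefore within one aperture.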
[llvm-branch-commits] [llvm] [SDAG] Introduce inbounds flag for pointer arithmetic (PR #131862)
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/131862 >From 75e41ae17d5daae609c6f25025c730e9bb3924bc Mon Sep 17 00:00:00 2001 From: Fabian Ritter Date: Mon, 17 Mar 2025 06:51:16 -0400 Subject: [PATCH] [SDAG] Introduce inbounds flag for pointer arithmetic This patch introduces an inbounds SDNodeFlag, to show that a pointer addition SDNode implements an inbounds getelementptr operation (i.e., the pointer operand is in bounds wrt. the allocated object it is based on, and the arithmetic does not change that). The flag is set in the DAG construction when lowering inbounds GEPs. Inbounds information is useful in the ISel when selecting memory instructions that perform address computations whose intermediate steps must be in the same memory region as the final result. A follow-up patch will start using it for AMDGPU's flat memory instructions, where the immediate offset must not affect the memory aperture of the address. A similar patch for gMIR and GlobalISel will follow. For SWDEV-516125. --- llvm/include/llvm/CodeGen/SelectionDAGNodes.h| 9 +++-- llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp| 3 +++ llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp | 3 +++ .../CodeGen/X86/merge-store-partially-alias-loads.ll | 2 +- 4 files changed, 14 insertions(+), 3 deletions(-) diff --git a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h index 2283f99202e2f..13ac65f5d731c 100644 --- a/llvm/include/llvm/CodeGen/SelectionDAGNodes.h +++ b/llvm/include/llvm/CodeGen/SelectionDAGNodes.h @@ -415,12 +415,15 @@ struct SDNodeFlags { Unpredictable = 1 << 13, // Compare instructions which may carry the samesign flag. SameSign = 1 << 14, +// Pointer arithmetic instructions that remain in bounds, e.g., implementing +// an inbounds GEP. +InBounds = 1 << 15, // NOTE: Please update LargestValue in LLVM_DECLARE_ENUM_AS_BITMASK below // the class definition when adding new flags. 
PoisonGeneratingFlags = NoUnsignedWrap | NoSignedWrap | Exact | Disjoint | -NonNeg | NoNaNs | NoInfs | SameSign, +NonNeg | NoNaNs | NoInfs | SameSign | InBounds, FastMathFlags = NoNaNs | NoInfs | NoSignedZeros | AllowReciprocal | AllowContract | ApproximateFuncs | AllowReassociation, }; @@ -455,6 +458,7 @@ struct SDNodeFlags { void setAllowReassociation(bool b) { setFlag(b); } void setNoFPExcept(bool b) { setFlag(b); } void setUnpredictable(bool b) { setFlag(b); } + void setInBounds(bool b) { setFlag(b); } // These are accessors for each flag. bool hasNoUnsignedWrap() const { return Flags & NoUnsignedWrap; } @@ -472,6 +476,7 @@ struct SDNodeFlags { bool hasAllowReassociation() const { return Flags & AllowReassociation; } bool hasNoFPExcept() const { return Flags & NoFPExcept; } bool hasUnpredictable() const { return Flags & Unpredictable; } + bool hasInBounds() const { return Flags & InBounds; } bool operator==(const SDNodeFlags &Other) const { return Flags == Other.Flags; @@ -481,7 +486,7 @@ struct SDNodeFlags { }; LLVM_DECLARE_ENUM_AS_BITMASK(decltype(SDNodeFlags::None), - SDNodeFlags::SameSign); + SDNodeFlags::InBounds); inline SDNodeFlags operator|(SDNodeFlags LHS, SDNodeFlags RHS) { LHS |= RHS; diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index 89793c30f3710..32973be608937 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -4283,6 +4283,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { if (NW.hasNoUnsignedWrap() || (int64_t(Offset) >= 0 && NW.hasNoUnsignedSignedWrap())) Flags |= SDNodeFlags::NoUnsignedWrap; +Flags.setInBounds(NW.isInBounds()); N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, DAG.getConstant(Offset, dl, N.getValueType()), Flags); @@ -4326,6 +4327,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { if (NW.hasNoUnsignedWrap() || (Offs.isNonNegative() && 
NW.hasNoUnsignedSignedWrap())) Flags.setNoUnsignedWrap(true); +Flags.setInBounds(NW.isInBounds()); OffsVal = DAG.getSExtOrTrunc(OffsVal, dl, N.getValueType()); @@ -4388,6 +4390,7 @@ void SelectionDAGBuilder::visitGetElementPtr(const User &I) { // pointer index type (add nuw). SDNodeFlags AddFlags; AddFlags.setNoUnsignedWrap(NW.hasNoUnsignedWrap()); + AddFlags.setInBounds(NW.isInBounds()); N = DAG.getNode(ISD::ADD, dl, N.getValueType(), N, IdxN, AddFlags); } diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp ind
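The shape of the `SDNodeFlags` change in PR #131862 (a new bit, a setter, a getter, and membership in the poison-generating mask) can be sketched with a small stand-alone class. `NodeFlags` and `dropPoisonGeneratingFlags` are hypothetical names for this illustration. The bit positions of `Unpredictable`, `SameSign`, and `InBounds` are copied from the diff above; the position of `NoUnsignedWrap` is an assumption, and most real flags are left out.

```cpp
#include <cassert>
#include <cstdint>

// Toy flag set modeled on the shape of the patch; only a few bits are shown.
class NodeFlags {
public:
  enum : std::uint32_t {
    None = 0,
    NoUnsignedWrap = 1u << 0,  // position assumed for this sketch
    Unpredictable = 1u << 13,
    SameSign = 1u << 14,
    InBounds = 1u << 15,
    // Flags whose violation yields poison; InBounds joins this mask.
    PoisonGeneratingFlags = NoUnsignedWrap | SameSign | InBounds,
  };

  void setNoUnsignedWrap(bool b) { setFlag(NoUnsignedWrap, b); }
  void setUnpredictable(bool b) { setFlag(Unpredictable, b); }
  void setInBounds(bool b) { setFlag(InBounds, b); }

  bool hasNoUnsignedWrap() const { return (bits & NoUnsignedWrap) != 0; }
  bool hasUnpredictable() const { return (bits & Unpredictable) != 0; }
  bool hasInBounds() const { return (bits & InBounds) != 0; }

  // Hypothetical helper: clear every flag in the poison-generating mask.
  void dropPoisonGeneratingFlags() {
    bits &= ~static_cast<std::uint32_t>(PoisonGeneratingFlags);
  }

private:
  void setFlag(std::uint32_t flag, bool b) {
    bits = (bits & ~flag) | (b ? flag : 0u);
  }
  std::uint32_t bits = None;
};
```

Putting `InBounds` into the poison-generating mask means that any code that strips such flags before a transform (modeled here by `dropPoisonGeneratingFlags`) also strips the new flag without further changes.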
[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)
https://github.com/philnik777 approved this pull request. LGTM assuming the diff landed is the same I see. I'm really not a fan of complicating things unnecessarily though. https://github.com/llvm/llvm-project/pull/132291 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [ctxprof] Flatten indirect call info in pre-thinlink compilation (PR #134766)
https://github.com/mtrofin updated https://github.com/llvm/llvm-project/pull/134766 >From 97908a0b652420ce82f3fe965f8eb12002e74a85 Mon Sep 17 00:00:00 2001 From: Mircea Trofin Date: Mon, 7 Apr 2025 18:22:05 -0700 Subject: [PATCH] [ctxprof] Flatten indirect call info in pre-thinlink compilation --- llvm/include/llvm/Analysis/CtxProfAnalysis.h | 5 ++ llvm/lib/Analysis/CtxProfAnalysis.cpp | 14 + .../Instrumentation/PGOCtxProfFlattening.cpp | 57 +++ .../flatten-insert-icp-mdprof.ll | 50 4 files changed, 126 insertions(+) create mode 100644 llvm/test/Analysis/CtxProfAnalysis/flatten-insert-icp-mdprof.ll diff --git a/llvm/include/llvm/Analysis/CtxProfAnalysis.h b/llvm/include/llvm/Analysis/CtxProfAnalysis.h index 023b5a9bdb848..6f1c3696ca78c 100644 --- a/llvm/include/llvm/Analysis/CtxProfAnalysis.h +++ b/llvm/include/llvm/Analysis/CtxProfAnalysis.h @@ -21,6 +21,10 @@ namespace llvm { class CtxProfAnalysis; +using FlatIndirectTargets = DenseMap; +using CtxProfFlatIndirectCallProfile = +DenseMap>; + /// The instrumented contextual profile, produced by the CtxProfAnalysis. 
class PGOContextualProfile { friend class CtxProfAnalysis; @@ -101,6 +105,7 @@ class PGOContextualProfile { void visit(ConstVisitor, const Function *F = nullptr) const; const CtxProfFlatProfile flatten() const; + const CtxProfFlatIndirectCallProfile flattenVirtCalls() const; bool invalidate(Module &, const PreservedAnalyses &PA, ModuleAnalysisManager::Invalidator &) { diff --git a/llvm/lib/Analysis/CtxProfAnalysis.cpp b/llvm/lib/Analysis/CtxProfAnalysis.cpp index 4042c87369462..304a77014f407 100644 --- a/llvm/lib/Analysis/CtxProfAnalysis.cpp +++ b/llvm/lib/Analysis/CtxProfAnalysis.cpp @@ -334,6 +334,20 @@ const CtxProfFlatProfile PGOContextualProfile::flatten() const { return Flat; } +const CtxProfFlatIndirectCallProfile +PGOContextualProfile::flattenVirtCalls() const { + CtxProfFlatIndirectCallProfile Ret; + preorderVisit( + Profiles.Contexts, [&](const PGOCtxProfContext &Ctx) { +auto &Targets = Ret[Ctx.guid()]; +for (const auto &[ID, SubctxSet] : Ctx.callsites()) + for (const auto &Subctx : SubctxSet) +Targets[ID][Subctx.first] += Subctx.second.getEntrycount(); + }); + return Ret; +} + void CtxProfAnalysis::collectIndirectCallPromotionList( CallBase &IC, Result &Profile, SetVector> &Candidates) { diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp index ffe0f385047c3..9b44d61726fa1 100644 --- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp +++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp @@ -36,9 +36,12 @@ #include "llvm/Transforms/Scalar/DCE.h" #include "llvm/Transforms/Utils/BasicBlockUtils.h" #include +#include using namespace llvm; +#define DEBUG_TYPE "ctx_prof_flatten" + namespace { class ProfileAnnotator final { @@ -414,6 +417,58 @@ void removeInstrumentation(Function &F) { I.eraseFromParent(); } +void annotateIndirectCall( +Module &M, CallBase &CB, +const DenseMap &FlatProf, +const InstrProfCallsite &Ins) { + auto Idx = 
Ins.getIndex()->getZExtValue(); + auto FIt = FlatProf.find(Idx); + if (FIt == FlatProf.end()) +return; + const auto &Targets = FIt->second; + SmallVector Data; + uint64_t Sum = 0; + for (auto &[Guid, Count] : Targets) { +Data.push_back({/*.Value=*/Guid, /*.Count=*/Count}); +Sum += Count; + } + struct InstrProfValueDataGTComparer { +bool operator()(const InstrProfValueData &A, const InstrProfValueData &B) { + return A.Count > B.Count; +} + }; + llvm::sort(Data, InstrProfValueDataGTComparer()); + llvm::annotateValueSite(M, CB, Data, Sum, + InstrProfValueKind::IPVK_IndirectCallTarget, + Data.size()); + LLVM_DEBUG(dbgs() << "[ctxprof] flat indirect call prof: " << CB +<< CB.getMetadata(LLVMContext::MD_prof) << "\n"); +} + +// We normally return a "Changed" bool, but the calling pass' run assumes +// something will change - some profile will be added - so this won't add much +// by returning false when applicable. +void annotateIndCalls(Module &M, const CtxProfAnalysis::Result &CtxProf) { + const auto FlatIndCalls = CtxProf.flattenVirtCalls(); + for (auto &F : M) { +if (F.isDeclaration()) + continue; +auto FlatProfIter = FlatIndCalls.find(AssignGUIDPass::getGUID(F)); +if (FlatProfIter == FlatIndCalls.end()) + continue; +const auto &FlatProf = FlatProfIter->second; +for (auto &BB : F) { + for (auto &I : BB) { +auto *CB = dyn_cast(&I); +if (!CB || !CB->isIndirectCall()) + continue; +if (auto *Ins = CtxProfAnalysis::getCallsiteInstrumentation(*CB)) + annotateIndirectCall(M, *CB, FlatProf, *Ins); + }
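What `flattenVirtCalls` and `annotateIndirectCall` in PR #134766 compute can be approximated with standard containers. This is a sketch under assumptions: `Context`, `flatten`, and `sortedByCount` are hypothetical names, the context tree is reduced to a GUID, an entry count, and a callsite index per child, and the real code uses LLVM's `DenseMap` and `InstrProfValueData` instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Guid = std::uint64_t;

// Toy context node: a function GUID, how often this context was entered, the
// callsite in the parent it was reached through, and its callee contexts.
struct Context {
  Guid guid;
  std::uint64_t entryCount;
  std::uint32_t callsite;
  std::vector<Context> children;
};

// caller GUID -> callsite id -> callee GUID -> summed entry count
using FlatTargets = std::map<std::uint32_t, std::map<Guid, std::uint64_t>>;
using FlatProfile = std::map<Guid, FlatTargets>;

// Walk the context tree and sum callee entry counts per (caller, callsite),
// regardless of the calling context they were observed in.
void flatten(const Context &ctx, FlatProfile &out) {
  for (const Context &sub : ctx.children) {
    out[ctx.guid][sub.callsite][sub.guid] += sub.entryCount;
    flatten(sub, out);
  }
}

// Order the targets of one callsite by descending count, as the annotation
// step in the patch does before attaching value-profile metadata.
std::vector<std::pair<Guid, std::uint64_t>>
sortedByCount(const std::map<Guid, std::uint64_t> &targets) {
  std::vector<std::pair<Guid, std::uint64_t>> data(targets.begin(),
                                                   targets.end());
  std::sort(data.begin(), data.end(),
            [](const auto &a, const auto &b) { return a.second > b.second; });
  return data;
}
```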
[llvm-branch-commits] [llvm] SCEVExpander: Don't look at uses of constants (PR #134691)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/134691

This could be more relaxed, and look for uses of globals in the same function but no tests apparently depend on that.

>From f543f056aa7e16b1f793d018e0b9c022b006f477 Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Mon, 7 Apr 2025 21:56:00 +0700
Subject: [PATCH] SCEVExpander: Don't look at uses of constants

This could be more relaxed, and look for uses of globals in the same function but no tests apparently depend on that.
---
 .../Utils/ScalarEvolutionExpander.cpp | 29 ++-
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index 41bf202230e22..e25ec6c3b2a58 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -111,20 +111,23 @@ Value *SCEVExpander::ReuseOrCreateCast(Value *V, Type *Ty,
 
   Value *Ret = nullptr;
 
-  // Check to see if there is already a cast!
-  for (User *U : V->users()) {
-    if (U->getType() != Ty)
-      continue;
-    CastInst *CI = dyn_cast<CastInst>(U);
-    if (!CI || CI->getOpcode() != Op)
-      continue;
+  if (!isa<Constant>(V)) {
+    // Check to see if there is already a cast!
+    for (User *U : V->users()) {
+      if (U->getType() != Ty)
+        continue;
+      CastInst *CI = dyn_cast<CastInst>(U);
+      if (!CI || CI->getOpcode() != Op)
+        continue;
 
-    // Found a suitable cast that is at IP or comes before IP. Use it. Note that
-    // the cast must also properly dominate the Builder's insertion point.
-    if (IP->getParent() == CI->getParent() && &*BIP != CI &&
-        (&*IP == CI || CI->comesBefore(&*IP))) {
-      Ret = CI;
-      break;
+      // Found a suitable cast that is at IP or comes before IP. Use it. Note
+      // that the cast must also properly dominate the Builder's insertion
+      // point.
+      if (IP->getParent() == CI->getParent() && &*BIP != CI &&
+          (&*IP == CI || CI->comesBefore(&*IP))) {
+        Ret = CI;
+        break;
+      }
     }
   }
 
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PHITransAddr: Avoid looking at constant use lists (PR #134689)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite (https://app.graphite.dev/github/pr/llvm/llvm-project/134689?utm_source=stack-comment-downstack-mergeability-warning).
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#134692**
* **#134691**
* **#134690**
* **#134689** 👈 (View in Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/134689?utm_source=stack-comment-view-in-graphite)
* **#134688**
* **#134275**
* **#134274**
* `main`

This stack of pull requests is managed by Graphite (https://graphite.dev?utm-source=stack-comment). Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

https://github.com/llvm/llvm-project/pull/134689
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [compiler-rt] release/20.x: [Sanitizers][Darwin][Test] XFAIL malloc_zone.cpp (PR #133832)
https://github.com/wrotki updated https://github.com/llvm/llvm-project/pull/133832

>From ca129ea5996c2f2b99868bccd2246690a65b6c9e Mon Sep 17 00:00:00 2001
From: Mariusz Borsa
Date: Mon, 31 Mar 2025 17:06:41 -0700
Subject: [PATCH] [Sanitizers][Darwin][Test] XFAIL malloc_zone.cpp

The malloc_zone.cpp test currently fails on Darwin hosts in the SanitizerCommon tests with lsan enabled. XFAIL this test to buy time to investigate the failure. We are also trying to bring the number of tests failing on Darwin bots to 0, to get a clearer signal of any new failures.

rdar://145873843

Co-authored-by: Mariusz Borsa
(cherry picked from commit 02837acaaf2cfdfcbf77e4a7f6629575edb6ffb4)
---
 .../test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp b/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
index fd6ef03629438..5aa087fb4ca12 100644
--- a/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
+++ b/compiler-rt/test/sanitizer_common/TestCases/Darwin/malloc_zone.cpp
@@ -17,6 +17,8 @@
 // UBSan does not install a malloc zone.
 // XFAIL: ubsan
 //
+// Currently fails on darwin/lsan
+// XFAIL: darwin && lsan
 
 #include
 #include
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [ctxprof] Use `isInSpecializedModule` as criteria for using contextual profile (PR #134468)
https://github.com/mtrofin ready_for_review https://github.com/llvm/llvm-project/pull/134468 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -5039,10 +5039,26 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef<ElementCount> VFs,
             // even in the scalar case.
             RegUsage[ClassID] += 1;
           } else {
+            ElementCount VF = VFs[J];
+            // The output from scaled phis and scaled reductions actually has
+            // fewer lanes than the VF.
+            if (isa<VPReductionPHIRecipe, VPPartialReductionRecipe>(R)) {
+              auto *ReductionR = dyn_cast<VPReductionPHIRecipe>(R);
+              auto *PartialReductionR = ReductionR ? nullptr : dyn_cast<VPPartialReductionRecipe>(R);
+              unsigned ScaleFactor = ReductionR ? ReductionR->getVFScaleFactor() : PartialReductionR->getVFScaleFactor();
+              VF = VF.divideCoefficientBy(ScaleFactor);
+            }

sdesmalen-arm wrote:

Maybe create a utility function that returns the scaling factor of a `Recipe`, and which returns `1` for any recipe other than the `VPPartialReductionRecipe/VPReductionPHIRecipe`. Also, please run clang-format on your code.

https://github.com/llvm/llvm-project/pull/133090
___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
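The utility function suggested in the PR #133090 review comment could look roughly like the sketch below. It is a stand-alone toy, not VPlan code: the recipe classes here are hypothetical minimal stand-ins, `dynamic_cast` replaces LLVM's `dyn_cast`, and plain `unsigned` lane counts replace `ElementCount`.

```cpp
#include <cassert>

// Toy recipe hierarchy; the real VPlan classes are much richer.
struct Recipe {
  virtual ~Recipe() = default;
};

struct ReductionPHIRecipe : Recipe {
  explicit ReductionPHIRecipe(unsigned s) : scale(s) {}
  unsigned getVFScaleFactor() const { return scale; }
  unsigned scale;
};

struct PartialReductionRecipe : Recipe {
  explicit PartialReductionRecipe(unsigned s) : scale(s) {}
  unsigned getVFScaleFactor() const { return scale; }
  unsigned scale;
};

struct WidenRecipe : Recipe {};

// Scale factor of a recipe: the recipe's own factor for the two scaled kinds,
// and 1 for every other recipe, so callers need no special casing.
unsigned getVFScaleFactor(const Recipe *r) {
  if (auto *phi = dynamic_cast<const ReductionPHIRecipe *>(r))
    return phi->getVFScaleFactor();
  if (auto *partial = dynamic_cast<const PartialReductionRecipe *>(r))
    return partial->getVFScaleFactor();
  return 1;
}

// Number of lanes the recipe's result occupies for a given VF in this model.
unsigned scaledLanes(unsigned vf, const Recipe *r) {
  return vf / getVFScaleFactor(r);
}
```

With such a helper, the register-usage loop could divide the VF unconditionally, since the factor is 1 for unscaled recipes.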
[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/135191
[llvm-branch-commits] [llvm] release/20.x: [X86][SSE] Don't emit SSE2 load instructions in SSE1-only mode (#134547) (PR #135191)
llvmbot wrote:

@llvm/pr-subscribers-backend-x86

Author: None (llvmbot)

Changes

Backport 08e080ee98832c2aec6f379b04f486bea18730cc

Requested by: @RKSimon

---
Full diff: https://github.com/llvm/llvm-project/pull/135191.diff

2 Files Affected:

- (modified) llvm/lib/Target/X86/X86FixupVectorConstants.cpp (+7-4)
- (added) llvm/test/CodeGen/X86/pr134607.ll (+20)

```diff
diff --git a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
index 453898e132ca4..9dc392d6e9626 100644
--- a/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
+++ b/llvm/lib/Target/X86/X86FixupVectorConstants.cpp
@@ -333,6 +333,7 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
 MachineInstr &MI) {
 unsigned Opc = MI.getOpcode();
 MachineConstantPool *CP = MI.getParent()->getParent()->getConstantPool();
+ bool HasSSE2 = ST->hasSSE2();
 bool HasSSE41 = ST->hasSSE41();
 bool HasAVX2 = ST->hasAVX2();
 bool HasDQI = ST->hasDQI();
@@ -394,11 +395,13 @@ bool X86FixupVectorConstantsPass::processInstruction(MachineFunction &MF,
 case X86::MOVAPDrm:
 case X86::MOVAPSrm:
 case X86::MOVUPDrm:
- case X86::MOVUPSrm:
+ case X86::MOVUPSrm: {
 // TODO: SSE3 MOVDDUP Handling
-return FixupConstant({{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
- {X86::MOVSDrm, 1, 64, rebuildZeroUpperCst}},
- 128, 1);
+FixupEntry Fixups[] = {
+{X86::MOVSSrm, 1, 32, rebuildZeroUpperCst},
+{HasSSE2 ? X86::MOVSDrm : 0, 1, 64, rebuildZeroUpperCst}};
+return FixupConstant(Fixups, 128, 1);
+ }
 case X86::VMOVAPDrm:
 case X86::VMOVAPSrm:
 case X86::VMOVUPDrm:
diff --git a/llvm/test/CodeGen/X86/pr134607.ll b/llvm/test/CodeGen/X86/pr134607.ll
new file mode 100644
index 0..5e824c22e5a22
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr134607.ll
@@ -0,0 +1,20 @@
+; RUN: llc < %s -mtriple=i386-unknown-unknown -mattr=+sse -O3 | FileCheck %s --check-prefixes=X86
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE1
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2,+sse -O3 | FileCheck %s --check-prefixes=X64-SSE2
+
+define void @store_v2f32_constant(ptr %v) {
+; X86-LABEL: store_v2f32_constant:
+; X86: # %bb.0:
+; X86-NEXT: movl 4(%esp), %eax
+; X86-NEXT: movaps {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0
+
+; X64-SSE1-LABEL: store_v2f32_constant:
+; X64-SSE1: # %bb.0:
+; X64-SSE1-NEXT: movaps {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+
+; X64-SSE2-LABEL: store_v2f32_constant:
+; X64-SSE2: # %bb.0:
+; X64-SSE2-NEXT: movsd {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0
+ store <2 x float> , ptr %v, align 4
+ ret void
+}
```

https://github.com/llvm/llvm-project/pull/135191
[llvm-branch-commits] [libcxx] libcxx: In gdb test detect execute_mi with feature check instead of version check. (PR #132291)
https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/132291

>From 89ce369ab9b49b8c23a87ad0a888002dd85c094c Mon Sep 17 00:00:00 2001
From: Peter Collingbourne
Date: Thu, 20 Mar 2025 15:12:39 -0700
Subject: [PATCH 1/2] Format

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 630b90c9d77a6..927f8958f4b43 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,8 @@
 # we exit.
 has_run_tests = False
-has_execute_mi = 'execute_mi' in gdb.__dict__
+has_execute_mi = "execute_mi" in gdb.__dict__
+
 class CheckResult(gdb.Command):
 def __init__(self):

>From da2f682a8f1a1af58fbe85f760e1844c808b8093 Mon Sep 17 00:00:00 2001
From: Peter Collingbourne
Date: Tue, 8 Apr 2025 13:21:06 -0700
Subject: [PATCH 2/2] Use getattr instead

Created using spr 1.3.6-beta.1
---
 libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
index 927f8958f4b43..da09092b690c4 100644
--- a/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
+++ b/libcxx/test/libcxx/gdb/gdb_pretty_printer_test.py
@@ -30,7 +30,7 @@
 # we exit.
 has_run_tests = False
-has_execute_mi = "execute_mi" in gdb.__dict__
+has_execute_mi = getattr(gdb, "execute_mi", None) is not None
 class CheckResult(gdb.Command):
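The second commit above replaces a `__dict__` membership test with a `getattr` probe. The patch itself does not say why, so the following is only a minimal sketch of one situation where the two checks can disagree: an attribute that is resolved lazily through a module-level `__getattr__` hook (PEP 562). Everything here is a stand-in: `fake_gdb`, `_lazy_getattr`, and `no_such_command` are hypothetical names for illustration, and the real `gdb` module (only importable inside a gdb process) may or may not behave this way.

```python
import types

# Hypothetical stand-in for the `gdb` module; the real one only exists
# inside a running gdb process.
fake_gdb = types.ModuleType("fake_gdb")


def _lazy_getattr(name):
    # Module-level __getattr__ (PEP 562) is consulted only after normal
    # attribute lookup on the module has failed.
    if name == "execute_mi":
        return lambda *args: None
    raise AttributeError(name)


fake_gdb.__getattr__ = _lazy_getattr

# Membership test: only sees names stored directly in the module dict.
in_dict = "execute_mi" in fake_gdb.__dict__

# getattr probe: goes through the full attribute lookup, including the hook.
via_getattr = getattr(fake_gdb, "execute_mi", None) is not None

# A name that does not exist at all falls back to the default.
missing = getattr(fake_gdb, "no_such_command", None) is not None

print(in_dict, via_getattr, missing)
```

In this sketch the membership test misses the lazily provided attribute while the `getattr` probe finds it, and a genuinely absent name is reported as missing by both. The `getattr` form also treats an attribute that is present but set to `None` as unavailable, which the membership test would not.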
[llvm-branch-commits] ELF: Remove lock from MTE global relocation handling code. (PR #135123)
https://github.com/pcc updated https://github.com/llvm/llvm-project/pull/135123
[llvm-branch-commits] [llvm] CodeGen: Trim redundant template argument from defusechain_iterator (PR #135024)
https://github.com/qcolombet approved this pull request. https://github.com/llvm/llvm-project/pull/135024
[llvm-branch-commits] [clang] [llvm] release/20.x: [fatlto] Add coroutine passes when using FatLTO with ThinLTO (#134434) (PR #134711)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/134711
[llvm-branch-commits] [compiler-rt] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)
https://github.com/pcc edited https://github.com/llvm/llvm-project/pull/133530
[llvm-branch-commits] [polly] [Polly] Introduce PhaseManager and remove LPM support (PR #125442)
@@ -16,105 +16,50 @@
 //===--===//
 #include "polly/CodePreparation.h"
-#include "polly/LinkAllPasses.h"
 #include "polly/Support/ScopHelper.h"
 #include "llvm/Analysis/DominanceFrontier.h"
 #include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/RegionInfo.h"
 #include "llvm/Analysis/ScalarEvolution.h"
-#include "llvm/InitializePasses.h"
 using namespace llvm;
 using namespace polly;
-namespace {
-
-/// Prepare the IR for the scop detection.
-///
-class CodePreparation final : public FunctionPass {
- CodePreparation(const CodePreparation &) = delete;
- const CodePreparation &operator=(const CodePreparation &) = delete;
-
- LoopInfo *LI;
- ScalarEvolution *SE;
-
- void clear();
-
-public:
- static char ID;
-
- explicit CodePreparation() : FunctionPass(ID) {}
- ~CodePreparation();
-
- /// @name FunctionPass interface.
- //@{
- void getAnalysisUsage(AnalysisUsage &AU) const override;
- void releaseMemory() override;
- bool runOnFunction(Function &F) override;
- void print(raw_ostream &OS, const Module *) const override;
- //@}
-};
-} // namespace
-
-PreservedAnalyses CodePreparationPass::run(Function &F,
- FunctionAnalysisManager &FAM) {
-
+static bool runCodePreprationImpl(Function &F, DominatorTree *DT, LoopInfo *LI,
+ RegionInfo *RI) {
 // Find first non-alloca instruction. Every basic block has a non-alloca
 // instruction, as every well formed basic block has a terminator.
 auto &EntryBlock = F.getEntryBlock();
 BasicBlock::iterator I = EntryBlock.begin();
 while (isa(I))
 ++I;
- auto &DT = FAM.getResult(F);
- auto &LI = FAM.getResult(F);
+ // Abort if not necessary to split
+ if (I->isTerminator() && isa(I) &&
+ cast(I)->isUnconditional())
+return false;
 // splitBlock updates DT, LI and RI.
- splitEntryBlockForAlloca(&EntryBlock, &DT, &LI, nullptr);

kartcq wrote:

Can we please move the CodePreparation pass changes to a separate commit? This will make these changes more trackable.

https://github.com/llvm/llvm-project/pull/125442
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -2,6 +2,7 @@
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVE1
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVED
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -vectorizer-maximize-bandwidth -S < %s | FileCheck %s --check-prefixes=CHECK-MAXBW
+; RUN: opt -passes=loop-vectorize -debug-only=loop-vectorize --disable-output -S < %s 2>&1 | FileCheck %s --check-prefix=CHECK-REGS

david-arm wrote:

Still missing a `REQUIRES: asserts`

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [llvm] [SSAUpdaterBulk] Add PHI simplification pass. (PR #135180)
https://github.com/vpykhtin edited https://github.com/llvm/llvm-project/pull/135180
[llvm-branch-commits] [llvm] LICM: Avoid looking at use list of constant data (PR #134690)
github-actions[bot] wrote:

:warning: undef deprecator found issues in your code. :warning:

You can test this locally with the following command:

```bash
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/lib/Transforms/Scalar/LICM.cpp llvm/test/CodeGen/AMDGPU/swdev380865.ll llvm/test/CodeGen/PowerPC/pr43527.ll llvm/test/CodeGen/PowerPC/pr48519.ll llvm/test/CodeGen/PowerPC/sms-grp-order.ll llvm/test/Transforms/LICM/pr50367.ll llvm/test/Transforms/LICM/pr59324.ll
```

The following files introduce new uses of undef:

- llvm/test/CodeGen/PowerPC/sms-grp-order.ll

[Undef](https://llvm.org/docs/LangRef.html#undefined-values) is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields `undef`. You should use `poison` values for placeholders instead.

In tests, avoid using `undef` and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

```llvm
define void @fn() {
  ...
  br i1 undef, ...
}
```

Please use the following instead:

```llvm
define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}
```

Please refer to the [Undefined Behavior Manual](https://llvm.org/docs/UndefinedBehavior.html) for more information.

https://github.com/llvm/llvm-project/pull/134690
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -239,6 +298,63 @@ class GOFFWriter {
 GOFFWriter::GOFFWriter(raw_pwrite_stream &OS, MCAssembler &Asm)
 : OS(OS), Asm(Asm) {}
+void GOFFWriter::defineSectionSymbols(const MCSectionGOFF &Section) {
+ if (Section.isSD()) {
+GOFFSymbol SD(Section.getName(), Section.getId(),
+ Section.getSDAttributes());
+writeSymbol(SD);
+ }
+
+ if (Section.isED()) {
+GOFFSymbol ED(Section.getName(), Section.getId(),
+ Section.getParent()->getId(), Section.getEDAttributes());
+if (Section.requiresLength())
+ ED.SectionLength = Asm.getSectionAddressSize(Section);
+writeSymbol(ED);
+ }
+
+ if (Section.isPR()) {
+GOFFSymbol PR(Section.getName(), Section.getId(),
+ Section.getParent()->getId(), Section.getPRAttributes());
+PR.SectionLength = Asm.getSectionAddressSize(Section);
+if (Section.requiresNonZeroLength()) {

redstar wrote:

> That is a simple solution, too. I am not sure if this works with the HLASM
> output.

Ok, I contradict myself. Setting the ADA to null gets around the binder error, but other parts assume that there is always an ADA, e.g. when calling an external function.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] AArch64: Relax x16/x17 constraint on AUT in certain cases. (PR #132857)
@@ -191,16 +201,27 @@ define void @test_tailcall_omit_mov_x16_x16(ptr %objptr) #0 {
 define i32 @test_call_omit_extra_moves(ptr %objptr) #0 {
 ; CHECK-LABEL: test_call_omit_extra_moves:
 ; DARWIN-NEXT: stp x29, x30, [sp, #-16]!
-; ELF-NEXT: str x30, [sp, #-16]!
-; CHECK-NEXT: ldr x16, [x0]
-; CHECK-NEXT: mov x17, x0
-; CHECK-NEXT: movk x17, #6503, lsl #48
-; CHECK-NEXT: autda x16, x17
-; CHECK-NEXT: ldr x8, [x16]
-; CHECK-NEXT: movk x16, #34646, lsl #48
-; CHECK-NEXT: blraa x8, x16
-; CHECK-NEXT: mov w0, #42
+; DARWIN-NEXT: ldr x16, [x0]
+; DARWIN-NEXT: mov x17, x0
+; DARWIN-NEXT: movk x17, #6503, lsl #48
+; DARWIN-NEXT: autda x16, x17
+; DARWIN-NEXT: ldr x8, [x16]
+; DARWIN-NEXT: movk x16, #34646, lsl #48
+; DARWIN-NEXT: blraa x8, x16
+; DARWIN-NEXT: mov w0, #42
 ; DARWIN-NEXT: ldp x29, x30, [sp], #16
+; ELF-NEXT: str x30, [sp, #-16]!
+; ELF-NEXT: ldr x8, [x0]
+; ELF-NEXT: mov x9, x0
+; ELF-NEXT: movk x9, #6503, lsl #48
+; ELF-NEXT: autda x8, x9
+; ELF-NEXT: ldr x9, [x8]
+; FIXME: Get rid of the x16/x17 constraint on non-Darwin so we can eliminate
+; this mov.
+; ELF-NEXT: mov x17, x8
+; ELF-NEXT: movk x17, #34646, lsl #48
+; ELF-NEXT: blraa x9, x17
+; ELF-NEXT: mov w0, #42
 ; ELF-NEXT: ldr x30, [sp], #16
 ; CHECK-NEXT: ret

atrosinenko wrote:

Sorry, didn't notice that one of the instructions is a call.

https://github.com/llvm/llvm-project/pull/132857
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
https://github.com/smithp35 commented:

How does this work in the non-PIE (PDE) case when we take the address of an ifunc and pass it to a function in a shared library, which then compares the argument with its own global that takes the address of the ifunc? For example:

shared lib
```
typedef void Fptr(void);
extern void ifn(void);
// take address of ifunc ifn defined in application
Fptr* ifp = &ifn;
// compare address of ifn we have calculated in ifp vs
// address calculated by application, passed in fp1.
int compare(Fptr* fp1) { return fp1 == ifp; }
```

App
```
typedef void Fptr(void);
extern int compare(Fptr* fp1);
int val = 0;
static void impl(void) { val = 42; }
static void *resolver(void) { return impl; }
__attribute__((ifunc("resolver"))) void *ifn();
extern Fptr* fp;
int main(void) { return compare(fp); }

// separate file so compiler is unaware ifn is an ifunc.
typedef void Fptr(void);
extern void ifn(void);
Fptr* fp = &ifn;
```

Right now in the application lld produces an iPLT entry for `ifn`, with `fp` pointing to the iPLT entry. The dynamic symbol table contains the address of the iPLT entry with type STT_FUNC. The shared library's pointer and the argument compare equal.

As I understand it, this patch will change `fp` to point directly to the result of the ifunc resolver. So unless we also change the value put into the dynamic symbol table, we'll stop comparing equal. I don't think there's a STT_FUNC symbol we can put in the dynamic symbol table that holds the result of the ifunc resolver.

GNU ld puts the address of the resolver function with a STT_GNU_IFUNC symbol type in the dynamic symbol table. If that causes the dynamic loader to call the resolver and replace the value with the result, then that would work. I haven't had time to check what glibc does though.

I'll put some more general comments below. Didn't want to make this one too long.

https://github.com/llvm/llvm-project/pull/133531
[llvm-branch-commits] [lld] ELF: Only rewrite non-preemptible IFUNCs to IPLT functions if a non-IRELATIVE relocation is needed. (PR #133531)
smithp35 wrote:

I have some small reservations about using ifunc resolvers like this. Mostly in that we are using a mechanism invented for a different purpose, and relying on some specific linker behaviour to make this case work. This is similar to comments made in the Discourse post https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/9 but repeating them here as this is closest to the implementation.

As I understand it, this has a more limited and more specific use case than ifuncs. Traditional ifuncs can be address-taken or called, possibly in multiple ways, so it makes sense to use a symbol type STT_GNU_IFUNC rather than special relocation directives. The initializers for structure field protection are compiler generated, cannot be legally called or address-taken from user code, and only have one relocation type R_*_ABS64 (or 32 on a 32-bit platform).

With the addition of a single relocation, something like R_*_ADDRINIT64, which would target a STT_FUNC resolver symbol, we can isolate the structure field initialization use case from an actual ifunc.

I guess it all comes down to whether structure field initialization needs, or benefits from, being distinguished from an ifunc. Ifuncs seem to be quite easy to get wrong, so being able to isolate this case has some attraction to me at least. It also handles the structure field that points to an ifunc relatively gracefully.

As you pointed out in your response, this does mean adding 2 relocations to every psABI that supports structure field protection rather than just one. Although I'd expect the alternative of having relocations that alternatively write "directly call" ifunc resolver or take address of function might require new relocations too?

https://github.com/llvm/llvm-project/pull/133531
[llvm-branch-commits] [clang] [clang] implement printing of canonical template arguments of expression kind (PR #135133)
llvmbot wrote:

@llvm/pr-subscribers-debuginfo

Author: Matheus Izvekov (mizvekov)

Changes

This patch extends the canonicalization printing policy to cover expressions and template names, and wires that up to the template argument printer, covering expressions.

This is helpful for debugging, or if these template arguments somehow end up in diagnostics, as without this patch they can print as completely unrelated expressions, which can be quite confusing.

This is because expressions are not uniqued, unlike types, and when a template specialization containing an expression is the first to be canonicalized, the expression ends up appearing in the canonical type of subsequent equivalent specializations.

Fixes https://github.com/llvm/llvm-project/issues/92292

---

Patch is 48.39 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/135133.diff

12 Files Affected:

- (modified) clang/include/clang/AST/PrettyPrinter.h (+3-3)
- (modified) clang/lib/AST/DeclPrinter.cpp (+2-2)
- (modified) clang/lib/AST/JSONNodeDumper.cpp (+2)
- (modified) clang/lib/AST/StmtPrinter.cpp (+5-1)
- (modified) clang/lib/AST/TemplateBase.cpp (+5-2)
- (modified) clang/lib/AST/TemplateName.cpp (+8-2)
- (modified) clang/lib/AST/TextNodeDumper.cpp (+2)
- (modified) clang/lib/AST/TypePrinter.cpp (+4-5)
- (modified) clang/lib/CodeGen/CGDebugInfo.cpp (+1-1)
- (modified) clang/lib/Sema/SemaTemplate.cpp (+1-1)
- (modified) clang/test/AST/ast-dump-templates.cpp (+1022)
- (modified) clang/unittests/AST/TypePrinterTest.cpp (+1-1)

```diff
diff --git a/clang/include/clang/AST/PrettyPrinter.h b/clang/include/clang/AST/PrettyPrinter.h
index 91818776b770c..5a98ae1987b16 100644
--- a/clang/include/clang/AST/PrettyPrinter.h
+++ b/clang/include/clang/AST/PrettyPrinter.h
@@ -76,7 +76,7 @@ struct PrintingPolicy {
 MSWChar(LO.MicrosoftExt && !LO.WChar), IncludeNewlines(true),
 MSVCFormatting(false), ConstantsAsWritten(false),
 SuppressImplicitBase(false), FullyQualifiedName(false),
-PrintCanonicalTypes(false), PrintInjectedClassNameWithArguments(true),
+PrintAsCanonical(false), PrintInjectedClassNameWithArguments(true),
 UsePreferredNames(true), AlwaysIncludeTypeForTemplateArgument(false),
 CleanUglifiedParameters(false), EntireContentsOfLargeArray(true),
 UseEnumerators(true), UseHLSLTypes(LO.HLSL) {}
@@ -310,9 +310,9 @@ struct PrintingPolicy {
 LLVM_PREFERRED_TYPE(bool)
 unsigned FullyQualifiedName : 1;
- /// Whether to print types as written or canonically.
+ /// Whether to print entities as written or canonically.
 LLVM_PREFERRED_TYPE(bool)
- unsigned PrintCanonicalTypes : 1;
+ unsigned PrintAsCanonical : 1;
 /// Whether to print an InjectedClassNameType with template arguments or as
 /// written. When a template argument is unnamed, printing it results in
diff --git a/clang/lib/AST/DeclPrinter.cpp b/clang/lib/AST/DeclPrinter.cpp
index 28098b242d494..22da5bf251ecd 100644
--- a/clang/lib/AST/DeclPrinter.cpp
+++ b/clang/lib/AST/DeclPrinter.cpp
@@ -735,7 +735,7 @@ void DeclPrinter::VisitFunctionDecl(FunctionDecl *D) {
 llvm::raw_string_ostream POut(Proto);
 DeclPrinter TArgPrinter(POut, SubPolicy, Context, Indentation);
 const auto *TArgAsWritten = D->getTemplateSpecializationArgsAsWritten();
-if (TArgAsWritten && !Policy.PrintCanonicalTypes)
+if (TArgAsWritten && !Policy.PrintAsCanonical)
 TArgPrinter.printTemplateArguments(TArgAsWritten->arguments(), nullptr);
 else if (const TemplateArgumentList *TArgs =
 D->getTemplateSpecializationArgs())
@@ -1124,7 +1124,7 @@ void DeclPrinter::VisitCXXRecordDecl(CXXRecordDecl *D) {
 S->getSpecializedTemplate()->getTemplateParameters();
 const ASTTemplateArgumentListInfo *TArgAsWritten =
 S->getTemplateArgsAsWritten();
- if (TArgAsWritten && !Policy.PrintCanonicalTypes)
+ if (TArgAsWritten && !Policy.PrintAsCanonical)
 printTemplateArguments(TArgAsWritten->arguments(), TParams);
 else
 printTemplateArguments(S->getTemplateArgs().asArray(), TParams);
diff --git a/clang/lib/AST/JSONNodeDumper.cpp b/clang/lib/AST/JSONNodeDumper.cpp
index 3420c1f343cf5..725db93b558f6 100644
--- a/clang/lib/AST/JSONNodeDumper.cpp
+++ b/clang/lib/AST/JSONNodeDumper.cpp
@@ -1724,6 +1724,8 @@ void JSONNodeDumper::VisitTemplateExpansionTemplateArgument(
 void JSONNodeDumper::VisitExpressionTemplateArgument(
 const TemplateArgument &TA) {
 JOS.attribute("isExpr", true);
+ if (TA.isCanonicalExpr())
+JOS.attribute("isCanon", true);
 }
 void JSONNodeDumper::VisitPackTemplateArgument(const TemplateArgument &TA) {
 JOS.attribute("isPack", true);
diff --git a/clang/lib/AST/StmtPrinter.cpp b/clang/lib/AST/StmtPrinter.cpp
index dbe2432d5c799..aae10fd3bd885 100644
--- a/clang/lib/AST/StmtPrinter.cpp
+++ b/clang/lib/AST/StmtPrinter.cpp
@@ -1305,9 +1305,13 @@ void StmtPrinter::VisitDe
```
[llvm-branch-commits] [llvm] [GOFF] Add writing of section symbols (PR #133799)
@@ -16,6 +16,9 @@ namespace llvm {
 class GOFFObjectWriter;
 class MCGOFFStreamer : public MCObjectStreamer {
+ std::string RootSDName;
+ std::string ADAPRName;

uweigand wrote:

These are no longer used, I think.

https://github.com/llvm/llvm-project/pull/133799
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job
 RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe
 RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR
-CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation.
+CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events

paschalis-mpeis wrote:

I realized I didn't include proper context in my previous comment about the **'fragility'**:

The reason for this fragility is the version of `perf` being used. Since `perf2bolt` is a wrapper over `perf`, older kernel versions may lack `brstack` support. In those cases `perf2bolt` would eventually return an error.

So here we intentionally ignore whether `perf2bolt` fails, and instead we only check that its original intent was to parse the SPE data, e.g.:

> PERF2BOLT: spawning perf job to read SPE brstack events

This should avoid flakiness in tests.

https://github.com/llvm/llvm-project/pull/129231
[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)
@@ -46,6 +46,11 @@ define i1 @select_exit_cond(ptr %start, ptr %end, i64 %N) {
 ; CHECK-NEXT: [[STEP_ADD_5:%.*]] = add <2 x i64> [[STEP_ADD_4]], splat (i64 2)
 ; CHECK-NEXT: [[STEP_ADD_6:%.*]] = add <2 x i64> [[STEP_ADD_5]], splat (i64 2)
 ; CHECK-NEXT: [[STEP_ADD_7:%.*]] = add <2 x i64> [[STEP_ADD_6]], splat (i64 2)
+; CHECK-NEXT: [[STEP_ADD_8:%.*]] = add <2 x i64> [[STEP_ADD_7]], splat (i64 2)

david-arm wrote:

I'm a bit surprised these are the only CHECK lines that have changed.

https://github.com/llvm/llvm-project/pull/133090
[llvm-branch-commits] [mlir] [mlir][LLVM] Delete `getFixedVectorType` and `getScalableVectorType` (PR #135051)
https://github.com/matthias-springer created https://github.com/llvm/llvm-project/pull/135051 The LLVM dialect no longer has its own vector types. It uses `mlir::VectorType` everywhere. Remove `LLVM::getFixedVectorType/getScalableVectorType` and use `VectorType::get` instead. This commit addresses a [comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500) on the PR that deleted the LLVM vector types. Depends on #134981. >From 8b0377dedb64a3992ef9cf88144a13df797cd52d Mon Sep 17 00:00:00 2001 From: Matthias Springer Date: Wed, 9 Apr 2025 19:05:11 +0200 Subject: [PATCH] [mlir][LLVM] Delete `getFixedVectorType` and `getScalableVectorType` --- mlir/docs/Dialects/LLVM.md| 4 --- mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h | 8 - .../Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp| 33 +-- mlir/lib/Dialect/LLVMIR/IR/LLVMTypes.cpp | 12 --- mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp| 23 - mlir/lib/Dialect/NVGPU/Utils/MMAUtils.cpp | 24 ++ mlir/lib/Target/LLVMIR/TypeFromLLVM.cpp | 9 ++--- 7 files changed, 45 insertions(+), 68 deletions(-) diff --git a/mlir/docs/Dialects/LLVM.md b/mlir/docs/Dialects/LLVM.md index 468f69c419071..4b5d518ca4eab 100644 --- a/mlir/docs/Dialects/LLVM.md +++ b/mlir/docs/Dialects/LLVM.md @@ -336,10 +336,6 @@ compatible with the LLVM dialect: vector type compatible with the LLVM dialect; - `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number of elements in any vector type compatible with the LLVM dialect; -- `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type -with the given element type and size; the resulting type is either a -built-in or an LLVM dialect vector type depending on which one supports the -given element type. 
 Examples of Compatible Vector Types

diff --git a/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h b/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
index a2a76c49a2bda..17561f79d135a 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
+++ b/mlir/include/mlir/Dialect/LLVMIR/LLVMTypes.h
@@ -126,14 +126,6 @@ Type getVectorType(Type elementType, unsigned numElements,
 /// and length.
 Type getVectorType(Type elementType, const llvm::ElementCount &numElements);
 
-/// Creates an LLVM dialect-compatible type with the given element type and
-/// length.
-Type getFixedVectorType(Type elementType, unsigned numElements);
-
-/// Creates an LLVM dialect-compatible type with the given element type and
-/// length.
-Type getScalableVectorType(Type elementType, unsigned numElements);
-
 /// Returns the size of the given primitive LLVM dialect-compatible type
 /// (including vectors) in bits, for example, the size of i16 is 16 and
 /// the size of vector<4xi16> is 64. Returns 0 for non-primitive
diff --git a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
index 51507c6507b69..e144a8063ae31 100644
--- a/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
+++ b/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp
@@ -61,13 +61,13 @@ static Value truncToI32(ImplicitLocOpBuilder &b, Value value) {
 static Type inferIntrinsicResultType(Type vectorResultType) {
   MLIRContext *ctx = vectorResultType.getContext();
   auto a = cast(vectorResultType);
-  auto f16x2Ty = LLVM::getFixedVectorType(Float16Type::get(ctx), 2);
+  auto f16x2Ty = VectorType::get(2, Float16Type::get(ctx));
   auto i32Ty = IntegerType::get(ctx, 32);
-  auto i32x2Ty = LLVM::getFixedVectorType(i32Ty, 2);
+  auto i32x2Ty = VectorType::get(2, i32Ty);
   Type f64Ty = Float64Type::get(ctx);
-  Type f64x2Ty = LLVM::getFixedVectorType(f64Ty, 2);
+  Type f64x2Ty = VectorType::get(2, f64Ty);
   Type f32Ty = Float32Type::get(ctx);
-  Type f32x2Ty = LLVM::getFixedVectorType(f32Ty, 2);
+  Type f32x2Ty = VectorType::get(2, f32Ty);
   if (a.getElementType() == f16x2Ty) {
     return LLVM::LLVMStructType::getLiteral(
         ctx, SmallVector(a.getNumElements(), f16x2Ty));
@@ -85,7 +85,7 @@ static Type inferIntrinsicResultType(Type vectorResultType) {
         ctx, SmallVector(static_cast(a.getNumElements()) * 2, f32Ty));
   }
-  if (a.getElementType() == LLVM::getFixedVectorType(f32Ty, 1)) {
+  if (a.getElementType() == VectorType::get(1, f32Ty)) {
     return LLVM::LLVMStructType::getLiteral(
         ctx, SmallVector(static_cast(a.getNumElements()), f32Ty));
   }
@@ -106,11 +106,11 @@ static Value convertIntrinsicResult(Location loc, Type intrinsicResultType,
   Type i32Ty = rewriter.getI32Type();
   Type f32Ty = rewriter.getF32Type();
   Type f64Ty = rewriter.getF64Type();
-  Type f16x2Ty = LLVM::getFixedVectorType(rewriter.getF16Type(), 2);
-  Type i32x2Ty = LLVM::getFixedVectorType(i32Ty, 2);
-  Type f64x2Ty = LLVM::getFixedVectorType(f64Ty, 2);
-  Type f32x2Ty = LLVM::getFixedVectorType(f32Ty, 2);
-  Type f32x1Ty = LLVM::getFixedVectorType(f32Ty, 1);
+  Type f16x2Ty =
[llvm-branch-commits] [clang-tools-extra] [clang-tidy] `matchesAnyListedTypeName` support non canonical types (PR #134869)
https://github.com/HerrCai0907 ready_for_review https://github.com/llvm/llvm-project/pull/134869