[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/ChuanqiXu9 edited https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,62 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); ChuanqiXu9 wrote: If I am not mistaken, the indentation does not look right here. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); ChuanqiXu9 wrote: I think this should be marked as internal, like the .resume and .destroy functions are. While LTO can import them, I don't feel we would benefit much from that. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -63,6 +63,13 @@ suspend: ; CHECK-NOT: call void @free( ; CHECK: ret void +; CHECK-LABEL: @f.noalloc({{.*}}) ChuanqiXu9 wrote: I think it is worth listing the arguments explicitly. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1967,22 +2047,13 @@ splitCoroutine(Function &F, SmallVectorImpl &Clones, for (DbgVariableRecord *DVR : DbgVariableRecords) coro::salvageDebugInfo(ArgToAllocaMap, *DVR, Shape.OptimizeFrame, false /*UseEntryValue*/); - return Shape; -} -/// Remove calls to llvm.coro.end in the original function. -static void removeCoroEndsFromRampFunction(const coro::Shape &Shape) { - if (Shape.ABI != coro::ABI::Switch) { -for (auto *End : Shape.CoroEnds) { - replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, nullptr); -} - } else { -for (llvm::AnyCoroEndInst *End : Shape.CoroEnds) { - auto &Context = End->getContext(); - End->replaceAllUsesWith(ConstantInt::getFalse(Context)); - End->eraseFromParent(); -} + removeCoroEndsFromRampFunction(Shape); + + if (!isNoSuspendCoroutine && Shape.ABI == coro::ABI::Switch) { ChuanqiXu9 wrote: Agreed. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
https://github.com/ChuanqiXu9 commented: We need to update coroutines.rst for such changes. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -5,7 +5,7 @@ define nonnull ptr @f(i32 %n) presplitcoroutine { ; CHECK-LABEL: @f( ; CHECK-NEXT: entry: -; CHECK-NEXT:[[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @f.resumers) +; CHECK-NEXT:[[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @{{.*}}) ChuanqiXu9 wrote: Are these changes to `@llvm.coro.id` really related to this patch? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,62 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, ChuanqiXu9 wrote: We need a comment and an example for this function. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/ChuanqiXu9 commented: I did a quick scan and will put my comments in https://github.com/llvm/llvm-project/pull/99283#pullrequestreview-822800 https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
https://github.com/ChuanqiXu9 edited https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Transform "coro_must_elide" calls to switch ABI coroutines to the `noalloc` variant (PR #99285)
@@ -0,0 +1,136 @@ +//===- CoroSplit.cpp - Converts a coroutine into a state machine --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +//===--===// + +#include "llvm/Transforms/Coroutines/CoroAnnotationElide.h" + +#include "llvm/Analysis/LazyCallGraph.h" +#include "llvm/Analysis/OptimizationRemarkEmitter.h" +#include "llvm/IR/Analysis.h" +#include "llvm/IR/IRBuilder.h" +#include "llvm/IR/InstIterator.h" +#include "llvm/IR/Instruction.h" +#include "llvm/IR/Module.h" +#include "llvm/IR/PassManager.h" +#include "llvm/Transforms/Utils/CallGraphUpdater.h" + +#include + +using namespace llvm; + +#define DEBUG_TYPE "coro-annotation-elide" + +#define CORO_MUST_ELIDE_ANNOTATION "coro_must_elide" + +static Instruction *getFirstNonAllocaInTheEntryBlock(Function *F) { + for (Instruction &I : F->getEntryBlock()) +if (!isa(&I)) + return &I; + llvm_unreachable("no terminator in the entry block"); +} + +static Value *allocateFrameInCaller(Function *Caller, uint64_t FrameSize, +Align FrameAlign) { + LLVMContext &C = Caller->getContext(); + BasicBlock::iterator InsertPt = + getFirstNonAllocaInTheEntryBlock(Caller)->getIterator(); + const DataLayout &DL = Caller->getDataLayout(); + auto FrameTy = ArrayType::get(Type::getInt8Ty(C), FrameSize); + auto *Frame = new AllocaInst(FrameTy, DL.getAllocaAddrSpace(), "", InsertPt); + Frame->setAlignment(FrameAlign); + return new BitCastInst(Frame, PointerType::getUnqual(C), "vFrame", InsertPt); +} + +static void processCall(CallBase *CB, Function *Caller, Function *NewCallee, ChuanqiXu9 wrote: We really need comments for this function; we can't understand it from its name. https://github.com/llvm/llvm-project/pull/99285 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
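For readers following the review, here is a rough plain-C++ analogy of what `allocateFrameInCaller` arranges in IR: the caller reserves an opaque, suitably aligned byte buffer on its own stack and hands a pointer to the callee. This is only a sketch under stated assumptions. `kFrameSize`, `kFrameAlign`, `callee_noalloc`, and `caller` are hypothetical names and values, not anything taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical frame size and alignment. In the patch these arrive as the
// FrameSize / FrameAlign parameters; the numbers here are made up.
constexpr std::size_t kFrameSize = 24;
constexpr std::size_t kFrameAlign = 8;

// Hypothetical ".noalloc"-style callee: it stores its state in storage the
// caller provides and never allocates memory itself.
void *callee_noalloc(int n, void *frame) {
  std::memcpy(frame, &n, sizeof n); // spill the argument into the frame
  return frame;                     // the frame pointer doubles as the handle
}

// Caller side: an opaque byte array with the requested alignment, analogous
// to an i8-array alloca followed by setAlignment.
int caller(int n) {
  alignas(kFrameAlign) unsigned char frame[kFrameSize] = {};
  void *handle = callee_noalloc(n, frame);
  assert(reinterpret_cast<std::uintptr_t>(handle) % kFrameAlign == 0);
  int stored = 0;
  std::memcpy(&stored, handle, sizeof stored); // read the spilled value back
  return stored;
}
```

The buffer is deliberately typeless, mirroring the byte-array frame type in the quoted code: the caller only needs to know the size and alignment, not the frame layout.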
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
ChuanqiXu9 wrote: After taking a quick look at https://github.com/llvm/llvm-project/pull/99285, I feel this is not what I had in mind when I heard the idea. Correct me if I misunderstand it. The problems are: 1. It looks like the `.noalloc` variants are always emitted. This is not good: it emits duplicated code (especially since their linkage is currently not internal). What I had in mind is to create the `.noalloc` variant on demand, i.e. lazily. 2. The original ramp function and the `.noalloc` variant share a lot of code, if I am not mistaken, which is bad for code size. What I had in mind is that, after we generate the `.noalloc` variant, we also rewrite the ramp function so that it allocates the coroutine frame and then calls the `.noalloc` function. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
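The layering proposed in point 2 can be illustrated in plain C++ rather than LLVM IR. This is a minimal sketch, assuming a made-up frame layout: the ramp function only allocates the frame and forwards to the `.noalloc` body, so both entry points share one copy of the initialization code. `Frame`, `f_noalloc`, and `f` are hypothetical stand-ins, not what CoroSplit actually generates.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a coroutine frame; the real layout is computed
// by the coroutine passes and is not modeled here.
struct Frame {
  int n = 0;            // spilled argument
  int resumeIndex = -1; // which suspend point to resume at
};

// ".noalloc" variant: takes the original argument plus a pointer to
// caller-provided frame storage, and never allocates.
Frame *f_noalloc(int n, Frame *frame) {
  assert(frame && "caller must provide frame storage");
  frame->n = n;
  frame->resumeIndex = 0;
  return frame;
}

// Ramp function rewritten as a thin wrapper: allocate, then delegate.
std::unique_ptr<Frame> f(int n) {
  auto frame = std::make_unique<Frame>();
  f_noalloc(n, frame.get());
  return frame;
}
```

A call site that is known to be elidable would call `f_noalloc` directly with a `Frame` on its own stack, while every other caller keeps using `f`. The frame setup logic exists only once in either case.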
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Add streaming-mode stack hazard optimization remarks (#101695) (PR #102168)
https://github.com/davemgreen approved this pull request. https://github.com/llvm/llvm-project/pull/102168 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Support map other function entry address (#101466) (PR #102282)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102282 Backport 734c048 Requested by: @linsinan1995 >From e362039d6b069b4d7668371f13a94271cb70c52c Mon Sep 17 00:00:00 2001 From: sinan Date: Wed, 7 Aug 2024 15:57:25 +0800 Subject: [PATCH] [BOLT] Support map other function entry address (#101466) Allow BOLT to map the old address to a new binary address if the old address is the entry of the function. (cherry picked from commit 734c0488b6e69300adaf568f880f40b113ae02ca) --- bolt/lib/Rewrite/RewriteInstance.cpp| 8 +++ bolt/test/X86/dynamic-relocs-on-entry.s | 32 + 2 files changed, 40 insertions(+) create mode 100644 bolt/test/X86/dynamic-relocs-on-entry.s diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp index 33ebae3b6e6de..2e93b6576edad 100644 --- a/bolt/lib/Rewrite/RewriteInstance.cpp +++ b/bolt/lib/Rewrite/RewriteInstance.cpp @@ -5498,6 +5498,14 @@ uint64_t RewriteInstance::getNewFunctionOrDataAddress(uint64_t OldAddress) { if (const BinaryFunction *BF = BC->getBinaryFunctionContainingAddress(OldAddress)) { if (BF->isEmitted()) { + // If OldAddress is the another entry point of + // the function, then BOLT could get the new address. + if (BF->isMultiEntry()) { +for (const BinaryBasicBlock &BB : *BF) + if (BB.isEntryPoint() && + (BF->getAddress() + BB.getOffset()) == OldAddress) +return BF->getOutputAddress() + BB.getOffset(); + } BC->errs() << "BOLT-ERROR: unable to get new address corresponding to " "input address 0x" << Twine::utohexstr(OldAddress) << " in function " << *BF diff --git a/bolt/test/X86/dynamic-relocs-on-entry.s b/bolt/test/X86/dynamic-relocs-on-entry.s new file mode 100644 index 0..2a29a43c4939a --- /dev/null +++ b/bolt/test/X86/dynamic-relocs-on-entry.s @@ -0,0 +1,32 @@ +// This test examines whether BOLT can correctly process when +// dynamic relocation points to other entry points of the +// function. 
+ +# RUN: %clang %cflags -fPIC -pie %s -o %t.exe -nostdlib -Wl,-q +# RUN: llvm-bolt %t.exe -o %t.bolt > %t.out.txt +# RUN: readelf -r %t.bolt >> %t.out.txt +# RUN: llvm-objdump --disassemble-symbols=chain %t.bolt >> %t.out.txt +# RUN: FileCheck %s --input-file=%t.out.txt + +## Check if the new address in `chain` is correctly updated by BOLT +# CHECK: Relocation section '.rela.dyn' at offset 0x{{.*}} contains 1 entry: +# CHECK: {{.*}} R_X86_64_RELATIVE [[#%x,ADDR:]] +# CHECK: [[#ADDR]]: c3 retq + .text + .type chain, @function +chain: + movq$1, %rax +Label: + ret + .size chain, .-chain + + .type _start, @function + .global _start +_start: + jmpq*.Lfoo(%rip) + ret + .size _start, .-_start + + .data +.Lfoo: + .quad Label \ No newline at end of file ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
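The lookup added by this backport can be modeled without any BOLT types. The following is a simplified sketch, not BOLT's API: `Block`, `Func`, and `mapEntryAddress` are hypothetical names. Like the patch, it reuses the block's offset when forming the output address.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified models of a basic block and a multi-entry function.
struct Block {
  uint64_t offset;   // offset of the block from the function start
  bool isEntryPoint; // true if the block is a secondary entry point
};

struct Func {
  uint64_t inputAddress;  // function address in the input binary
  uint64_t outputAddress; // function address in the rewritten binary
  std::vector<Block> blocks;
};

// Maps an old address to the new one only if it names an entry point of fn.
std::optional<uint64_t> mapEntryAddress(const Func &fn, uint64_t oldAddress) {
  assert(!fn.blocks.empty() && "a function has at least one block");
  for (const Block &bb : fn.blocks)
    if (bb.isEntryPoint && fn.inputAddress + bb.offset == oldAddress)
      return fn.outputAddress + bb.offset;
  return std::nullopt; // not an entry point: the caller reports an error
}
```

In the test above, `Label` inside `chain` plays the role of such a secondary entry point: the dynamic relocation for `.Lfoo` refers to it, so its address has to be translated rather than rejected.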
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Support map other function entry address (#101466) (PR #102282)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102282 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Support map other function entry address (#101466) (PR #102282)
llvmbot wrote: @yota9 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102282 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Support map other function entry address (#101466) (PR #102282)
llvmbot wrote: @llvm/pr-subscribers-bolt Author: None (llvmbot) Changes Backport 734c048 Requested by: @linsinan1995 --- Full diff: https://github.com/llvm/llvm-project/pull/102282.diff 2 Files Affected: - (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+8) - (added) bolt/test/X86/dynamic-relocs-on-entry.s (+32) https://github.com/llvm/llvm-project/pull/102282 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Support map other function entry address (#101466) (PR #102282)
https://github.com/yota9 approved this pull request. Bugfix LGTM https://github.com/llvm/llvm-project/pull/102282 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] release/19.x: [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) (PR #102292)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102292 Backport a1c6467bd Requested by: @ostannard >From 15960c61afc58e8ade3ee337a8501abac2f3ae45 Mon Sep 17 00:00:00 2001 From: Oliver Stannard Date: Wed, 7 Aug 2024 10:20:26 +0100 Subject: [PATCH] [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) Previously, we selected the Thumb2 PLT sequences if any input object is marked as not supporting the ARM ISA, which then causes assertion failures when calls from ARM code in other objects are seen. I think the intention here was to only use Thumb PLTs when the target does not have the ARM ISA available, signalled by no objects being marked as having it available. To do that we need to track which ISAs we have seen as we parse the build attributes, and defer the decision about PLTs until all input objects have been parsed. This bug was triggered by real code in picolibc, which have some versions of string.h functions built with Thumb2-only build attributes, so that they are compatible with v7-A, v7-R and v7-M. Fixes #99008. (cherry picked from commit a1c6467bd90905d52cf8f6162b60907f8e98a704) --- lld/ELF/Arch/ARM.cpp | 21 +++-- lld/ELF/Config.h | 3 ++- lld/ELF/InputFiles.cpp| 6 ++--- lld/test/ELF/arm-mixed-plts.s | 44 +++ 4 files changed, 62 insertions(+), 12 deletions(-) create mode 100644 lld/test/ELF/arm-mixed-plts.s diff --git a/lld/ELF/Arch/ARM.cpp b/lld/ELF/Arch/ARM.cpp index 3e0efe540e1bf..07a7535c4a231 100644 --- a/lld/ELF/Arch/ARM.cpp +++ b/lld/ELF/Arch/ARM.cpp @@ -228,10 +228,16 @@ static void writePltHeaderLong(uint8_t *buf) { write32(buf + 16, gotPlt - l1 - 8); } +// True if we should use Thumb PLTs, which currently require Thumb2, and are +// only used if the target does not have the ARM ISA. +static bool useThumbPLTs() { + return config->armHasThumb2ISA && !config->armHasArmISA; +} + // The default PLT header requires the .got.plt to be within 128 Mb of the // .plt in the positive direction. 
void ARM::writePltHeader(uint8_t *buf) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { // The instruction sequence for thumb: // // 0: b500 push{lr} @@ -289,7 +295,7 @@ void ARM::writePltHeader(uint8_t *buf) const { } void ARM::addPltHeaderSymbols(InputSection &isec) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { addSyntheticLocal("$t", STT_NOTYPE, 0, 0, isec); addSyntheticLocal("$d", STT_NOTYPE, 12, 0, isec); } else { @@ -315,7 +321,7 @@ static void writePltLong(uint8_t *buf, uint64_t gotPltEntryAddr, void ARM::writePlt(uint8_t *buf, const Symbol &sym, uint64_t pltEntryAddr) const { - if (!config->armThumbPLTs) { + if (!useThumbPLTs()) { uint64_t offset = sym.getGotPltVA() - pltEntryAddr - 8; // The PLT entry is similar to the example given in Appendix A of ELF for @@ -367,7 +373,7 @@ void ARM::writePlt(uint8_t *buf, const Symbol &sym, } void ARM::addPltSymbols(InputSection &isec, uint64_t off) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { addSyntheticLocal("$t", STT_NOTYPE, off, 0, isec); } else { addSyntheticLocal("$a", STT_NOTYPE, off, 0, isec); @@ -393,7 +399,7 @@ bool ARM::needsThunk(RelExpr expr, RelType type, const InputFile *file, case R_ARM_JUMP24: // Source is ARM, all PLT entries are ARM so no interworking required. // Otherwise we need to interwork if STT_FUNC Symbol has bit 0 set (Thumb). -assert(!config->armThumbPLTs && +assert(!useThumbPLTs() && "If the source is ARM, we should not need Thumb PLTs"); if (s.isFunc() && expr == R_PC && (s.getVA() & 1)) return true; @@ -407,7 +413,8 @@ bool ARM::needsThunk(RelExpr expr, RelType type, const InputFile *file, case R_ARM_THM_JUMP24: // Source is Thumb, when all PLT entries are ARM interworking is required. // Otherwise we need to interwork if STT_FUNC Symbol has bit 0 clear (ARM). 
-if ((expr == R_PLT_PC && !config->armThumbPLTs) || (s.isFunc() && (s.getVA() & 1) == 0)) +if ((expr == R_PLT_PC && !useThumbPLTs()) || +(s.isFunc() && (s.getVA() & 1) == 0)) return true; [[fallthrough]]; case R_ARM_THM_CALL: { @@ -675,7 +682,7 @@ void ARM::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { // PLT entries are always ARM state so we know we need to interwork. assert(rel.sym); // R_ARM_THM_CALL is always reached via relocate(). bool bit0Thumb = val & 1; -bool useThumb = bit0Thumb || config->armThumbPLTs; +bool useThumb = bit0Thumb || useThumbPLTs(); bool isBlx = (read16(loc + 2) & 0x1000) == 0; // lld 10.0 and before always used bit0Thumb when deciding to write a BLX // even when type not STT_FUNC. diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h index 0173be396163e..28726d48e4284 100644 --- a/lld/ELF/Config.h +++ b/lld/EL
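The "accumulate first, decide later" idea from the commit message can be sketched independently of lld. The following assumes a made-up numeric encoding of the build attributes (0 meaning not allowed, 1 or more meaning the ARM ISA is allowed, 2 or more meaning Thumb-2 is allowed). `ObjectAttrs`, `LinkConfig`, and `decideThumbPLTs` are hypothetical names. Only the two boolean fields and `useThumbPLTs` mirror identifiers that appear in the diff.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical per-object build attributes; std::nullopt means "attribute absent".
struct ObjectAttrs {
  std::optional<int> armISAUse;   // assumed: 0 = not allowed, >= 1 = allowed
  std::optional<int> thumbISAUse; // assumed: >= 2 = Thumb-2 allowed
};

// Hypothetical subset of the linker configuration.
struct LinkConfig {
  bool armHasArmISA = false;
  bool armHasThumb2ISA = false;
};

// Called once per input object: only records which ISAs have been seen.
void updateSupportedFeatures(LinkConfig &cfg, const ObjectAttrs &obj) {
  cfg.armHasArmISA |= obj.armISAUse && *obj.armISAUse >= 1;
  cfg.armHasThumb2ISA |= obj.thumbISAUse && *obj.thumbISAUse >= 2;
}

// Thumb PLTs are chosen only if Thumb-2 was seen and the ARM ISA never was.
bool useThumbPLTs(const LinkConfig &cfg) {
  return cfg.armHasThumb2ISA && !cfg.armHasArmISA;
}

// The decision is deferred until every input has been folded into cfg.
bool decideThumbPLTs(const std::vector<ObjectAttrs> &objects) {
  assert(!objects.empty() && "decide only after the inputs have been parsed");
  LinkConfig cfg;
  for (const ObjectAttrs &obj : objects)
    updateSupportedFeatures(cfg, obj);
  return useThumbPLTs(cfg);
}
```

Because the result depends only on the OR of all per-object flags, the order of the inputs no longer matters. That is the property the old single `armThumbPLTs` flag lacked when one Thumb-only object appeared among ARM objects.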
[llvm-branch-commits] [lld] release/19.x: [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) (PR #102292)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102292 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] release/19.x: [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) (PR #102292)
llvmbot wrote: @MaskRay What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102292 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] release/19.x: [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) (PR #102292)
llvmbot wrote: @llvm/pr-subscribers-lld-elf Author: None (llvmbot) Changes Backport a1c6467bd Requested by: @ostannard --- Full diff: https://github.com/llvm/llvm-project/pull/102292.diff 4 Files Affected: - (modified) lld/ELF/Arch/ARM.cpp (+14-7) - (modified) lld/ELF/Config.h (+2-1) - (modified) lld/ELF/InputFiles.cpp (+2-4) - (added) lld/test/ELF/arm-mixed-plts.s (+44) ``diff diff --git a/lld/ELF/Arch/ARM.cpp b/lld/ELF/Arch/ARM.cpp index 3e0efe540e1bf..07a7535c4a231 100644 --- a/lld/ELF/Arch/ARM.cpp +++ b/lld/ELF/Arch/ARM.cpp @@ -228,10 +228,16 @@ static void writePltHeaderLong(uint8_t *buf) { write32(buf + 16, gotPlt - l1 - 8); } +// True if we should use Thumb PLTs, which currently require Thumb2, and are +// only used if the target does not have the ARM ISA. +static bool useThumbPLTs() { + return config->armHasThumb2ISA && !config->armHasArmISA; +} + // The default PLT header requires the .got.plt to be within 128 Mb of the // .plt in the positive direction. void ARM::writePltHeader(uint8_t *buf) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { // The instruction sequence for thumb: // // 0: b500 push{lr} @@ -289,7 +295,7 @@ void ARM::writePltHeader(uint8_t *buf) const { } void ARM::addPltHeaderSymbols(InputSection &isec) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { addSyntheticLocal("$t", STT_NOTYPE, 0, 0, isec); addSyntheticLocal("$d", STT_NOTYPE, 12, 0, isec); } else { @@ -315,7 +321,7 @@ static void writePltLong(uint8_t *buf, uint64_t gotPltEntryAddr, void ARM::writePlt(uint8_t *buf, const Symbol &sym, uint64_t pltEntryAddr) const { - if (!config->armThumbPLTs) { + if (!useThumbPLTs()) { uint64_t offset = sym.getGotPltVA() - pltEntryAddr - 8; // The PLT entry is similar to the example given in Appendix A of ELF for @@ -367,7 +373,7 @@ void ARM::writePlt(uint8_t *buf, const Symbol &sym, } void ARM::addPltSymbols(InputSection &isec, uint64_t off) const { - if (config->armThumbPLTs) { + if (useThumbPLTs()) { 
addSyntheticLocal("$t", STT_NOTYPE, off, 0, isec); } else { addSyntheticLocal("$a", STT_NOTYPE, off, 0, isec); @@ -393,7 +399,7 @@ bool ARM::needsThunk(RelExpr expr, RelType type, const InputFile *file, case R_ARM_JUMP24: // Source is ARM, all PLT entries are ARM so no interworking required. // Otherwise we need to interwork if STT_FUNC Symbol has bit 0 set (Thumb). -assert(!config->armThumbPLTs && +assert(!useThumbPLTs() && "If the source is ARM, we should not need Thumb PLTs"); if (s.isFunc() && expr == R_PC && (s.getVA() & 1)) return true; @@ -407,7 +413,8 @@ bool ARM::needsThunk(RelExpr expr, RelType type, const InputFile *file, case R_ARM_THM_JUMP24: // Source is Thumb, when all PLT entries are ARM interworking is required. // Otherwise we need to interwork if STT_FUNC Symbol has bit 0 clear (ARM). -if ((expr == R_PLT_PC && !config->armThumbPLTs) || (s.isFunc() && (s.getVA() & 1) == 0)) +if ((expr == R_PLT_PC && !useThumbPLTs()) || +(s.isFunc() && (s.getVA() & 1) == 0)) return true; [[fallthrough]]; case R_ARM_THM_CALL: { @@ -675,7 +682,7 @@ void ARM::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const { // PLT entries are always ARM state so we know we need to interwork. assert(rel.sym); // R_ARM_THM_CALL is always reached via relocate(). bool bit0Thumb = val & 1; -bool useThumb = bit0Thumb || config->armThumbPLTs; +bool useThumb = bit0Thumb || useThumbPLTs(); bool isBlx = (read16(loc + 2) & 0x1000) == 0; // lld 10.0 and before always used bit0Thumb when deciding to write a BLX // even when type not STT_FUNC. 
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h index 0173be396163e..28726d48e4284 100644 --- a/lld/ELF/Config.h +++ b/lld/ELF/Config.h @@ -217,7 +217,8 @@ struct Config { bool allowMultipleDefinition; bool fatLTOObjects; bool androidPackDynRelocs = false; - bool armThumbPLTs = false; + bool armHasArmISA = false; + bool armHasThumb2ISA = false; bool armHasBlx = false; bool armHasMovtMovw = false; bool armJ1J2BranchEncoding = false; diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp index f1c0eb292361b..48f5a9609ecfb 100644 --- a/lld/ELF/InputFiles.cpp +++ b/lld/ELF/InputFiles.cpp @@ -203,10 +203,8 @@ static void updateSupportedARMFeatures(const ARMAttributeParser &attributes) { attributes.getAttributeValue(ARMBuildAttrs::ARM_ISA_use); std::optional thumb = attributes.getAttributeValue(ARMBuildAttrs::THUMB_ISA_use); - bool noArmISA = !armISA || *armISA == ARMBuildAttrs::Not_Allowed; - bool hasThumb2 = thumb && *thumb >= ARMBuildAttrs::AllowThumb32; - if (noArmISA && hasThumb2) -config->armThumbPLTs = true; + config->armHasArmISA |= armISA && *armISA >= ARMBuildAttrs::Allowed; + config->armHasThumb2ISA |= thumb && *t
[llvm-branch-commits] [llvm] TTI: Check legalization cost of abs nodes (PR #100523)
https://github.com/RKSimon approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/100523 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) (PR #102295)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102295 Backport 6c8933e Requested by: @linsinan1995 >From f30f9cb6eaee0f364619f47c3ad76066c0907dc6 Mon Sep 17 00:00:00 2001 From: sinan Date: Wed, 7 Aug 2024 18:02:42 +0800 Subject: [PATCH] [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) Take a common weak reference pattern for example ``` __attribute__((weak)) void undef_weak_fun(); if (&undef_weak_fun) undef_weak_fun(); ``` In this case, an undefined weak symbol `undef_weak_fun` has an address of zero, and Bolt incorrectly changes the relocation for the corresponding symbol to symbol@PLT, leading to incorrect runtime behavior. (cherry picked from commit 6c8933e1a095028d648a5a26aecee0f569304dd0) --- bolt/lib/Rewrite/RewriteInstance.cpp | 11 +- .../AArch64/update-weak-reference-symbol.s| 34 +++ 2 files changed, 44 insertions(+), 1 deletion(-) create mode 100644 bolt/test/AArch64/update-weak-reference-symbol.s diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp index 33ebae3b6e6de2..45d594ef3fa6f5 100644 --- a/bolt/lib/Rewrite/RewriteInstance.cpp +++ b/bolt/lib/Rewrite/RewriteInstance.cpp @@ -2143,6 +2143,14 @@ bool RewriteInstance::analyzeRelocation( if (!Relocation::isSupported(RType)) return false; + auto IsWeakReference = [](const SymbolRef &Symbol) { +Expected SymFlagsOrErr = Symbol.getFlags(); +if (!SymFlagsOrErr) + return false; +return (*SymFlagsOrErr & SymbolRef::SF_Undefined) && + (*SymFlagsOrErr & SymbolRef::SF_Weak); + }; + const bool IsAArch64 = BC->isAArch64(); const size_t RelSize = Relocation::getSizeForType(RType); @@ -2174,7 +2182,8 @@ bool RewriteInstance::analyzeRelocation( // Section symbols are marked as ST_Debug. 
IsSectionRelocation = (cantFail(Symbol.getType()) == SymbolRef::ST_Debug); // Check for PLT entry registered with symbol name -if (!SymbolAddress && (IsAArch64 || BC->isRISCV())) { +if (!SymbolAddress && !IsWeakReference(Symbol) && +(IsAArch64 || BC->isRISCV())) { const BinaryData *BD = BC->getPLTBinaryDataByName(SymbolName); SymbolAddress = BD ? BD->getAddress() : 0; } diff --git a/bolt/test/AArch64/update-weak-reference-symbol.s b/bolt/test/AArch64/update-weak-reference-symbol.s new file mode 100644 index 00..600a06b8b6d8fd --- /dev/null +++ b/bolt/test/AArch64/update-weak-reference-symbol.s @@ -0,0 +1,34 @@ +// This test checks whether BOLT can correctly handle relocations against weak symbols. + +// RUN: %clang %cflags -Wl,-z,notext -shared -Wl,-q %s -o %t.so +// RUN: llvm-bolt %t.so -o %t.so.bolt +// RUN: llvm-nm -n %t.so.bolt > %t.out.txt +// RUN: llvm-objdump -dj .rodata %t.so.bolt >> %t.out.txt +// RUN: FileCheck %s --input-file=%t.out.txt + +# CHECK: w func_1 +# CHECK: {{0+}}[[#%x,ADDR:]] W func_2 + +# CHECK: {{.*}} <.rodata>: +# CHECK-NEXT: {{.*}} .word 0x +# CHECK-NEXT: {{.*}} .word 0x +# CHECK-NEXT: {{.*}} .word 0x{{[0]+}}[[#ADDR]] +# CHECK-NEXT: {{.*}} .word 0x + + .text + .weak func_2 + .weak func_1 + .global wow + .type wow, %function +wow: + bl func_1 + bl func_2 + ret + .type func_2, %function +func_2: + ret + .section.rodata +.LC0: + .xword func_1 +.LC1: + .xword func_2 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
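The guard added by this backport reduces to a small predicate on the symbol flags. The sketch below uses made-up flag bits, not the values of `SymbolRef::SF_Undefined` or `SymbolRef::SF_Weak`. `shouldSearchPLT` is a hypothetical helper that restates the patched condition.

```cpp
#include <cstdint>

// Hypothetical flag bits, chosen for illustration only.
enum SymFlags : uint32_t {
  SF_Undefined = 1u << 0,
  SF_Weak = 1u << 1,
};

// A weak reference is a symbol that is both undefined and weak.
bool isWeakReference(uint32_t flags) {
  return (flags & SF_Undefined) && (flags & SF_Weak);
}

// Restates the patched condition: look up a PLT entry by name only when the
// symbol has no address, is not a weak reference, and the target is one where
// the name-based PLT lookup applies (AArch64 or RISC-V in the patch).
bool shouldSearchPLT(uint64_t symbolAddress, uint32_t flags,
                     bool targetUsesPLTLookup) {
  return symbolAddress == 0 && !isWeakReference(flags) && targetUsesPLTLookup;
}
```

For `undef_weak_fun` in the commit message, the address is zero and both flags are set, so the predicate is false and the relocation keeps its zero value. This is what keeps the `if (&undef_weak_fun)` check meaningful at run time.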
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) (PR #102295)
llvmbot wrote: @yota9 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102295 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) (PR #102295)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102295 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) (PR #102295)
llvmbot wrote: @llvm/pr-subscribers-bolt Author: None (llvmbot) Changes Backport 6c8933e Requested by: @linsinan1995 --- Full diff: https://github.com/llvm/llvm-project/pull/102295.diff 2 Files Affected: - (modified) bolt/lib/Rewrite/RewriteInstance.cpp (+10-1) - (added) bolt/test/AArch64/update-weak-reference-symbol.s (+34) https://github.com/llvm/llvm-project/pull/102295 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [BOLT] Skip PLT search for zero-value weak reference symbols (#69136) (PR #102295)
https://github.com/yota9 approved this pull request. Bugfix LGTM https://github.com/llvm/llvm-project/pull/102295 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
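For readers skimming the backport, the condition the patch changes can be sketched in isolation. This is only a minimal model, not BOLT code: the `SymbolFlags` enum and its bit values are hypothetical stand-ins for the `SymbolRef::SF_Undefined` and `SymbolRef::SF_Weak` flags seen in the diff, and `shouldSearchPLT` is an illustrative helper name that does not exist in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the SymbolRef flag bits used in the diff;
// the numeric values are illustrative only.
enum SymbolFlags : uint32_t {
  SF_None = 0,
  SF_Undefined = 1U << 0,
  SF_Weak = 1U << 2,
};

// A weak reference is a symbol that is both undefined and weak, which is
// what the IsWeakReference lambda in the patch tests for.
inline bool isWeakReference(uint32_t Flags) {
  return (Flags & SF_Undefined) && (Flags & SF_Weak);
}

// Models the patched condition (helper name is an assumption): the PLT
// lookup by symbol name is attempted only when the symbol has no address,
// is not a weak reference, and the target is AArch64 or RISC-V.
inline bool shouldSearchPLT(uint64_t SymbolAddress, uint32_t Flags,
                            bool IsAArch64OrRISCV) {
  return SymbolAddress == 0 && !isWeakReference(Flags) && IsAArch64OrRISCV;
}
```

Under this model, an undefined weak symbol with a zero value keeps its zero address instead of being resolved through a PLT entry, which appears to be the behaviour the new test checks for `func_1`.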
[llvm-branch-commits] [llvm] TTI: Check legalization cost of fptosi_sat/fptoui_sat nodes (PR #100521)
https://github.com/RKSimon approved this pull request. LGTM - the x86 lowering needs reworking but that shouldn't hold this PR up. https://github.com/llvm/llvm-project/pull/100521 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [CalcSpillWeights] Avoid x87 excess precision influencing weight result (PR #102207)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/102207 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] IR/AMDGPU: Autoupgrade amdgpu-unsafe-fp-atomics attribute (PR #101698)
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/101698 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102316 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102316

Backport beb37e2

Requested by: @pratlucas

From ba42328a5f44d5158e77a13f5397d248cdb483f7 Mon Sep 17 00:00:00 2001
From: Lucas Duarte Prates
Date: Wed, 7 Aug 2024 15:15:25 +0100
Subject: [PATCH] [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This change updates the AArch64DeadRegisterDefinitions pass to ensure it does not replace the destination register of a SWP instruction with the zero register when its value is unused. This is necessary to ensure that the ordering of such instructions in relation to DMB.LD barriers adheres to the definitions of the AArch64 Memory Model.

The memory model states the following (ARMARM version DDI 0487K.a §B2.3.7):

```
Barrier-ordered-before

An effect E1 is Barrier-ordered-before an effect E2 if one of the following applies:
[...]
* All of the following apply:
  - E1 is a Memory Read effect.
  - E1 is generated by an instruction whose destination register is not WZR or XZR.
  - E1 appears in program order before E3.
  - E3 is either a DMB LD effect or a DSB LD effect.
  - E3 appears in program order before E2.
```

Prior to this change, by replacing the destination register of such a SWP instruction with WZR/XZR, the ordering relation described above was incorrectly removed from the generated code.

The new behaviour is ensured in this patch by adding the relevant `SWP[L](B|H|W|X)` instructions to the list in the `atomicReadDroppedOnZero` predicate, which already covered the `LD` instructions that are subject to the same effect.

Fixes #68428.
(cherry picked from commit beb37e2e22b549b361be7269a52a3715649e956a) --- .../AArch64DeadRegisterDefinitionsPass.cpp| 4 ++ .../Atomics/aarch64-atomic-exchange-fence.ll | 64 +++ 2 files changed, 68 insertions(+) create mode 100644 llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll diff --git a/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp b/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp index 2bc14f9821e639..161cf24dd4037f 100644 --- a/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp +++ b/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp @@ -108,6 +108,10 @@ static bool atomicReadDroppedOnZero(unsigned Opcode) { case AArch64::LDUMINW:case AArch64::LDUMINX: case AArch64::LDUMINLB: case AArch64::LDUMINLH: case AArch64::LDUMINLW: case AArch64::LDUMINLX: +case AArch64::SWPB: case AArch64::SWPH: +case AArch64::SWPW: case AArch64::SWPX: +case AArch64::SWPLB: case AArch64::SWPLH: +case AArch64::SWPLW: case AArch64::SWPLX: return true; } return false; diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll new file mode 100644 index 00..2adbc709d238da --- /dev/null +++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll @@ -0,0 +1,64 @@ +; RUN: llc %s -o - -verify-machineinstrs -mtriple=aarch64 -mattr=+lse -O0 | FileCheck %s +; RUN: llc %s -o - -verify-machineinstrs -mtriple=aarch64 -mattr=+lse -O1 | FileCheck %s + +; When their destination register is WZR/ZZR, SWP operations are not regarded as +; a read for the purpose of a DMB.LD in the AArch64 memory model. +; This test ensures that the AArch64DeadRegisterDefinitions pass does not +; replace the desitnation register of SWP instructions with the zero register +; when the read value is unused. 
+ +define dso_local i32 @atomic_exchange_monotonic(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_monotonic: +; CHECK: // %bb.0: +; CHECK-NEXT:swp +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value monotonic +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_acquire(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_acquire: +; CHECK: // %bb.0: +; CHECK-NEXT:swpa +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value acquire +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_release(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_release: +; CHECK: // %bb.0: +; CHECK-NEXT:swpl +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value release +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_acqu
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
llvmbot wrote: @statham-arm What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102316 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: None (llvmbot) Changes Backport beb37e2 Requested by: @pratlucas --- Full diff: https://github.com/llvm/llvm-project/pull/102316.diff 2 Files Affected: - (modified) llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp (+4) - (added) llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll (+64) ``diff diff --git a/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp b/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp index 2bc14f9821e63..161cf24dd4037 100644 --- a/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp +++ b/llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp @@ -108,6 +108,10 @@ static bool atomicReadDroppedOnZero(unsigned Opcode) { case AArch64::LDUMINW:case AArch64::LDUMINX: case AArch64::LDUMINLB: case AArch64::LDUMINLH: case AArch64::LDUMINLW: case AArch64::LDUMINLX: +case AArch64::SWPB: case AArch64::SWPH: +case AArch64::SWPW: case AArch64::SWPX: +case AArch64::SWPLB: case AArch64::SWPLH: +case AArch64::SWPLW: case AArch64::SWPLX: return true; } return false; diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll new file mode 100644 index 0..2adbc709d238d --- /dev/null +++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64-atomic-exchange-fence.ll @@ -0,0 +1,64 @@ +; RUN: llc %s -o - -verify-machineinstrs -mtriple=aarch64 -mattr=+lse -O0 | FileCheck %s +; RUN: llc %s -o - -verify-machineinstrs -mtriple=aarch64 -mattr=+lse -O1 | FileCheck %s + +; When their destination register is WZR/ZZR, SWP operations are not regarded as +; a read for the purpose of a DMB.LD in the AArch64 memory model. +; This test ensures that the AArch64DeadRegisterDefinitions pass does not +; replace the desitnation register of SWP instructions with the zero register +; when the read value is unused. 
+ +define dso_local i32 @atomic_exchange_monotonic(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_monotonic: +; CHECK: // %bb.0: +; CHECK-NEXT:swp +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value monotonic +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_acquire(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_acquire: +; CHECK: // %bb.0: +; CHECK-NEXT:swpa +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value acquire +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_release(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_release: +; CHECK: // %bb.0: +; CHECK-NEXT:swpl +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value release +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} + +define dso_local i32 @atomic_exchange_acquire_release(ptr %ptr, ptr %ptr2, i32 %value) { +; CHECK-LABEL: atomic_exchange_acquire_release: +; CHECK: // %bb.0: +; CHECK-NEXT:swpal +; CHECK-NOT: wzr +; CHECK-NEXT:dmb ishld +; CHECK-NEXT:ldr w0, [x1] +; CHECK-NEXT:ret +%r0 = atomicrmw xchg ptr %ptr, i32 %value acq_rel +fence acquire +%r1 = load atomic i32, ptr %ptr2 monotonic, align 4 +ret i32 %r1 +} `` https://github.com/llvm/llvm-project/pull/102316 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
https://github.com/statham-arm approved this pull request. Seems sensible to me. It's fixing a genuine codegen fault (a subtle one, but of course that makes it worse – harder to spot when it occurs!). And it's a small safe change that disables one very small case of a conceptually simple optimisation, unlikely to introduce other bugs. https://github.com/llvm/llvm-project/pull/102316 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
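The "one very small case" being disabled can be illustrated with a reduced model. This is a sketch under stated assumptions, not the pass itself: the `Opcode` enum is a hypothetical four-entry subset of the real `AArch64::` opcodes, and `mayReplaceDeadDefWithZeroReg` is an illustrative helper that only approximates how the pass consults `atomicReadDroppedOnZero`.

```cpp
#include <cassert>

// Hypothetical, heavily reduced opcode set; the real opcodes are the
// AArch64:: enumerators listed in the diff.
enum class Opcode { ADDWrr, LDADDW, SWPW, SWPLX };

// Models the predicate from the patch: instructions whose memory read stops
// counting as a read for DMB LD ordering once the destination is WZR/XZR.
// The SWP entries are what the backported change adds.
inline bool atomicReadDroppedOnZero(Opcode Op) {
  switch (Op) {
  case Opcode::LDADDW:
  case Opcode::SWPW:
  case Opcode::SWPLX:
    return true;
  default:
    return false;
  }
}

// Illustrative helper (an assumption, not a function in the pass): a dead
// destination may be rewritten to the zero register only if doing so does
// not drop the read from the barrier-ordered-before relation.
inline bool mayReplaceDeadDefWithZeroReg(Opcode Op, bool DefIsDead) {
  return DefIsDead && !atomicReadDroppedOnZero(Op);
}
```

With the SWP opcodes in the list, a `swp` whose result is unused keeps a real destination register, so a following `dmb ishld` still orders it before later loads.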
[llvm-branch-commits] [llvm] release/19.x: [AArch64] Don't replace dst of SWP instructions with (X|W)ZR (#102139) (PR #102316)
lukeg101 wrote: I second Simon, looks good to me https://github.com/llvm/llvm-project/pull/102316 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,62 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); yuxuanchen1997 wrote: Didn't get that? https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -1455,6 +1462,64 @@ struct SwitchCoroutineSplitter { setCoroInfo(F, Shape, Clones); } + static Function *createNoAllocVariant(Function &F, coro::Shape &Shape, +SmallVectorImpl &Clones) { +auto *OrigFnTy = F.getFunctionType(); +auto OldParams = OrigFnTy->params(); + +SmallVector NewParams; +NewParams.reserve(OldParams.size() + 1); +for (Type *T : OldParams) { + NewParams.push_back(T); +} +NewParams.push_back(PointerType::getUnqual(Shape.FrameTy)); + +auto *NewFnTy = FunctionType::get(OrigFnTy->getReturnType(), NewParams, + OrigFnTy->isVarArg()); +Function *NoAllocF = +Function::Create(NewFnTy, F.getLinkage(), F.getName() + ".noalloc"); yuxuanchen1997 wrote: This pass is now configured to not run at all during LTO pre-link. So internal linkage SGTM. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [LLVM][Coroutines] Create `.noalloc` variant of switch ABI coroutine ramp functions during CoroSplit (PR #99283)
@@ -5,7 +5,7 @@ define nonnull ptr @f(i32 %n) presplitcoroutine { ; CHECK-LABEL: @f( ; CHECK-NEXT: entry: -; CHECK-NEXT:[[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @f.resumers) +; CHECK-NEXT:[[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @{{.*}}) yuxuanchen1997 wrote: Ah, the `noalloc` variant was appended to `f.resumers`, causing the name to be changed to `f.resumers.1`. https://github.com/llvm/llvm-project/pull/99283 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102335 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102335 Backport 961639962251de7428c3fe93fa17cfa6ab3c561a Requested by: @ian-twilightcoder >From 8f5890ae4e8e40f7e6d4732f3d3ed2b0843c2545 Mon Sep 17 00:00:00 2001 From: Ian Anderson Date: Wed, 7 Aug 2024 10:14:58 -0700 Subject: [PATCH] [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) The upcoming Apple SDK releases will support the clang built-in headers being in the clang built-in modules: stop passing -fbuiltin-headers-in-system-modules for those SDK versions. (cherry picked from commit 961639962251de7428c3fe93fa17cfa6ab3c561a) --- clang/lib/Driver/ToolChains/Darwin.cpp| 39 +++ .../Inputs/DriverKit23.0.sdk/SDKSettings.json | 1 + .../SDKSettings.json | 0 clang/test/Driver/darwin-builtin-modules.c| 5 ++- 4 files changed, 36 insertions(+), 9 deletions(-) create mode 100644 clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json rename clang/test/Driver/Inputs/{MacOSX99.0.sdk => MacOSX15.0.sdk}/SDKSettings.json (100%) diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp index c6f9d7beffb1d..b8b2feb5a149e 100644 --- a/clang/lib/Driver/ToolChains/Darwin.cpp +++ b/clang/lib/Driver/ToolChains/Darwin.cpp @@ -2923,22 +2923,47 @@ bool Darwin::isAlignedAllocationUnavailable() const { return TargetVersion < alignedAllocMinVersion(OS); } -static bool sdkSupportsBuiltinModules(const Darwin::DarwinPlatformKind &TargetPlatform, const std::optional &SDKInfo) { +static bool sdkSupportsBuiltinModules( +const Darwin::DarwinPlatformKind &TargetPlatform, +const Darwin::DarwinEnvironmentKind &TargetEnvironment, +const std::optional &SDKInfo) { + switch (TargetEnvironment) { + case Darwin::NativeEnvironment: + case Darwin::Simulator: + case Darwin::MacCatalyst: +// Standard xnu/Mach/Darwin based environments +// depend on the SDK version. +break; + default: +// All other environments support builtin modules from the start. 
+return true; + } + if (!SDKInfo) +// If there is no SDK info, assume this is building against a +// pre-SDK version of macOS (i.e. before Mac OS X 10.4). Those +// don't support modules anyway, but the headers definitely +// don't support builtin modules either. It might also be some +// kind of degenerate build environment, err on the side of +// the old behavior which is to not use builtin modules. return false; VersionTuple SDKVersion = SDKInfo->getVersion(); switch (TargetPlatform) { + // Existing SDKs added support for builtin modules in the fall + // 2024 major releases. case Darwin::MacOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(15U); case Darwin::IPhoneOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(18U); case Darwin::TvOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(18U); case Darwin::WatchOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(11U); case Darwin::XROS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(2U); + + // New SDKs support builtin modules from the start. default: return true; } @@ -3030,7 +3055,7 @@ void Darwin::addClangTargetOptions( // i.e. when the builtin stdint.h is in the Darwin module too, the cycle // goes away. Note that -fbuiltin-headers-in-system-modules does nothing // to fix the same problem with C++ headers, and is generally fragile. 
- if (!sdkSupportsBuiltinModules(TargetPlatform, SDKInfo)) + if (!sdkSupportsBuiltinModules(TargetPlatform, TargetEnvironment, SDKInfo)) CC1Args.push_back("-fbuiltin-headers-in-system-modules"); if (!DriverArgs.hasArgNoClaim(options::OPT_fdefine_target_os_macros, diff --git a/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json b/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json new file mode 100644 index 0..7ba6c244df211 --- /dev/null +++ b/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json @@ -0,0 +1 @@ +{"Version":"23.0", "MaximumDeploymentTarget": "23.0.99"} diff --git a/clang/test/Driver/Inputs/MacOSX99.0.sdk/SDKSettings.json b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json similarity index 100% rename from clang/test/Driver/Inputs/MacOSX99.0.sdk/SDKSettings.json rename to clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json diff --git a/clang/test/Driver/darwin-builtin-modules.c b/clang/test/Driver/darwin-builtin-modules.c index 1c56e13bfb929..ec515133be8ab 100644 --- a/clang/test/Driver/darwin-builtin-modules.c +++ b/clang/test/Driver/darwin-builtin-modules.c @@ -6,6 +6,7 @@ // RUN: %clang -isysroot %S/Inputs/iPhoneOS13.0.sdk -target arm64-apple-ios13.0 -### %s 2>&1 | FileCheck %s // CHECK: -fbuiltin-headers-in-system-modules
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
llvmbot wrote: @cyndyishida What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102335 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (llvmbot) Changes Backport 961639962251de7428c3fe93fa17cfa6ab3c561a Requested by: @ian-twilightcoder --- Full diff: https://github.com/llvm/llvm-project/pull/102335.diff 4 Files Affected: - (modified) clang/lib/Driver/ToolChains/Darwin.cpp (+32-7) - (added) clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json (+1) - (renamed) clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json () - (modified) clang/test/Driver/darwin-builtin-modules.c (+3-2) ``diff diff --git a/clang/lib/Driver/ToolChains/Darwin.cpp b/clang/lib/Driver/ToolChains/Darwin.cpp index c6f9d7beffb1d..b8b2feb5a149e 100644 --- a/clang/lib/Driver/ToolChains/Darwin.cpp +++ b/clang/lib/Driver/ToolChains/Darwin.cpp @@ -2923,22 +2923,47 @@ bool Darwin::isAlignedAllocationUnavailable() const { return TargetVersion < alignedAllocMinVersion(OS); } -static bool sdkSupportsBuiltinModules(const Darwin::DarwinPlatformKind &TargetPlatform, const std::optional &SDKInfo) { +static bool sdkSupportsBuiltinModules( +const Darwin::DarwinPlatformKind &TargetPlatform, +const Darwin::DarwinEnvironmentKind &TargetEnvironment, +const std::optional &SDKInfo) { + switch (TargetEnvironment) { + case Darwin::NativeEnvironment: + case Darwin::Simulator: + case Darwin::MacCatalyst: +// Standard xnu/Mach/Darwin based environments +// depend on the SDK version. +break; + default: +// All other environments support builtin modules from the start. +return true; + } + if (!SDKInfo) +// If there is no SDK info, assume this is building against a +// pre-SDK version of macOS (i.e. before Mac OS X 10.4). Those +// don't support modules anyway, but the headers definitely +// don't support builtin modules either. It might also be some +// kind of degenerate build environment, err on the side of +// the old behavior which is to not use builtin modules. 
return false; VersionTuple SDKVersion = SDKInfo->getVersion(); switch (TargetPlatform) { + // Existing SDKs added support for builtin modules in the fall + // 2024 major releases. case Darwin::MacOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(15U); case Darwin::IPhoneOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(18U); case Darwin::TvOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(18U); case Darwin::WatchOS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(11U); case Darwin::XROS: -return SDKVersion >= VersionTuple(99U); +return SDKVersion >= VersionTuple(2U); + + // New SDKs support builtin modules from the start. default: return true; } @@ -3030,7 +3055,7 @@ void Darwin::addClangTargetOptions( // i.e. when the builtin stdint.h is in the Darwin module too, the cycle // goes away. Note that -fbuiltin-headers-in-system-modules does nothing // to fix the same problem with C++ headers, and is generally fragile. 
- if (!sdkSupportsBuiltinModules(TargetPlatform, SDKInfo)) + if (!sdkSupportsBuiltinModules(TargetPlatform, TargetEnvironment, SDKInfo)) CC1Args.push_back("-fbuiltin-headers-in-system-modules"); if (!DriverArgs.hasArgNoClaim(options::OPT_fdefine_target_os_macros, diff --git a/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json b/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json new file mode 100644 index 0..7ba6c244df211 --- /dev/null +++ b/clang/test/Driver/Inputs/DriverKit23.0.sdk/SDKSettings.json @@ -0,0 +1 @@ +{"Version":"23.0", "MaximumDeploymentTarget": "23.0.99"} diff --git a/clang/test/Driver/Inputs/MacOSX99.0.sdk/SDKSettings.json b/clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json similarity index 100% rename from clang/test/Driver/Inputs/MacOSX99.0.sdk/SDKSettings.json rename to clang/test/Driver/Inputs/MacOSX15.0.sdk/SDKSettings.json diff --git a/clang/test/Driver/darwin-builtin-modules.c b/clang/test/Driver/darwin-builtin-modules.c index 1c56e13bfb929..ec515133be8ab 100644 --- a/clang/test/Driver/darwin-builtin-modules.c +++ b/clang/test/Driver/darwin-builtin-modules.c @@ -6,6 +6,7 @@ // RUN: %clang -isysroot %S/Inputs/iPhoneOS13.0.sdk -target arm64-apple-ios13.0 -### %s 2>&1 | FileCheck %s // CHECK: -fbuiltin-headers-in-system-modules -// RUN: %clang -isysroot %S/Inputs/MacOSX99.0.sdk -target x86_64-apple-macos98.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s -// RUN: %clang -isysroot %S/Inputs/MacOSX99.0.sdk -target x86_64-apple-macos99.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s +// RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos14.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s +// RUN: %clang -isysroot %S/Inputs/MacOSX15.0.sdk -target x86_64-apple-macos15.0 -### %s 2>&1 | FileCheck --check-prefix=CHECK_FUTURE %s +// RUN: %cl
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
https://github.com/cyndyishida approved this pull request. LGTM. We definitely need this for any clients with newer Apple SDKs using the llvm-19 toolchain. https://github.com/llvm/llvm-project/pull/102335 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
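The version gating in `sdkSupportsBuiltinModules` can be modelled without any clang types. The sketch below is an assumption-laden simplification: `Platform` and `Environment` are hypothetical enums standing in for `Darwin::DarwinPlatformKind` and `Darwin::DarwinEnvironmentKind`, and the SDK version is reduced to a major number passed by pointer (null meaning "no SDK info") rather than the optional SDK info object the real function takes. The thresholds are the ones shown in the diff.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Darwin platform and environment kinds.
enum class Platform { MacOS, IPhoneOS, TvOS, WatchOS, XROS, DriverKit };
enum class Environment { Native, Simulator, MacCatalyst, Other };

// Simplified model of the patched function. SDKMajor == nullptr represents
// missing SDK info, in which case the old behaviour (no builtin modules)
// is kept.
inline bool sdkSupportsBuiltinModules(Platform P, Environment E,
                                      const unsigned *SDKMajor) {
  switch (E) {
  case Environment::Native:
  case Environment::Simulator:
  case Environment::MacCatalyst:
    break; // These environments depend on the SDK version.
  default:
    return true; // Other environments support builtin modules from the start.
  }

  if (!SDKMajor)
    return false;

  switch (P) {
  case Platform::MacOS:
    return *SDKMajor >= 15;
  case Platform::IPhoneOS:
  case Platform::TvOS:
    return *SDKMajor >= 18;
  case Platform::WatchOS:
    return *SDKMajor >= 11;
  case Platform::XROS:
    return *SDKMajor >= 2;
  default:
    return true; // New platforms support builtin modules from the start.
  }
}
```

In the driver, a `false` result is what causes `-fbuiltin-headers-in-system-modules` to be passed to cc1, so in this model a macOS 14 SDK would still get the flag while a macOS 15 SDK would not.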
[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Set omp.composite attr for composite loop wrappers and add verifier checks (PR #102341)
https://github.com/TIFitis created https://github.com/llvm/llvm-project/pull/102341 This patch sets the omp.composite unit attr for composite wrapper ops and also add appropriate checks to the verifiers of supported ops for the presence/absence of the attribute. This is patch 2/2 in a series of patches. >From 9e43d5e608935add740e078e74e1884270775862 Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Mon, 5 Aug 2024 15:31:25 +0100 Subject: [PATCH 1/3] [MLIR][OpenMP] Add a new ComposableLoopWrapperInterface to check and set a composite unitAttr. --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 9 +++-- .../Dialect/OpenMP/OpenMPOpsInterfaces.td | 34 +++ 2 files changed, 41 insertions(+), 2 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 68f92e6952694b..44210025dfe18b 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td @@ -129,6 +129,7 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]> def ParallelOp : OpenMP_Op<"parallel", traits = [ AttrSizedOperandSegments, AutomaticAllocationScope, DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, DeclareOpInterfaceMethods, RecursiveMemoryEffects ], clauses = [ @@ -357,6 +358,7 @@ def LoopNestOp : OpenMP_Op<"loop_nest", traits = [ def WsloopOp : OpenMP_Op<"wsloop", traits = [ AttrSizedOperandSegments, DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, RecursiveMemoryEffects, SingleBlock ], clauses = [ OpenMP_AllocateClauseSkip, @@ -433,6 +435,7 @@ def WsloopOp : OpenMP_Op<"wsloop", traits = [ def SimdOp : OpenMP_Op<"simd", traits = [ AttrSizedOperandSegments, DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, RecursiveMemoryEffects, SingleBlock ], clauses = [ OpenMP_AlignedClause, OpenMP_IfClause, OpenMP_LinearClause, @@ -500,6 +503,7 @@ def YieldOp : OpenMP_Op<"yield", //===--===// def DistributeOp : OpenMP_Op<"distribute", traits = [ 
AttrSizedOperandSegments, DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, RecursiveMemoryEffects, SingleBlock ], clauses = [ OpenMP_AllocateClause, OpenMP_DistScheduleClause, OpenMP_OrderClause, @@ -587,8 +591,9 @@ def TaskOp : OpenMP_Op<"task", traits = [ def TaskloopOp : OpenMP_Op<"taskloop", traits = [ AttrSizedOperandSegments, AutomaticAllocationScope, -DeclareOpInterfaceMethods, RecursiveMemoryEffects, -SingleBlock +DeclareOpInterfaceMethods, +DeclareOpInterfaceMethods, +RecursiveMemoryEffects, SingleBlock ], clauses = [ OpenMP_AllocateClause, OpenMP_FinalClause, OpenMP_GrainsizeClause, OpenMP_IfClause, OpenMP_InReductionClauseSkip, diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td index 2d1de37239c82a..50394eaf236f94 100644 --- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td +++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td @@ -142,6 +142,40 @@ def LoopWrapperInterface : OpInterface<"LoopWrapperInterface"> { ]; } +def ComposableLoopWrapperInterface : OpInterface<"ComposableLoopWrapperInterface"> { + let description = [{ +OpenMP operations that can wrap a single loop nest. When taking a wrapper +role, these operations must only contain a single region with a single block +in which there's a single operation and a terminator. That nested operation +must be another loop wrapper or an `omp.loop_nest`. + }]; + + let cppNamespace = "::mlir::omp"; + + let methods = [ +InterfaceMethod< + /*description=*/[{ +Check whether the operation is composable. + }], + /*retTy=*/"bool", + /*methodName=*/"isComposite", + (ins ), [{}], [{ +return $_op->hasAttr("omp.composite"); + }] +>, +InterfaceMethod< + /*description=*/[{ +Set the CompositeAttr for the op. 
+ }], + /*retTy=*/"void", + /*methodName=*/"setComposite", + (ins), [{}], [{ +$_op->setDiscardableAttr("omp.composite", mlir::UnitAttr::get($_op->getContext())); + }] +> + ]; +} + def DeclareTargetInterface : OpInterface<"DeclareTargetInterface"> { let description = [{ OpenMP operations that support declare target have this interface. >From 1ce8a0500e95b0bcd5d2e64b1cbcfe3f216a048f Mon Sep 17 00:00:00 2001 From: Akash Banerjee Date: Tue, 6 Aug 2024 13:25:06 +0100 Subject: [PATCH 2/3] Address reviewer comments. --- mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td | 19 ++ .../Dialect/OpenMP/OpenMPOpsInterfaces.td | 20 ++- 2 files changed, 22 insertions(+), 17 deletions(-) diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td index 4421
[llvm-branch-commits] [flang] [mlir] [OpenMP][MLIR] Set omp.composite attr for composite loop wrappers and add verifier checks (PR #102341)
llvmbot wrote:

@llvm/pr-subscribers-mlir

Author: Akash Banerjee (TIFitis)

Changes

This patch sets the omp.composite unit attr for composite wrapper ops and also adds appropriate checks to the verifiers of supported ops for the presence/absence of the attribute.

This is patch 2/2 in a series of patches.

---
Full diff: https://github.com/llvm/llvm-project/pull/102341.diff

4 Files Affected:

- (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+8)
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+13-5)
- (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOpsInterfaces.td (+36)
- (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+32)

```diff
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index bbde77c14f36a1..3b18e7b3ecf80e 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -2063,10 +2063,14 @@ static void genCompositeDistributeSimd(
   // TODO: Populate entry block arguments with private variables.
   auto distributeOp = genWrapperOp(
       converter, loc, distributeClauseOps, /*blockArgTypes=*/{});
+  llvm::cast(distributeOp.getOperation())
+      .setComposite(/*val=*/true);
 
   // TODO: Populate entry block arguments with reduction and private variables.
   auto simdOp = genWrapperOp(converter, loc, simdClauseOps,
                              /*blockArgTypes=*/{});
+  llvm::cast(simdOp.getOperation())
+      .setComposite(/*val=*/true);
 
   // Construct wrapper entry block list and associated symbols. It is important
   // that the symbol order and the block argument order match, so that the
@@ -2111,10 +2115,14 @@ static void genCompositeDoSimd(lower::AbstractConverter &converter,
   // TODO: Add private variables to entry block arguments.
   auto wsloopOp = genWrapperOp(
       converter, loc, wsloopClauseOps, wsloopReductionTypes);
+  llvm::cast(wsloopOp.getOperation())
+      .setComposite(/*val=*/true);
 
   // TODO: Populate entry block arguments with reduction and private variables.
   auto simdOp = genWrapperOp(converter, loc, simdClauseOps,
                              /*blockArgTypes=*/{});
+  llvm::cast(simdOp.getOperation())
+      .setComposite(/*val=*/true);
 
   // Construct wrapper entry block list and associated symbols. It is important
   // that the symbol and block argument order match, so that the symbol-value
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 68f92e6952694b..d63fdd88f79104 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -128,6 +128,7 @@ def PrivateClauseOp : OpenMP_Op<"private", [IsolatedFromAbove, RecipeInterface]>
 
 def ParallelOp : OpenMP_Op<"parallel", traits = [
     AttrSizedOperandSegments, AutomaticAllocationScope,
+    DeclareOpInterfaceMethods,
     DeclareOpInterfaceMethods,
     DeclareOpInterfaceMethods,
     RecursiveMemoryEffects
@@ -356,7 +357,9 @@ def LoopNestOp : OpenMP_Op<"loop_nest", traits = [
 //===--===//
 
 def WsloopOp : OpenMP_Op<"wsloop", traits = [
-    AttrSizedOperandSegments, DeclareOpInterfaceMethods,
+    AttrSizedOperandSegments,
+    DeclareOpInterfaceMethods,
+    DeclareOpInterfaceMethods,
     RecursiveMemoryEffects, SingleBlock
   ], clauses = [
     OpenMP_AllocateClauseSkip,
@@ -432,7 +435,9 @@ def WsloopOp : OpenMP_Op<"wsloop", traits = [
 //===--===//
 
 def SimdOp : OpenMP_Op<"simd", traits = [
-    AttrSizedOperandSegments, DeclareOpInterfaceMethods,
+    AttrSizedOperandSegments,
+    DeclareOpInterfaceMethods,
+    DeclareOpInterfaceMethods,
     RecursiveMemoryEffects, SingleBlock
   ], clauses = [
     OpenMP_AlignedClause, OpenMP_IfClause, OpenMP_LinearClause,
@@ -499,7 +504,9 @@ def YieldOp : OpenMP_Op<"yield",
 // Distribute construct [2.9.4.1]
 //===--===//
 def DistributeOp : OpenMP_Op<"distribute", traits = [
-    AttrSizedOperandSegments, DeclareOpInterfaceMethods,
+    AttrSizedOperandSegments,
+    DeclareOpInterfaceMethods,
+    DeclareOpInterfaceMethods,
     RecursiveMemoryEffects, SingleBlock
   ], clauses = [
     OpenMP_AllocateClause, OpenMP_DistScheduleClause, OpenMP_OrderClause,
@@ -587,8 +594,9 @@ def TaskOp : OpenMP_Op<"task", traits = [
 
 def TaskloopOp : OpenMP_Op<"taskloop", traits = [
     AttrSizedOperandSegments, AutomaticAllocationScope,
-    DeclareOpInterfaceMethods, RecursiveMemoryEffects,
-    SingleBlock
+    DeclareOpInterfaceMethods,
+    DeclareOpInterfaceMethods,
+    RecursiveMemoryEffects, SingleBlock
   ], clauses = [
     OpenMP_AllocateClause, OpenMP_FinalClause, OpenMP_GrainsizeClause,
     OpenMP_IfClause, OpenMP_InReductionClauseSkip,
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPO
```
[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/102345

Some pointer adds get turned into ORs, and sometimes an AND is performed on pointers for masking.

>From ac17eedeea4d38a7bd490ffed9b38b241e4098dc Mon Sep 17 00:00:00 2001
From: Matt Arsenault
Date: Fri, 2 Aug 2024 13:17:01 +0400
Subject: [PATCH] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32

Some pointer adds get turned into ORs, and sometimes an AND is performed on pointers for masking.
---
 llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | 5 +-
 .../eliminate-frame-index-scalar-bit-ops.mir | 88 +--
 2 files changed, 45 insertions(+), 48 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index e8c2cbd3dd671..76da1f0eb4f7d 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2268,8 +2268,9 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
       MI->eraseFromParent();
       return true;
     }
-    case AMDGPU::S_ADD_I32: {
-      // TODO: Handle s_or_b32, s_and_b32.
+    case AMDGPU::S_ADD_I32:
+    case AMDGPU::S_OR_B32:
+    case AMDGPU::S_AND_B32: {
       MachineOperand &OtherOp = MI->getOperand(FIOperandNum == 1 ? 2 : 1);
 
       assert(FrameReg || MFI->isBottomOfStack());
diff --git a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
index 1456bbc369b6a..f7f43b59085de 100644
--- a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
+++ b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
@@ -21,21 +21,21 @@ machineFunctionInfo:
 body: |
   bb.0:
     ; MUBUFW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
-    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+    ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
     ; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; MUBUFW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
-    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+    ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
     ; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
     ; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+    ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
     ; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     renamable $sgpr7 = S_OR_B32 12, %stack.0, implicit-def $scc
     SI_RETURN implicit $sgpr7, implicit $scc
@@ -55,25 +55,21 @@ machineFunctionInfo:
 body: |
   bb.0:
     ; MUBUFW64-LABEL: name: s_or_b32__literal__fi_offset96
-    ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
-    ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
-    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
     ; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; MUBUFW32-LABEL: name: s_or_b32__literal__fi_offset96
-    ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
-    ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
-    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
     ; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW64-LABEL: name: s_or_b32__literal__fi_offset96
-    ; FLATSCRW64: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
-    ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 164, implicit-def $scc
     ; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW32-LABEL: name: s_or_b32__literal__fi_offset96
-    ; FLATSCRW32: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
-    ; FLATSCRW32-NE
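The remark that some pointer adds get turned into ORs rests on a standard bit identity: when two values share no set bits, no carries occur, so adding them and OR-ing them give the same result. The following is a minimal C++ sketch of that identity only; the helper names `haveDisjointBits` and `addAsOr` are hypothetical and do not come from the patch, and plain `uint32_t` values stand in for registers.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): true when a and b share no set bits.
constexpr bool haveDisjointBits(uint32_t a, uint32_t b) {
  return (a & b) == 0;
}

// With disjoint bits there are no carries, so a | b equals a + b.
constexpr uint32_t addAsOr(uint32_t a, uint32_t b) { return a | b; }

// An aligned base plus a small offset: the low bits of the base are zero,
// so the add can be expressed as an OR.
static_assert(haveDisjointBits(0x1000, 12), "base and offset are disjoint");
static_assert(addAsOr(0x1000, 12) == 0x1000 + 12, "OR matches add");

// 96 (0b1100000) and 68 (0b1000100) both have bit 6 set, so the identity
// does not apply to this pair: OR gives 100 while add gives 164.
static_assert(!haveDisjointBits(96, 68), "bit 6 is shared");
static_assert((96u | 68u) == 100u, "OR of overlapping constants");
static_assert(96u + 68u == 164u, "sum of the same constants");
```

The checks are `static_assert`s, so the identity is verified at compile time. The pair 96 and 68 was chosen because both constants appear in the `s_or_b32__literal__fi_offset96` test; the sketch says nothing about the value held in the frame register.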
[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/102345?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#102346**
* **#102345** 👈
* **#101694**
* **#101691**: 1 other dependent PR ([#101692](https://github.com/llvm/llvm-project/pull/101692))
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment

https://github.com/llvm/llvm-project/pull/102345
[llvm-branch-commits] [llvm] [WIP] AMDGPU: Handle v_add* in eliminateFrameIndex (PR #102346)
arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/102346?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#102346** 👈
* **#102345**
* **#101694**
* **#101691**: 1 other dependent PR ([#101692](https://github.com/llvm/llvm-project/pull/101692))
* `main`

This stack of pull requests is managed by Graphite. Learn more about stacking: https://stacking.dev/?utm_source=stack-comment

Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment

https://github.com/llvm/llvm-project/pull/102346
[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/102345
[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)
llvmbot wrote:

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Some pointer adds get turned into ORs, and sometimes an AND is performed on pointers for masking.

---
Full diff: https://github.com/llvm/llvm-project/pull/102345.diff

2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+3-2)
- (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir (+42-46)

```diff
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index e8c2cbd3dd671..76da1f0eb4f7d 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2268,8 +2268,9 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
       MI->eraseFromParent();
       return true;
     }
-    case AMDGPU::S_ADD_I32: {
-      // TODO: Handle s_or_b32, s_and_b32.
+    case AMDGPU::S_ADD_I32:
+    case AMDGPU::S_OR_B32:
+    case AMDGPU::S_AND_B32: {
       MachineOperand &OtherOp = MI->getOperand(FIOperandNum == 1 ? 2 : 1);
 
       assert(FrameReg || MFI->isBottomOfStack());
diff --git a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
index 1456bbc369b6a..f7f43b59085de 100644
--- a/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
+++ b/llvm/test/CodeGen/AMDGPU/eliminate-frame-index-scalar-bit-ops.mir
@@ -21,21 +21,21 @@ machineFunctionInfo:
 body: |
   bb.0:
     ; MUBUFW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
-    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+    ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
     ; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; MUBUFW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
-    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 12, killed $sgpr4, implicit-def $scc
+    ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 12, implicit-def $scc
     ; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW64-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
     ; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW32-LABEL: name: s_or_b32__inline_imm__fi_offset0
-    ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 12, $sgpr32, implicit-def $scc
+    ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 $sgpr32, 12, implicit-def $scc
     ; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     renamable $sgpr7 = S_OR_B32 12, %stack.0, implicit-def $scc
     SI_RETURN implicit $sgpr7, implicit $scc
@@ -55,25 +55,21 @@ machineFunctionInfo:
 body: |
   bb.0:
     ; MUBUFW64-LABEL: name: s_or_b32__literal__fi_offset96
-    ; MUBUFW64: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
-    ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
-    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; MUBUFW64: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
     ; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; MUBUFW32-LABEL: name: s_or_b32__literal__fi_offset96
-    ; MUBUFW32: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
-    ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 96, implicit-def $scc
-    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; MUBUFW32: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_OR_B32 $sgpr4, 164, implicit-def $scc
     ; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW64-LABEL: name: s_or_b32__literal__fi_offset96
-    ; FLATSCRW64: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
-    ; FLATSCRW64-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; FLATSCRW64: renamable $sgpr7 = S_OR_B32 $sgpr32, 164, implicit-def $scc
     ; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW32-LABEL: name: s_or_b32__literal__fi_offset96
-    ; FLATSCRW32: $sgpr4 = S_ADD_I32 $sgpr32, 96, implicit-def $scc
-    ; FLATSCRW32-NEXT: renamable $sgpr7 = S_OR_B32 68, killed $sgpr4, implicit-def $scc
+    ; FLATSCRW32: renamable $sgpr7 = S_OR_B32 $sgpr32, 164, implicit-def $scc
     ; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7, implicit
```
[llvm-branch-commits] [llvm] [WIP] AMDGPU: Handle v_add* in eliminateFrameIndex (PR #102346)
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/102346
[llvm-branch-commits] [llvm] [WIP] AMDGPU: Handle v_add* in eliminateFrameIndex (PR #102346)
llvmbot wrote:

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

---

Patch is 297.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/102346.diff

13 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+218-16)
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll (+34-42)
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll (+1-2)
- (modified) llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-preserve-cc.ll (+1-2)
- (modified) llvm/test/CodeGen/AMDGPU/array-ptr-calc-i32.ll (+6-1)
- (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-co-u32.mir (+333-584)
- (modified) llvm/test/CodeGen/AMDGPU/eliminate-frame-index-v-add-u32.mir (+171-221)
- (modified) llvm/test/CodeGen/AMDGPU/flat-scratch.ll (+4-4)
- (modified) llvm/test/CodeGen/AMDGPU/frame-index.mir (+4-4)
- (modified) llvm/test/CodeGen/AMDGPU/materialize-frame-index-sgpr.gfx10.ll (+14-14)
- (modified) llvm/test/CodeGen/AMDGPU/mubuf-offset-private.ll (+7-3)
- (modified) llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll (+1-1)
- (modified) llvm/test/CodeGen/AMDGPU/stack-realign.ll (+1-1)

```diff
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index 76da1f0eb4f7d..81337c62ffe17 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -2086,7 +2086,7 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
   assert(MF->getRegInfo().isReserved(MFI->getScratchRSrcReg()) &&
          "unreserved scratch RSRC register");
 
-  MachineOperand &FIOp = MI->getOperand(FIOperandNum);
+  MachineOperand *FIOp = &MI->getOperand(FIOperandNum);
   int Index = MI->getOperand(FIOperandNum).getIndex();
 
   Register FrameReg = FrameInfo.isFixedObjectIndex(Index) && hasBasePointer(*MF)
@@ -2268,6 +2268,208 @@ bool SIRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator MI,
       MI->eraseFromParent();
       return true;
     }
+    case AMDGPU::V_ADD_U32_e32:
+    case AMDGPU::V_ADD_U32_e64:
+    case AMDGPU::V_ADD_CO_U32_e32:
+    case AMDGPU::V_ADD_CO_U32_e64: {
+      // TODO: Handle sub, and, or.
+      unsigned NumDefs = MI->getNumExplicitDefs();
+      unsigned Src0Idx = NumDefs;
+
+      bool HasClamp = false;
+      MachineOperand *VCCOp = nullptr;
+
+      switch (MI->getOpcode()) {
+      case AMDGPU::V_ADD_U32_e32:
+        break;
+      case AMDGPU::V_ADD_U32_e64:
+        HasClamp = MI->getOperand(3).getImm();
+        break;
+      case AMDGPU::V_ADD_CO_U32_e32:
+        VCCOp = &MI->getOperand(3);
+        break;
+      case AMDGPU::V_ADD_CO_U32_e64:
+        VCCOp = &MI->getOperand(1);
+        HasClamp = MI->getOperand(4).getImm();
+        break;
+      default:
+        break;
+      }
+      bool DeadVCC = !VCCOp || VCCOp->isDead();
+      MachineOperand &DstOp = MI->getOperand(0);
+      Register DstReg = DstOp.getReg();
+
+      unsigned OtherOpIdx =
+          FIOperandNum == Src0Idx ? FIOperandNum + 1 : Src0Idx;
+      MachineOperand *OtherOp = &MI->getOperand(OtherOpIdx);
+
+      unsigned Src1Idx = Src0Idx + 1;
+      Register MaterializedReg = FrameReg;
+      Register ScavengedVGPR;
+
+      if (FrameReg && !ST.enableFlatScratch()) {
+        // We should just do an in-place update of the result register. However,
+        // the value there may also be used by the add, in which case we need a
+        // temporary register.
+        //
+        // FIXME: The scavenger is not finding the result register in the
+        // common case where the add does not read the register.
+
+        ScavengedVGPR = RS->scavengeRegisterBackwards(
+            AMDGPU::VGPR_32RegClass, MI, /*RestoreAfter=*/false, /*SPAdj=*/0);
+
+        // TODO: If we have a free SGPR, it's sometimes better to use a scalar
+        // shift.
+        BuildMI(*MBB, *MI, DL, TII->get(AMDGPU::V_LSHRREV_B32_e64))
+            .addDef(ScavengedVGPR, RegState::Renamable)
+            .addImm(ST.getWavefrontSizeLog2())
+            .addReg(FrameReg);
+        MaterializedReg = ScavengedVGPR;
+      }
+
+      int64_t Offset = FrameInfo.getObjectOffset(Index);
+      // For the non-immediate case, we could fall through to the default
+      // handling, but we do an in-place update of the result register here to
+      // avoid scavenging another register.
+      if (OtherOp->isImm()) {
+        OtherOp->setImm(OtherOp->getImm() + Offset);
+        Offset = 0;
+      }
+
+      if ((!OtherOp->isImm() || OtherOp->getImm() != 0) && MaterializedReg) {
+        if (ST.enableFlatScratch() &&
+            !TII->isOperandLegal(*MI, Src1Idx, OtherOp)) {
+          // We didn't need the shift above, so we have an SGPR for the frame
+          // register, but may have a VGPR only operand.
+          //
+          // TODO: On gfx10+, we can easily change the opcode to the e64 version
+          // a
```
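Two steps in the truncated hunk above are easy to model in isolation: the frame register is shifted right by `ST.getWavefrontSizeLog2()` when flat scratch is not enabled, and an immediate operand of the add absorbs the frame object's offset. The following is a rough C++ sketch of just that arithmetic under simplifying assumptions; the function names `scaledFrameBase`, `foldOffsetIntoImm`, and `addFrameIndex` are hypothetical, plain integers stand in for registers, and none of the operand legality or register scavenging logic is represented.

```cpp
#include <cstdint>

// Hypothetical model of the V_LSHRREV_B32 step: without flat scratch, the
// frame register is shifted right by log2(wavefront size) before it is used.
constexpr uint32_t scaledFrameBase(uint32_t frameReg,
                                   unsigned wavefrontSizeLog2) {
  return frameReg >> wavefrontSizeLog2;
}

// Hypothetical model of OtherOp->setImm(OtherOp->getImm() + Offset): the
// constant operand of the add absorbs the frame object's offset.
constexpr int64_t foldOffsetIntoImm(int64_t imm, int64_t objectOffset) {
  return imm + objectOffset;
}

// For an add, (base + objectOffset) + imm == base + (imm + objectOffset),
// so a single add with the folded immediate is enough.
constexpr uint32_t addFrameIndex(uint32_t frameReg, unsigned wavefrontSizeLog2,
                                 int64_t objectOffset, int64_t imm) {
  return scaledFrameBase(frameReg, wavefrontSizeLog2) +
         static_cast<uint32_t>(foldOffsetIntoImm(imm, objectOffset));
}

static_assert(scaledFrameBase(0x4000, 6) == 0x100, "wave64 shifts by 6");
static_assert(scaledFrameBase(0x4000, 5) == 0x200, "wave32 shifts by 5");
static_assert(addFrameIndex(0x4000, 6, 96, 68) == 0x100 + 164,
              "offset 96 and immediate 68 fold to a single constant 164");
```

The fold is valid here because addition is associative; the sketch makes no claim about how other opcodes are handled.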
[llvm-branch-commits] [lld] release/19.x: [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985) (PR #102292)
https://github.com/MaskRay approved this pull request. https://github.com/llvm/llvm-project/pull/102292
[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)
@@ -190,31 +186,31 @@ body: |
     ; MUBUFW64-LABEL: name: s_and_b32__sgpr__fi_literal_offset
     ; MUBUFW64: liveins: $sgpr8
     ; MUBUFW64-NEXT: {{ $}}
-    ; MUBUFW64-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc
-    ; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
-    ; MUBUFW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+    ; MUBUFW64-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+    ; MUBUFW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
     ; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; MUBUFW32-LABEL: name: s_and_b32__sgpr__fi_literal_offset
     ; MUBUFW32: liveins: $sgpr8
     ; MUBUFW32-NEXT: {{ $}}
-    ; MUBUFW32-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def $scc
-    ; MUBUFW32-NEXT: $sgpr4 = S_ADD_I32 killed $sgpr4, 80, implicit-def $scc
-    ; MUBUFW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+    ; MUBUFW32-NEXT: renamable $sgpr4 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr4, $sgpr8, implicit-def $scc
+    ; MUBUFW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
     ; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW64-LABEL: name: s_and_b32__sgpr__fi_literal_offset
     ; FLATSCRW64: liveins: $sgpr8
     ; FLATSCRW64-NEXT: {{ $}}
-    ; FLATSCRW64-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
-    ; FLATSCRW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+    ; FLATSCRW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc
+    ; FLATSCRW64-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr7, 80, implicit-def $scc
     ; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7, implicit $scc
     ;
     ; FLATSCRW32-LABEL: name: s_and_b32__sgpr__fi_literal_offset
     ; FLATSCRW32: liveins: $sgpr8
     ; FLATSCRW32-NEXT: {{ $}}
-    ; FLATSCRW32-NEXT: $sgpr4 = S_ADD_I32 $sgpr32, 80, implicit-def $scc
-    ; FLATSCRW32-NEXT: renamable $sgpr7 = S_AND_B32 $sgpr8, killed $sgpr4, implicit-def $scc
+    ; FLATSCRW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr32, $sgpr8, implicit-def $scc

rampitec wrote:

I do not understand this. The transformation `(s8 & (sp + 80)) -> ((s8 + sp) & 80)` does not look immediately obvious.

https://github.com/llvm/llvm-project/pull/102345
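The reviewer's concern can be checked numerically. The following is a small C++ sketch that evaluates both expressions from the comment for one concrete pair of inputs; the function names `andOfSum` and `sumThenAnd` are hypothetical, `sp` stands for the already-scaled frame register value, and the inputs 256 and 256 are arbitrary values chosen for illustration rather than anything taken from the test.

```cpp
#include <cstdint>

// Models the old check lines: s8 & (sp + offset).
constexpr uint32_t andOfSum(uint32_t s8, uint32_t sp, uint32_t offset) {
  return s8 & (sp + offset);
}

// Models the new check lines: (sp + s8) & offset.
constexpr uint32_t sumThenAnd(uint32_t s8, uint32_t sp, uint32_t offset) {
  return (sp + s8) & offset;
}

// With s8 = 256 and sp = 256:
//   256 & (256 + 80) = 256 & 336 = 256
//   (256 + 256) & 80 = 512 & 80  = 0
static_assert(andOfSum(256, 256, 80) == 256, "old form keeps bit 8");
static_assert(sumThenAnd(256, 256, 80) == 0, "new form clears everything");
```

For these inputs the two forms disagree, which is consistent with the comment that the rewrite is not an obvious identity. The sketch only compares the two arithmetic expressions and does not model the instruction semantics beyond that.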
[llvm-branch-commits] [NFC] [sanitizers] leave BufferedStackTrace uninit in tests (PR #102251)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102251
[llvm-branch-commits] [DFSan] [compiler-rt] leave BufferedStackTrace uninit (PR #102252)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102252
[llvm-branch-commits] [compiler-rt] [UBSan] leave BufferedStackTrace uninit (PR #102253)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102253
[llvm-branch-commits] [compiler-rt] [NSan] leave BufferedStackTrace uninit (PR #102254)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102254
[llvm-branch-commits] [compiler-rt] [TSan] leave BufferedStackTrace uninit (PR #102255)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102255
[llvm-branch-commits] [compiler-rt] [Memprof] leave BufferedStackTrace uninit (PR #102256)
https://github.com/fmayer edited https://github.com/llvm/llvm-project/pull/102256
[llvm-branch-commits] [DFSan] [compiler-rt] leave BufferedStackTrace uninit (PR #102252)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102252
[llvm-branch-commits] [compiler-rt] [UBSan] leave BufferedStackTrace uninit (PR #102253)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102253
[llvm-branch-commits] [compiler-rt] [NSan] leave BufferedStackTrace uninit (PR #102254)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102254
[llvm-branch-commits] [compiler-rt] [TSan] leave BufferedStackTrace uninit (PR #102255)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102255
[llvm-branch-commits] [compiler-rt] [Memprof] leave BufferedStackTrace uninit (PR #102256)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102256
[llvm-branch-commits] [NFC] [sanitizers] leave BufferedStackTrace uninit in tests (PR #102251)
https://github.com/fmayer updated https://github.com/llvm/llvm-project/pull/102251
[llvm-branch-commits] [llvm] release/19.x: workflows: Fix permissions for release-sources job (#100750) (PR #102373)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/102373
[llvm-branch-commits] [llvm] release/19.x: workflows: Fix permissions for release-sources job (#100750) (PR #102373)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/102373

Backport 82c2259aeb87f5cb418decfb6a1961287055e5d2

Requested by: @tstellar

>From d76aaed435edce7e07a760200b7e9aa7eb03b820 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Wed, 7 Aug 2024 14:19:22 -0700
Subject: [PATCH] workflows: Fix permissions for release-sources job (#100750)

For reusable workflows, the called workflow cannot upgrade its permissions, and since the default permission is none, we need to explicitly declare 'contents: read' when calling the release-sources workflow.

Fixes the error:
The workflow is requesting 'contents: read', but is only allowed 'contents: none'.

(cherry picked from commit 82c2259aeb87f5cb418decfb6a1961287055e5d2)
---
 .github/workflows/release-tasks.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 7dd4c306671b74..deacc24f54e077 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -99,6 +99,7 @@ jobs:
   release-sources:
     name: Package Release Sources
     permissions:
+      contents: read
       id-token: write
       attestations: write
     needs:
[llvm-branch-commits] [llvm] release/19.x: workflows: Fix permissions for release-sources job (#100750) (PR #102373)
llvmbot wrote: @tru What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/102373
[llvm-branch-commits] [llvm] release/19.x: workflows: Fix permissions for release-sources job (#100750) (PR #102373)
llvmbot wrote:

@llvm/pr-subscribers-github-workflow

Author: None (llvmbot)

Changes

Backport 82c2259aeb87f5cb418decfb6a1961287055e5d2

Requested by: @tstellar

---
Full diff: https://github.com/llvm/llvm-project/pull/102373.diff

1 Files Affected:

- (modified) .github/workflows/release-tasks.yml (+1)

```diff
diff --git a/.github/workflows/release-tasks.yml b/.github/workflows/release-tasks.yml
index 7dd4c306671b74..deacc24f54e077 100644
--- a/.github/workflows/release-tasks.yml
+++ b/.github/workflows/release-tasks.yml
@@ -99,6 +99,7 @@ jobs:
   release-sources:
     name: Package Release Sources
     permissions:
+      contents: read
       id-token: write
       attestations: write
     needs:
```

https://github.com/llvm/llvm-project/pull/102373
[llvm-branch-commits] [compiler-rt] [TSan] leave BufferedStackTrace uninit (PR #102255)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102255
[llvm-branch-commits] [compiler-rt] [Memprof] leave BufferedStackTrace uninit (PR #102256)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102256
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
@@ -2923,22 +2923,47 @@ bool Darwin::isAlignedAllocationUnavailable() const {
   return TargetVersion < alignedAllocMinVersion(OS);
 }
 
-static bool sdkSupportsBuiltinModules(const Darwin::DarwinPlatformKind &TargetPlatform, const std::optional &SDKInfo) {
+static bool sdkSupportsBuiltinModules(
+    const Darwin::DarwinPlatformKind &TargetPlatform,
+    const Darwin::DarwinEnvironmentKind &TargetEnvironment,
+    const std::optional &SDKInfo) {
+  switch (TargetEnvironment) {
+  case Darwin::NativeEnvironment:
+  case Darwin::Simulator:
+  case Darwin::MacCatalyst:
+    // Standard xnu/Mach/Darwin based environments
+    // depend on the SDK version.
+    break;
+  default:

chapuni wrote:

It might cause a warning. Fixed in 0f1361baf650641a59aaa1710d7a0b7b02f2e56d.

@kazutakahirata I suggest mentioning (e.g. #102239) in your commit message, to notify fixups.

https://github.com/llvm/llvm-project/pull/102335
[llvm-branch-commits] [compiler-rt] [NSan] leave BufferedStackTrace uninit (PR #102254)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102254
[llvm-branch-commits] [compiler-rt] [UBSan] leave BufferedStackTrace uninit (PR #102253)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102253
[llvm-branch-commits] [NFC] [sanitizers] leave BufferedStackTrace uninit in tests (PR #102251)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102251
[llvm-branch-commits] [DFSan] [compiler-rt] leave BufferedStackTrace uninit (PR #102252)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/102252
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
@@ -2923,22 +2923,47 @@ bool Darwin::isAlignedAllocationUnavailable() const {
   return TargetVersion < alignedAllocMinVersion(OS);
 }
 
-static bool sdkSupportsBuiltinModules(const Darwin::DarwinPlatformKind &TargetPlatform, const std::optional &SDKInfo) {
+static bool sdkSupportsBuiltinModules(
+    const Darwin::DarwinPlatformKind &TargetPlatform,
+    const Darwin::DarwinEnvironmentKind &TargetEnvironment,
+    const std::optional &SDKInfo) {
+  switch (TargetEnvironment) {
+  case Darwin::NativeEnvironment:
+  case Darwin::Simulator:
+  case Darwin::MacCatalyst:
+    // Standard xnu/Mach/Darwin based environments
+    // depend on the SDK version.
+    break;
+  default:

ian-twilightcoder wrote:

Should we drop the `default` instead? I guess we could also drop the TargetEnvironment argument completely since it's not currently needed.

https://github.com/llvm/llvm-project/pull/102335
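The `default:` question in this review thread can be looked at in isolation. Below is a minimal sketch, not the code from the PR: `EnvironmentKind` and both function names are hypothetical stand-ins, and it assumes the warning under discussion is Clang's `-Wcovered-switch-default`, which the thread itself does not name. It contrasts a switch that lists every enumerator and still has a `default:` label with one that drops the label.

```cpp
// Hypothetical stand-in for the environment enum in the diff; only the
// three enumerator names are taken from the quoted hunk.
enum class EnvironmentKind { Native, Simulator, MacCatalyst };

// Variant A: every enumerator has a case label AND there is a default.
// Assumption: with -Wcovered-switch-default, Clang reports the default
// label here because it can only be reached by out-of-range values.
bool dependsOnSdkVersionWithDefault(EnvironmentKind Env) {
  switch (Env) {
  case EnvironmentKind::Native:
  case EnvironmentKind::Simulator:
  case EnvironmentKind::MacCatalyst:
    return true;
  default:
    return false;
  }
}

// Variant B: the default label is dropped. If a new enumerator is added
// later, -Wswitch can point at the missing case instead of the default
// silently absorbing it.
bool dependsOnSdkVersion(EnvironmentKind Env) {
  switch (Env) {
  case EnvironmentKind::Native:
  case EnvironmentKind::Simulator:
  case EnvironmentKind::MacCatalyst:
    return true;
  }
  // Only reachable for values outside the enumerator range.
  return false;
}
```

Both variants return the same result for every valid enumerator; the difference is only in which diagnostics the compiler can produce. Compiling the sketch with `clang++ -Wcovered-switch-default` should show whether variant A is flagged on a given toolchain.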
[llvm-branch-commits] [NFC] [sanitizers] leave BufferedStackTrace uninit in tests (PR #102251)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102251
[llvm-branch-commits] [DFSan] [compiler-rt] leave BufferedStackTrace uninit (PR #102252)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102252
[llvm-branch-commits] [compiler-rt] [UBSan] leave BufferedStackTrace uninit (PR #102253)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102253
[llvm-branch-commits] [compiler-rt] [NSan] leave BufferedStackTrace uninit (PR #102254)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102254
[llvm-branch-commits] [compiler-rt] [TSan] leave BufferedStackTrace uninit (PR #102255)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102255
[llvm-branch-commits] [compiler-rt] [Memprof] leave BufferedStackTrace uninit (PR #102256)
https://github.com/fmayer closed https://github.com/llvm/llvm-project/pull/102256
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
https://github.com/ian-twilightcoder edited https://github.com/llvm/llvm-project/pull/102335
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
https://github.com/ian-twilightcoder deleted https://github.com/llvm/llvm-project/pull/102335
[llvm-branch-commits] [compiler-rt] [sanitizer_common] Fix internal_*stat on Linux/sparc64 (PR #101236)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/101236
[llvm-branch-commits] [clang] release/19.x: [clang][modules] Enable built-in modules for the upcoming Apple releases (#102239) (PR #102335)
@@ -2923,22 +2923,47 @@ bool Darwin::isAlignedAllocationUnavailable() const {
   return TargetVersion < alignedAllocMinVersion(OS);
 }
 
-static bool sdkSupportsBuiltinModules(const Darwin::DarwinPlatformKind &TargetPlatform, const std::optional &SDKInfo) {
+static bool sdkSupportsBuiltinModules(
+    const Darwin::DarwinPlatformKind &TargetPlatform,
+    const Darwin::DarwinEnvironmentKind &TargetEnvironment,
+    const std::optional &SDKInfo) {
+  switch (TargetEnvironment) {
+  case Darwin::NativeEnvironment:
+  case Darwin::Simulator:
+  case Darwin::MacCatalyst:
+    // Standard xnu/Mach/Darwin based environments
+    // depend on the SDK version.
+    break;
+  default:

kazutakahirata wrote:

Will do next time. Thanks!

https://github.com/llvm/llvm-project/pull/102335
[llvm-branch-commits] [compiler-rt] [llvm] release/19.x: [MC/DC][Coverage] Introduce "Bitmap Bias" for continuous mode (#96126) (PR #101629)
vitalybuka wrote: If this is a new feature, why does it need a backport? https://github.com/llvm/llvm-project/pull/101629
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common] Fix internal_*stat on Linux/sparc64 (#101012) (PR #101143)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/101143
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common][test] Fix SanitizerIoctl/KVM_GET_* tests on Linux/… (#100532) (PR #101136)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/101136
[llvm-branch-commits] [compiler-rt] release/19.x: [sanitizer_common] Don't use syscall(SYS_clone) on Linux/sparc64 (#100534) (PR #101137)
https://github.com/vitalybuka approved this pull request. https://github.com/llvm/llvm-project/pull/101137