[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
andrey-golubev wrote:

> Here we have to wait for the build bots, as they are mandatory.

Finished, so asking for someone to press the merge button. Thanks in advance!

https://github.com/llvm/llvm-project/pull/79603
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)
jayfoad wrote: @tstellar does this backport PR look OK? I created it with `gh pr create -f -B release/18.x` and I wasn't sure if I had to edit anything, apart from adding the release milestone. https://github.com/llvm/llvm-project/pull/79689
[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)
https://github.com/jayfoad closed https://github.com/llvm/llvm-project/pull/79689
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79813

resolves llvm/llvm-project#79800

>From 0f9256b72bcda1b0444bd302aa22ede428c73a54 Mon Sep 17 00:00:00 2001
From: David Sherwood <57997763+david-...@users.noreply.github.com>
Date: Fri, 26 Jan 2024 14:43:48 +
Subject: [PATCH] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034)

When we generate runtime memory checks for an inner loop it's
possible that these checks are invariant in the outer loop and
so will get hoisted out. In such cases, the effective cost of
the checks should reduce to reflect the outer loop trip count.

This fixes a 25% performance regression introduced by commit
49b0e6dcc296792b577ae8f0f674e61a0929b99d when building the
SPEC2017 x264 benchmark with PGO, where we decided the inner loop
trip count wasn't high enough to warrant the (incorrect) high cost
of the runtime checks. Also, when runtime memory checks consist
entirely of diff checks these are likely to be outer loop invariant.

(cherry picked from commit 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c)
---
 .../Transforms/Vectorize/LoopVectorize.cpp    | 62 -
 .../AArch64/low_trip_memcheck_cost.ll         | 217 ++
 2 files changed, 273 insertions(+), 6 deletions(-)
 create mode 100644 llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6ca93e15719fb27..dd596c567cd4824 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1957,6 +1957,8 @@ class GeneratedRTChecks {
   bool CostTooHigh = false;
   const bool AddBranchWeights;

+  Loop *OuterLoop = nullptr;
+
 public:
   GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI,
                     TargetTransformInfo *TTI, const DataLayout &DL,
@@ -2053,6 +2055,9 @@ class GeneratedRTChecks {
       DT->eraseNode(SCEVCheckBlock);
       LI->removeBlock(SCEVCheckBlock);
     }
+
+    // Outer loop is used as part of the later cost calculations.
+    OuterLoop = L->getParentLoop();
   }

   InstructionCost getCost() {
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks {
         LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n");
         RTCheckCost += C;
       }
-    if (MemCheckBlock)
+    if (MemCheckBlock) {
+      InstructionCost MemCheckCost = 0;
       for (Instruction &I : *MemCheckBlock) {
         if (MemCheckBlock->getTerminator() == &I)
           continue;
         InstructionCost C =
             TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput);
         LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n");
-        RTCheckCost += C;
+        MemCheckCost += C;
       }

+      // If the runtime memory checks are being created inside an outer loop
+      // we should find out if these checks are outer loop invariant. If so,
+      // the checks will likely be hoisted out and so the effective cost will
+      // reduce according to the outer loop trip count.
+      if (OuterLoop) {
+        ScalarEvolution *SE = MemCheckExp.getSE();
+        // TODO: If profitable, we could refine this further by analysing every
+        // individual memory check, since there could be a mixture of loop
+        // variant and invariant checks that mean the final condition is
+        // variant.
+        const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond);
+        if (SE->isLoopInvariant(Cond, OuterLoop)) {
+          // It seems reasonable to assume that we can reduce the effective
+          // cost of the checks even when we know nothing about the trip
+          // count. Assume that the outer loop executes at least twice.
+          unsigned BestTripCount = 2;
+
+          // If exact trip count is known use that.
+          if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
+            BestTripCount = SmallTC;
+          else if (LoopVectorizeWithBlockFrequency) {
+            // Else use profile data if available.
+            if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
+              BestTripCount = *EstimatedTC;
+          }
+
+          InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount;
+
+          // Let's ensure the cost is always at least 1.
+          NewMemCheckCost = std::max(*NewMemCheckCost.getValue(),
+                                     (InstructionCost::CostType)1);
+
+          LLVM_DEBUG(dbgs()
+                     << "We expect runtime memory checks to be hoisted "
+                     << "out of the outer loop. Cost reduced from "
+                     << MemCheckCost << " to " << NewMemCheckCost << '\n');
+
+          MemCheckCost = NewMemCheckCost;
+        }
+      }
+
+      RTCheckCost += MemCheckCost;
+    }
+
     if (SCEVCheckBlock || MemCheckBlock)
       LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost
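
The cost adjustment in the patch above can be read in isolation as "divide the memory-check cost by the best available outer-loop trip count, but never go below 1". The following is a minimal standalone sketch of that arithmetic only, not the LLVM implementation: `OuterLoopInfo` and `scaledMemCheckCost` are hypothetical names, plain integers stand in for `InstructionCost`, and the guard against a zero profiled trip count is an added assumption.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the trip-count facts the patch obtains from
// ScalarEvolution and profile data.
struct OuterLoopInfo {
  unsigned ExactTripCount = 0;               // 0 means "not known"
  std::optional<unsigned> ProfiledTripCount; // empty when no profile data
};

// Effective cost of memory checks that are invariant in the outer loop.
int64_t scaledMemCheckCost(int64_t MemCheckCost, const OuterLoopInfo &Outer) {
  // With no information, assume the outer loop runs at least twice.
  unsigned BestTripCount = 2;
  if (Outer.ExactTripCount != 0)
    BestTripCount = Outer.ExactTripCount; // an exact count wins
  else if (Outer.ProfiledTripCount && *Outer.ProfiledTripCount != 0)
    BestTripCount = *Outer.ProfiledTripCount; // otherwise use the estimate

  // Integer division, clamped so the checks never appear free.
  return std::max<int64_t>(MemCheckCost / BestTripCount, 1);
}
```

Under these assumptions a check cost of 24 becomes 12 with no trip-count information, 3 with an exact trip count of 8, and is clamped to 1 once the trip count exceeds the cost.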
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79813
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
llvmbot wrote: @david-arm What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79813
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
llvmbot wrote: @llvm/pr-subscribers-llvm-transforms Author: None (llvmbot) Changes resolves llvm/llvm-project#79800 --- Full diff: https://github.com/llvm/llvm-project/pull/79813.diff 2 Files Affected: - (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+56-6) - (added) llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll (+217) ``diff diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp index 6ca93e15719fb27..dd596c567cd4824 100644 --- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1957,6 +1957,8 @@ class GeneratedRTChecks { bool CostTooHigh = false; const bool AddBranchWeights; + Loop *OuterLoop = nullptr; + public: GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI, TargetTransformInfo *TTI, const DataLayout &DL, @@ -2053,6 +2055,9 @@ class GeneratedRTChecks { DT->eraseNode(SCEVCheckBlock); LI->removeBlock(SCEVCheckBlock); } + +// Outer loop is used as part of the later cost calculations. +OuterLoop = L->getParentLoop(); } InstructionCost getCost() { @@ -2076,16 +2081,61 @@ class GeneratedRTChecks { LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n"); RTCheckCost += C; } -if (MemCheckBlock) +if (MemCheckBlock) { + InstructionCost MemCheckCost = 0; for (Instruction &I : *MemCheckBlock) { if (MemCheckBlock->getTerminator() == &I) continue; InstructionCost C = TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput); LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n"); -RTCheckCost += C; +MemCheckCost += C; } + // If the runtime memory checks are being created inside an outer loop + // we should find out if these checks are outer loop invariant. If so, + // the checks will likely be hoisted out and so the effective cost will + // reduce according to the outer loop trip count. 
+ if (OuterLoop) { +ScalarEvolution *SE = MemCheckExp.getSE(); +// TODO: If profitable, we could refine this further by analysing every +// individual memory check, since there could be a mixture of loop +// variant and invariant checks that mean the final condition is +// variant. +const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond); +if (SE->isLoopInvariant(Cond, OuterLoop)) { + // It seems reasonable to assume that we can reduce the effective + // cost of the checks even when we know nothing about the trip + // count. Assume that the outer loop executes at least twice. + unsigned BestTripCount = 2; + + // If exact trip count is known use that. + if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop)) +BestTripCount = SmallTC; + else if (LoopVectorizeWithBlockFrequency) { +// Else use profile data if available. +if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop)) + BestTripCount = *EstimatedTC; + } + + InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount; + + // Let's ensure the cost is always at least 1. + NewMemCheckCost = std::max(*NewMemCheckCost.getValue(), + (InstructionCost::CostType)1); + + LLVM_DEBUG(dbgs() + << "We expect runtime memory checks to be hoisted " + << "out of the outer loop. Cost reduced from " + << MemCheckCost << " to " << NewMemCheckCost << '\n'); + + MemCheckCost = NewMemCheckCost; +} + } + + RTCheckCost += MemCheckCost; +} + if (SCEVCheckBlock || MemCheckBlock) LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost << "\n"); @@ -2144,8 +2194,8 @@ class GeneratedRTChecks { BranchInst::Create(LoopVectorPreHeader, SCEVCheckBlock); // Create new preheader for vector loop. 
-if (auto *PL = LI->getLoopFor(LoopVectorPreHeader)) - PL->addBasicBlockToLoop(SCEVCheckBlock, *LI); +if (OuterLoop) + OuterLoop->addBasicBlockToLoop(SCEVCheckBlock, *LI); SCEVCheckBlock->getTerminator()->eraseFromParent(); SCEVCheckBlock->moveBefore(LoopVectorPreHeader); @@ -2179,8 +2229,8 @@ class GeneratedRTChecks { DT->changeImmediateDominator(LoopVectorPreHeader, MemCheckBlock); MemCheckBlock->moveBefore(LoopVectorPreHeader); -if (auto *PL = LI->getLoopFor(LoopVectorPreHeader)) - PL->addBasicBlockToLoop(MemCheckBlock, *LI); +if (OuterLoop) + OuterLoop->addBasicBlockToLoop(MemCheckBlock, *LI); BranchInst &BI = *BranchInst::Create(Bypass, LoopVectorPreHeader, MemRuntimeCheckCond); diff --git a/llvm/test/Transforms/LoopVectorize/AArch6
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79814 resolves llvm/llvm-project#79756 >From 2ca45150c7984eea123409e6a7d25b2c7606ef5c Mon Sep 17 00:00:00 2001 From: David Green Date: Sun, 28 Jan 2024 17:01:21 + Subject: [PATCH] Revert "[AArch64] merge index address with large offset into base address" This reverts commit 32878c2065c8005b3ea30c79e16dfd7eed55d645 due to #79756 and #76202. (cherry picked from commit 915c3d9e5a2d1314afe64cd6116a3b6c9809ec90) --- llvm/lib/Target/AArch64/AArch64InstrInfo.cpp | 10 - llvm/lib/Target/AArch64/AArch64InstrInfo.h| 3 - .../AArch64/AArch64LoadStoreOptimizer.cpp | 229 -- llvm/test/CodeGen/AArch64/arm64-addrmode.ll | 15 +- .../AArch64/large-offset-ldr-merge.mir| 5 +- 5 files changed, 12 insertions(+), 250 deletions(-) diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp index 2e8d8c63d6bec2..13e9d9725cc2ed 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp @@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr &MI) { return MI.getOperand(Idx); } -const MachineOperand & -AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) { - switch (MI.getOpcode()) { - default: -llvm_unreachable("Unexpected opcode"); - case AArch64::LDRBBroX: -return MI.getOperand(4); - } -} - static const TargetRegisterClass *getRegClass(const MachineInstr &MI, Register Reg) { if (MI.getParent() == nullptr) diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h index db24a19fe5f8e3..6526f6740747ab 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h @@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo { /// Returns the immediate offset operator of a load/store. 
static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI); - /// Returns the shift amount operator of a load/store. - static const MachineOperand &getLdStAmountOp(const MachineInstr &MI); - /// Returns whether the instruction is FP or NEON. static bool isFpOrNEON(const MachineInstr &MI); diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp index e90b8a8ca7acee..926a89466255ca 100644 --- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp @@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated, "Number of load/store from unscaled generated"); STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted"); STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted"); -STATISTIC(NumConstOffsetFolded, - "Number of const offset of index address folded"); DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming", "Controls which pairs are considered for renaming"); @@ -77,11 +75,6 @@ static cl::opt LdStLimit("aarch64-load-store-scan-limit", static cl::opt UpdateLimit("aarch64-update-scan-limit", cl::init(100), cl::Hidden); -// The LdStConstLimit limits how far we search for const offset instructions -// when we form index address load/store instructions. -static cl::opt LdStConstLimit("aarch64-load-store-const-scan-limit", -cl::init(10), cl::Hidden); - // Enable register renaming to find additional store pairing opportunities. static cl::opt EnableRenaming("aarch64-load-store-renaming", cl::init(true), cl::Hidden); @@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { findMatchingUpdateInsnForward(MachineBasicBlock::iterator I, int UnscaledOffset, unsigned Limit); - // Scan the instruction list to find a register assigned with a const - // value that can be combined with the current instruction (a load or store) - // using base addressing with writeback. Scan forwards. 
- MachineBasicBlock::iterator - findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit, - unsigned &Offset); - // Scan the instruction list to find a base register update that can // be combined with the current instruction (a load or store) using // pre or post indexed addressing with writeback. Scan backwards. @@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI, unsigned BaseReg, int Offset); - bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI, - unsigned IndexReg, unsigned &Offset); - // Merge a pre- or post-index base regi
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79814
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)
llvmbot wrote: @llvm/pr-subscribers-backend-aarch64 Author: None (llvmbot) Changes resolves llvm/llvm-project#79756 --- Full diff: https://github.com/llvm/llvm-project/pull/79814.diff 5 Files Affected: - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (-10) - (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (-3) - (modified) llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp (-229) - (modified) llvm/test/CodeGen/AArch64/arm64-addrmode.ll (+9-6) - (modified) llvm/test/CodeGen/AArch64/large-offset-ldr-merge.mir (+3-2) ``diff diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp index 2e8d8c63d6bec24..13e9d9725cc2ed1 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp @@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr &MI) { return MI.getOperand(Idx); } -const MachineOperand & -AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) { - switch (MI.getOpcode()) { - default: -llvm_unreachable("Unexpected opcode"); - case AArch64::LDRBBroX: -return MI.getOperand(4); - } -} - static const TargetRegisterClass *getRegClass(const MachineInstr &MI, Register Reg) { if (MI.getParent() == nullptr) diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h index db24a19fe5f8e3c..6526f6740747abb 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h @@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo { /// Returns the immediate offset operator of a load/store. static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI); - /// Returns the shift amount operator of a load/store. - static const MachineOperand &getLdStAmountOp(const MachineInstr &MI); - /// Returns whether the instruction is FP or NEON. 
static bool isFpOrNEON(const MachineInstr &MI); diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp index e90b8a8ca7aceee..926a89466255cab 100644 --- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp @@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated, "Number of load/store from unscaled generated"); STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted"); STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted"); -STATISTIC(NumConstOffsetFolded, - "Number of const offset of index address folded"); DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming", "Controls which pairs are considered for renaming"); @@ -77,11 +75,6 @@ static cl::opt LdStLimit("aarch64-load-store-scan-limit", static cl::opt UpdateLimit("aarch64-update-scan-limit", cl::init(100), cl::Hidden); -// The LdStConstLimit limits how far we search for const offset instructions -// when we form index address load/store instructions. -static cl::opt LdStConstLimit("aarch64-load-store-const-scan-limit", -cl::init(10), cl::Hidden); - // Enable register renaming to find additional store pairing opportunities. static cl::opt EnableRenaming("aarch64-load-store-renaming", cl::init(true), cl::Hidden); @@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { findMatchingUpdateInsnForward(MachineBasicBlock::iterator I, int UnscaledOffset, unsigned Limit); - // Scan the instruction list to find a register assigned with a const - // value that can be combined with the current instruction (a load or store) - // using base addressing with writeback. Scan forwards. 
- MachineBasicBlock::iterator - findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit, - unsigned &Offset); - // Scan the instruction list to find a base register update that can // be combined with the current instruction (a load or store) using // pre or post indexed addressing with writeback. Scan backwards. @@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI, unsigned BaseReg, int Offset); - bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI, - unsigned IndexReg, unsigned &Offset); - // Merge a pre- or post-index base register update into a ld/st instruction. MachineBasicBlock::iterator mergeUpdateInsn(MachineBasicBlock::iterator I, MachineBasicBlock::iterator Update, bool IsPreIdx); - MachineBasicBlock::iterator - mergeConstOffsetInsn(MachineBasicBlock::iterator
[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)
https://github.com/tstellar created https://github.com/llvm/llvm-project/pull/79836

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added quotes around the EXTRA_ARGS variable.

>From 4649293daee971fe03dba59ee54e2c2a0b86 Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Mon, 29 Jan 2024 06:30:22 -0800
Subject: [PATCH] [workflows] Fix argument passing in abi-dump jobs (#79658)

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added
quotes around the EXTRA_ARGS variable.
---
 .github/workflows/llvm-tests.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index cc9855ce182b2b8..8a53abee376716f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -143,7 +143,7 @@ jobs:
           else
             touch llvm.symbols
           fi
-          abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+          abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
           # Remove symbol versioning from dumps, so we can compare across major versions.
           sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
       - name: Upload ABI file
@@ -193,7 +193,7 @@ jobs:
           # FIXME: Reading of gzip'd abi files on the GitHub runners stop
           # working some time in March of 2021, likely due to a change in the
           # runner's environment.
-          abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+          abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
       - name: Upload ABI Comparison
         if: always()
         uses: actions/upload-artifact@v3
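
Why the quotes matter: in a POSIX shell, a quoted `"$EXTRA_ARGS"` always expands to exactly one argument (even when the value is empty or holds several flags), while an unquoted `$EXTRA_ARGS` is split on whitespace into zero or more arguments. The sketch below is only a simplified C++ model of that difference, not shell code and not part of the workflow: `expandQuoted` and `expandUnquoted` are hypothetical helpers, the default whitespace `IFS` is assumed, and pathname expansion is deliberately not modelled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Models "$EXTRA_ARGS": the value is passed through as a single argument,
// whether it is empty or contains spaces.
std::vector<std::string> expandQuoted(const std::string &Value) {
  return {Value};
}

// Models unquoted $EXTRA_ARGS under the default IFS: the value is split on
// runs of whitespace, and an empty value contributes no arguments at all.
std::vector<std::string> expandUnquoted(const std::string &Value) {
  std::vector<std::string> Fields;
  std::istringstream Stream(Value);
  for (std::string Field; Stream >> Field;)
    Fields.push_back(Field);
  return Fields;
}
```

With a made-up value such as `-flag-a -flag-b`, the quoted form hands the tool one argument containing a space, and with an empty value it hands over one empty-string argument. The unquoted form yields two arguments and zero arguments respectively, which is the behaviour the patch restores.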
[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)
llvmbot wrote:

@llvm/pr-subscribers-github-workflow

Author: Tom Stellard (tstellar)

Changes

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added quotes around the EXTRA_ARGS variable.

---
Full diff: https://github.com/llvm/llvm-project/pull/79836.diff

1 Files Affected:

- (modified) .github/workflows/llvm-tests.yml (+2-2)

```diff
diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index cc9855ce182b2b8..8a53abee376716f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -143,7 +143,7 @@ jobs:
           else
             touch llvm.symbols
           fi
-          abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+          abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
           # Remove symbol versioning from dumps, so we can compare across major versions.
           sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
       - name: Upload ABI file
@@ -193,7 +193,7 @@ jobs:
           # FIXME: Reading of gzip'd abi files on the GitHub runners stop
           # working some time in March of 2021, likely due to a change in the
           # runner's environment.
-          abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+          abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
       - name: Upload ABI Comparison
         if: always()
         uses: actions/upload-artifact@v3
```

https://github.com/llvm/llvm-project/pull/79836
[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)
https://github.com/tstellar updated https://github.com/llvm/llvm-project/pull/79836

>From 01d7ece20a46cec1bc1ef512d9961ee134ca73bd Mon Sep 17 00:00:00 2001
From: Tom Stellard
Date: Mon, 29 Jan 2024 06:30:22 -0800
Subject: [PATCH] [workflows] Fix argument passing in abi-dump jobs (#79658)

This was broken by 859e6aa1008b80d9b10657bac37822a32ee14a23, which added
quotes around the EXTRA_ARGS variable.
---
 .github/workflows/llvm-tests.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index 63f0f3abfd70a5b..127628d76f1913f 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -125,7 +125,7 @@ jobs:
           else
             touch llvm.symbols
           fi
-          abi-dumper "$EXTRA_ARGS" -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
+          abi-dumper $EXTRA_ARGS -lver ${{ matrix.ref }} -skip-cxx -public-headers ./install/include/${{ needs.abi-dump-setup.outputs.ABI_HEADERS }} -o ${{ matrix.ref }}.abi ./install/lib/libLLVM.so
           # Remove symbol versioning from dumps, so we can compare across major versions.
           sed -i 's/LLVM_${{ matrix.llvm_version_major }}/LLVM_NOVERSION/' ${{ matrix.ref }}.abi
       - name: Upload ABI file
@@ -175,7 +175,7 @@ jobs:
           # FIXME: Reading of gzip'd abi files on the GitHub runners stop
           # working some time in March of 2021, likely due to a change in the
           # runner's environment.
-          abi-compliance-checker "$EXTRA_ARGS" -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
+          abi-compliance-checker $EXTRA_ARGS -l libLLVM.so -old build-baseline/*.abi -new build-latest/*.abi || test "${{ needs.abi-dump-setup.outputs.ABI_HEADERS }}" = "llvm-c"
       - name: Upload ABI Comparison
         if: always()
         uses: actions/upload-artifact@v3
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/79839

This just missed the branch creation and is the last piece of functionality required to get AMDGPU GFX12 support working in the 18.x release.

>From c265c8527285075a58b2425198dbd4cca8b69477 Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Thu, 25 Jan 2024 07:48:06 +
Subject: [PATCH] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)

This is only valid on targets with architected SGPRs.
---
 llvm/include/llvm/IR/IntrinsicsAMDGPU.td      | 4 ++
 .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 19 ++
 llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h  | 1 +
 llvm/lib/Target/AMDGPU/SIISelLowering.cpp     | 14 +
 llvm/lib/Target/AMDGPU/SIISelLowering.h       | 1 +
 .../CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll     | 61 +++
 6 files changed, 100 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll

diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
index 9eb1ac8e27befb1..c5f43d17d1c1481 100644
--- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td
@@ -2777,6 +2777,10 @@ class AMDGPULoadTr:

 def int_amdgcn_global_load_tr : AMDGPULoadTr;

+// i32 @llvm.amdgcn.wave.id()
+def int_amdgcn_wave_id :
+  DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>;
+
 //===--===//
 // Deep learning intrinsics.
 //===--===//
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 615685822f91eeb..e98ede88a7e2db9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }

+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
+                               AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto LSB = B.buildConstant(S32, 25);
+  auto Width = B.buildConstant(S32, 5);
+  B.buildUbfx(DstReg, TTMP8, LSB, Width);
+  MI.eraseFromParent();
+  return true;
+}
+
 bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
                                             MachineInstr &MI) const {
   MachineIRBuilder &B = Helper.MIRBuilder;
@@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper,
   case Intrinsic::amdgcn_workgroup_id_z:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::WORKGROUP_ID_Z);
+  case Intrinsic::amdgcn_wave_id:
+    return legalizeWaveID(MI, B);
   case Intrinsic::amdgcn_lds_kernel_id:
     return legalizePreloadedArgIntrin(MI, MRI, B,
                                       AMDGPUFunctionArgInfo::LDS_KERNEL_ID);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
index 56aabd4f6ab71b6..ecbe42681c6690c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
@@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo {

   bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const;
   bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const;
+  bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const;

   bool legalizeImageIntrinsic(
       MachineInstr &MI, MachineIRBuilder &B,
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index d60f511302613e1..c5ad9da88ec2b31 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc,
   return Loads[0];
 }

+SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!Subtarget->hasArchitectedSGPRs())
+    return {};
+  SDLoc SL(Op);
+  MVT VT = MVT::i32;
+  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
+                                       AMDGPU::TTMP8, VT, SL);
+  return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
+                     DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
+}
+
 SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op,
                                           unsigned Dim,
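
Both lowerings in the patch above perform the same unsigned bitfield extract: take 5 bits of TTMP8 starting at bit 25, matching the comment "waveIDinGroup is in TTMP8[29:25]". As a plain-integer illustration of that operation only (not of the LLVM builders), a minimal sketch follows. `extractBits` and `waveIdInGroup` are hypothetical helper names, and the register value is just a `uint32_t` argument.

```cpp
#include <cstdint>

// Unsigned bitfield extract: returns Width bits of Value starting at bit LSB.
// Assumes 0 < Width < 32 and LSB + Width <= 32.
constexpr uint32_t extractBits(uint32_t Value, unsigned LSB, unsigned Width) {
  return (Value >> LSB) & ((uint32_t{1} << Width) - 1);
}

// Per the comment in the patch, the wave ID within the group occupies
// bits [29:25] of TTMP8, i.e. LSB = 25 and Width = 5.
constexpr uint32_t waveIdInGroup(uint32_t TTMP8) {
  return extractBits(TTMP8, /*LSB=*/25, /*Width=*/5);
}
```

A 5-bit field can hold values 0 through 31, and bits outside [29:25] do not affect the result.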
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/jayfoad milestoned https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Jay Foad (jayfoad) Changes This just missed the branch creation and is the last piece of functionality required to get AMDGPU GFX12 support working in the 18.x release. --- Full diff: https://github.com/llvm/llvm-project/pull/79839.diff 6 Files Affected: - (modified) llvm/include/llvm/IR/IntrinsicsAMDGPU.td (+4) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+19) - (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h (+1) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.cpp (+14) - (modified) llvm/lib/Target/AMDGPU/SIISelLowering.h (+1) - (added) llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll (+61) ``diff diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 9eb1ac8e27befb1..c5f43d17d1c1481 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -2777,6 +2777,10 @@ class AMDGPULoadTr: def int_amdgcn_global_load_tr : AMDGPULoadTr; +// i32 @llvm.amdgcn.wave.id() +def int_amdgcn_wave_id : + DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>; + //===--===// // Deep learning intrinsics. //===--===// diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp index 615685822f91eeb..e98ede88a7e2db9 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp @@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI, return true; } +bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI, + MachineIRBuilder &B) const { + // With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. 
+ if (!ST.hasArchitectedSGPRs()) +return false; + LLT S32 = LLT::scalar(32); + Register DstReg = MI.getOperand(0).getReg(); + Register TTMP8 = + getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8, + AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32); + auto LSB = B.buildConstant(S32, 25); + auto Width = B.buildConstant(S32, 5); + B.buildUbfx(DstReg, TTMP8, LSB, Width); + MI.eraseFromParent(); + return true; +} + bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, MachineInstr &MI) const { MachineIRBuilder &B = Helper.MIRBuilder; @@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, case Intrinsic::amdgcn_workgroup_id_z: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::WORKGROUP_ID_Z); + case Intrinsic::amdgcn_wave_id: +return legalizeWaveID(MI, B); case Intrinsic::amdgcn_lds_kernel_id: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::LDS_KERNEL_ID); diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h index 56aabd4f6ab71b6..ecbe42681c6690c 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h @@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo { bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const; + bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeImageIntrinsic( MachineInstr &MI, MachineIRBuilder &B, diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index d60f511302613e1..c5ad9da88ec2b31 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc, return Loads[0]; } +SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const 
{ + // With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. + if (!Subtarget->hasArchitectedSGPRs()) +return {}; + SDLoc SL(Op); + MVT VT = MVT::i32; + SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass, + AMDGPU::TTMP8, VT, SL); + return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8, + DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT)); +} + SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op, unsigned Dim, const ArgDescriptor &Arg) const { @@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op, case Intrinsic::amdgcn_workgroup_id_z: return getPreloadedValue(DAG, *MFI, VT,
[llvm-branch-commits] [llvm] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) (PR #79689)
jayfoad wrote: > jayfoad closed this by deleting the head repository 3 hours ago Sorry. Recreated as #79839 https://github.com/llvm/llvm-project/pull/79689
[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79841

resolves llvm/llvm-project#79838

>From 672a561cf963fbcee33de65efe25f220c2c21173 Mon Sep 17 00:00:00 2001
From: Sam James
Date: Wed, 24 Jan 2024 08:23:03 +
Subject: [PATCH] [sanitizer] Handle Gentoo's libstdc++ path

On Gentoo, libc++ is indeed in /usr/include/c++/*, but libstdc++ is at e.g.
/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14.

Use '/include/g++' as it should be unique enough. Note that the omission of a
trailing slash is intentional to match g++-*.

See https://github.com/llvm/llvm-project/pull/78534#issuecomment-1904145839.

Reviewed by: mgorny
Closes: https://github.com/llvm/llvm-project/pull/79264
Signed-off-by: Sam James
(cherry picked from commit e8f882f83acf30d9b4da8846bd26314139660430)
---
 .../lib/sanitizer_common/sanitizer_symbolizer_report.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
index 8438e019591b58a..f6b157c07c6557c 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
@@ -34,8 +34,10 @@ static bool FrameIsInternal(const SymbolizedStack *frame) {
     return true;
   const char *file = frame->info.file;
   const char *module = frame->info.module;
+  // On Gentoo, the path is g++-*, so there's *not* a missing /.
   if (file && (internal_strstr(file, "/compiler-rt/lib/") ||
-               internal_strstr(file, "/include/c++/")))
+               internal_strstr(file, "/include/c++/") ||
+               internal_strstr(file, "/include/g++")))
     return true;
   if (module && (internal_strstr(module, "libclang_rt.")))
     return true;
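The substring check in the patch above can be tried outside the sanitizer runtime. The following is a minimal sketch, not the compiler-rt code: `pathLooksInternal` and `checkGentooPath` are hypothetical names, `std::strstr` stands in for `internal_strstr`, and the module check from `FrameIsInternal` is left out.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the file-path part of FrameIsInternal.
// Assumption: std::strstr replaces the runtime's internal_strstr.
bool pathLooksInternal(const char *file) {
  if (!file)
    return false;
  return std::strstr(file, "/compiler-rt/lib/") != nullptr ||
         std::strstr(file, "/include/c++/") != nullptr ||
         // No trailing slash on purpose: Gentoo names the directory g++-v14,
         // g++-v13, and so on, so "/include/g++/" would never match.
         std::strstr(file, "/include/g++") != nullptr;
}

// Usage: the Gentoo libstdc++ path quoted in the commit message is matched.
inline void checkGentooPath() {
  assert(pathLooksInternal(
      "/usr/lib/gcc/x86_64-pc-linux-gnu/14/include/g++-v14/vector"));
}
```

A pattern with a trailing slash would miss the Gentoo directory, which is why the commit message calls the omission intentional.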
[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79841
[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)
llvmbot wrote:

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (llvmbot)

Changes

resolves llvm/llvm-project#79838

---
Full diff: https://github.com/llvm/llvm-project/pull/79841.diff

1 Files Affected:

- (modified) compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp (+3-1)

```diff
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
index 8438e019591b58a..f6b157c07c6557c 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_report.cpp
@@ -34,8 +34,10 @@ static bool FrameIsInternal(const SymbolizedStack *frame) {
     return true;
   const char *file = frame->info.file;
   const char *module = frame->info.module;
+  // On Gentoo, the path is g++-*, so there's *not* a missing /.
   if (file && (internal_strstr(file, "/compiler-rt/lib/") ||
-               internal_strstr(file, "/include/c++/")))
+               internal_strstr(file, "/include/c++/") ||
+               internal_strstr(file, "/include/g++")))
     return true;
   if (module && (internal_strstr(module, "libclang_rt.")))
     return true;
```

https://github.com/llvm/llvm-project/pull/79841
[llvm-branch-commits] [libcxx] [clang] PR for llvm/llvm-project#79762 (PR #79763)
https://github.com/ldionne approved this pull request. https://github.com/llvm/llvm-project/pull/79763
[llvm-branch-commits] [clang] [llvm] [clang-tools-extra] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)
https://github.com/bogner updated https://github.com/llvm/llvm-project/pull/78225
[llvm-branch-commits] [clang-tools-extra] [clang] [llvm] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)
https://github.com/bogner updated https://github.com/llvm/llvm-project/pull/78225
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79863

resolves llvm/llvm-project#79797

>From 7eea3aec978bdfb154868f846d52e4cba4cf246c Mon Sep 17 00:00:00 2001
From: Andrei Golubev
Date: Mon, 29 Jan 2024 10:37:11 +0200
Subject: [PATCH] [mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() (#79582)

Fold expressions on Clang are limited to 256 elements. This causes
compilation errors in cases when the amount of elements added exceeds this
limit. Side-step the issue by restoring the original trick that would use
the std::initializer_list. For the record, in our downstream Clang 16 gives:

mlir/include/mlir/IR/Dialect.h:269:23: fatal error: instantiating fold
expression with 688 arguments exceeded expression nesting limit of 256
    (addType<Args>(), ...);

Partially reverts 26d811b3ecd2fa1ca3d9b41e17fb42b8c7ad03d6.

Co-authored-by: Nikita Kudriavtsev
(cherry picked from commit e3a38a75ddc6ff00301ec19a0e2488d00f2cc297)
---
 mlir/include/mlir/IR/Dialect.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97c..50f6f6de5c2897a 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }
 
   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }
 
   /// Register an attribute instance with this dialect.
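The initializer_list trick described in the commit message can be reproduced in isolation. This is a sketch under assumptions rather than the MLIR code: `Registry`, `addOne`, `addAll`, `TypeA`, `TypeB` and `registerExample` are hypothetical names that stand in for `Dialect`, `addType`, `addTypes` and real type classes.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for mlir::Dialect: records one name per registered type.
class Registry {
public:
  template <typename... Args>
  void addAll() {
    // Expands to {0, (addOne<A0>(), 0), (addOne<A1>(), 0), ...}.
    // Elements of a braced list are evaluated left to right, so registration
    // follows pack order, and no nested comma expression is built.
    (void)std::initializer_list<int>{0, (addOne<Args>(), 0)...};
  }

  const std::vector<std::string> &names() const { return Names; }

private:
  template <typename T>
  void addOne() {
    Names.push_back(T::name());
  }

  std::vector<std::string> Names;
};

// Hypothetical type classes used only for this illustration.
struct TypeA { static const char *name() { return "A"; } };
struct TypeB { static const char *name() { return "B"; } };

// Usage: registers both types and returns the recorded names.
inline std::vector<std::string> registerExample() {
  Registry R;
  R.addAll<TypeA, TypeB>();
  assert(R.names().size() == 2 && "one entry per pack element");
  return R.names();
}
```

The leading `0` keeps the list non-empty when the pack is empty, so `addAll<>()` still compiles.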
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
llvmbot wrote: @zero9178 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
llvmbot wrote:

@llvm/pr-subscribers-mlir-core

Author: None (llvmbot)

Changes

resolves llvm/llvm-project#79797

---
Full diff: https://github.com/llvm/llvm-project/pull/79863.diff

1 Files Affected:

- (modified) mlir/include/mlir/IR/Dialect.h (+10-2)

```diff
diff --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97c..50f6f6de5c2897a 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }
 
   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }
 
   /// Register an attribute instance with this dialect.
```

https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
https://github.com/zero9178 approved this pull request. https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
https://github.com/zero9178 approved this pull request. (I think only release managers can merge to the release branch, not sure however) https://github.com/llvm/llvm-project/pull/79603
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
https://github.com/davemgreen approved this pull request. The perf regression was fairly significant, so it would be good to get this into the branch. Thanks. https://github.com/llvm/llvm-project/pull/79813
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,

arsenm wrote:

This avoids the live in in the later patch?

https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [clang] [llvm] [clang-tools-extra] [flang] [openmp] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)
https://github.com/bogner updated https://github.com/llvm/llvm-project/pull/78225
[llvm-branch-commits] [clang] [clang-tools-extra] [llvm] [openmp] [flang] [DirectX] Move DXIL ResourceKind and ElementType to DXILABI.h. NFC (PR #78225)
https://github.com/bogner updated https://github.com/llvm/llvm-project/pull/78225
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/jayfoad updated https://github.com/llvm/llvm-project/pull/79839 >From c265c8527285075a58b2425198dbd4cca8b69477 Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Thu, 25 Jan 2024 07:48:06 + Subject: [PATCH 1/2] [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) This is only valid on targets with architected SGPRs. --- llvm/include/llvm/IR/IntrinsicsAMDGPU.td | 4 ++ .../lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp | 19 ++ llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h | 1 + llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 14 + llvm/lib/Target/AMDGPU/SIISelLowering.h | 1 + .../CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll | 61 +++ 6 files changed, 100 insertions(+) create mode 100644 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 9eb1ac8e27befb..c5f43d17d1c148 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -2777,6 +2777,10 @@ class AMDGPULoadTr: def int_amdgcn_global_load_tr : AMDGPULoadTr; +// i32 @llvm.amdgcn.wave.id() +def int_amdgcn_wave_id : + DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>; + //===--===// // Deep learning intrinsics. //===--===// diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp index 615685822f91ee..e98ede88a7e2db 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp @@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI, return true; } +bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI, + MachineIRBuilder &B) const { + // With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. 
+ if (!ST.hasArchitectedSGPRs()) +return false; + LLT S32 = LLT::scalar(32); + Register DstReg = MI.getOperand(0).getReg(); + Register TTMP8 = + getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8, + AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32); + auto LSB = B.buildConstant(S32, 25); + auto Width = B.buildConstant(S32, 5); + B.buildUbfx(DstReg, TTMP8, LSB, Width); + MI.eraseFromParent(); + return true; +} + bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, MachineInstr &MI) const { MachineIRBuilder &B = Helper.MIRBuilder; @@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, case Intrinsic::amdgcn_workgroup_id_z: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::WORKGROUP_ID_Z); + case Intrinsic::amdgcn_wave_id: +return legalizeWaveID(MI, B); case Intrinsic::amdgcn_lds_kernel_id: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::LDS_KERNEL_ID); diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h index 56aabd4f6ab71b..ecbe42681c6690 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h @@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo { bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const; + bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeImageIntrinsic( MachineInstr &MI, MachineIRBuilder &B, diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index d60f511302613e..c5ad9da88ec2b3 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc, return Loads[0]; } +SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const { + 
// With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. + if (!Subtarget->hasArchitectedSGPRs()) +return {}; + SDLoc SL(Op); + MVT VT = MVT::i32; + SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass, + AMDGPU::TTMP8, VT, SL); + return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8, + DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT)); +} + SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op, unsigned Dim, const ArgDescriptor &Arg) const { @@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op, case Intrinsic::
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
@@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI,
   return true;
 }
 
+bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
+                                         MachineIRBuilder &B) const {
+  // With architected SGPRs, waveIDinGroup is in TTMP8[29:25].
+  if (!ST.hasArchitectedSGPRs())
+    return false;
+  LLT S32 = LLT::scalar(32);
+  Register DstReg = MI.getOperand(0).getReg();
+  Register TTMP8 =
+      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,

jayfoad wrote:

True, 66c710ec9dcdbdec6cadd89b972d8945983dc92f improved this to avoid adding liveins. I wasn't going to bother backporting that since I didn't think it was required for correctness. But I have cherry-picked it into this PR now.

https://github.com/llvm/llvm-project/pull/79839
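Both lowerings in this backport extract an unsigned bitfield from TTMP8 with offset 25 and width 5, matching the comment that waveIDinGroup lives in TTMP8[29:25]. The arithmetic can be sketched in plain C++ as follows; `extractBits` and `waveIDInGroup` are hypothetical helper names, and the code models only the bit manipulation, not register access or the LLVM builders.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned bitfield extract: returns Width bits of Value starting at bit LSB.
// Assumption: the field lies fully inside the 32-bit word.
uint32_t extractBits(uint32_t Value, unsigned LSB, unsigned Width) {
  assert(Width >= 1 && LSB < 32 && Width <= 32 - LSB &&
         "field must fit in 32 bits");
  // Avoid shifting by 32, which is undefined for a 32-bit operand.
  const uint32_t Mask = Width == 32 ? ~0u : (1u << Width) - 1u;
  return (Value >> LSB) & Mask;
}

// Hypothetical helper using the layout named in the patch comment:
// bits 29 down to 25 of TTMP8, that is LSB = 25 and Width = 5.
uint32_t waveIDInGroup(uint32_t TTMP8) {
  return extractBits(TTMP8, /*LSB=*/25, /*Width=*/5);
}
```

With a 5-bit field the result is always in the range 0 to 31, regardless of what the remaining TTMP8 bits hold.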
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79614 (PR #79870)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/79870 resolves llvm/llvm-project#79614 >From 4b861c2643e93050618488e58141b5802c1f4e35 Mon Sep 17 00:00:00 2001 From: Alexandros Lamprineas Date: Mon, 29 Jan 2024 16:37:09 + Subject: [PATCH] [AArch64][TargetParser] Add mcpu alias for Microsoft Azure Cobalt 100. (#79614) With a690e86 we added -mcpu/mtune=native support to handle the Microsoft Azure Cobalt 100 CPU as a Neoverse N2. This patch adds a CPU alias in TargetParser to maintain compatibility with GCC. (cherry picked from commit ae8005ffb6cd18900de8ed5a86f60a4a16975471) --- clang/test/Driver/aarch64-mcpu.c | 3 +++ clang/test/Misc/target-invalid-cpu-note.c| 4 ++-- llvm/include/llvm/TargetParser/AArch64TargetParser.h | 3 ++- llvm/unittests/TargetParser/TargetParserTest.cpp | 2 +- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c index 511482a420da268..3e07f3597f34081 100644 --- a/clang/test/Driver/aarch64-mcpu.c +++ b/clang/test/Driver/aarch64-mcpu.c @@ -72,6 +72,9 @@ // RUN: %clang --target=aarch64 -mcpu=cortex-r82 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82" +// RUN: %clang --target=aarch64 -mcpu=cobalt-100 -### -c %s 2>&1 | FileCheck -check-prefix=COBALT-100 %s +// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2" + // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s // GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2" diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c index 84aed5c9c36fe47..2f10bfb1fd82fe3 100644 --- a/clang/test/Misc/target-invalid-cpu-note.c +++ b/clang/test/Misc/target-invalid-cpu-note.c @@ -5,11 +5,11 @@ // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s 
--check-prefix AARCH64 // AARCH64: error: unknown target CPU 'not-a-cpu' -// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}} // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu' -// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, 
cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae,
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/79870
[llvm-branch-commits] [clang] [llvm] PR for llvm/llvm-project#79614 (PR #79870)
llvmbot wrote: @davemgreen What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/79870
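The driver test added in PR #79870 expects `-mcpu=cobalt-100` to reach cc1 as `neoverse-n2`, in the same way `grace` resolves to `neoverse-v2`. The idea of a CPU alias table can be sketched as below. This is an illustration only and does not reproduce `AArch64TargetParser.h`: `CpuAliasEntry`, `kCpuAliases`, `resolveCpu` and `checkCobaltAlias` are hypothetical names, and C++17 is assumed for `std::string_view`.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical alias record: a user-facing name and the CPU it resolves to.
struct CpuAliasEntry {
  std::string_view Alias;
  std::string_view Name;
};

// The two pairs below are the ones visible in the quoted driver test.
constexpr CpuAliasEntry kCpuAliases[] = {
    {"cobalt-100", "neoverse-n2"},
    {"grace", "neoverse-v2"},
};

// Returns the canonical CPU name for an alias, or the input unchanged when
// no alias matches.
constexpr std::string_view resolveCpu(std::string_view Name) {
  for (const CpuAliasEntry &E : kCpuAliases)
    if (E.Alias == Name)
      return E.Name;
  return Name;
}

// Usage: the alias resolves, a canonical name passes through untouched.
inline void checkCobaltAlias() {
  assert(resolveCpu("cobalt-100") == "neoverse-n2");
  assert(resolveCpu("cortex-r82") == "cortex-r82");
}
```

Keeping aliases in a separate table means the alias appears in the list of accepted names while code generation keeps seeing a single canonical CPU.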
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)
llvmbot wrote: @llvm/pr-subscribers-clang Author: None (llvmbot) Changes resolves llvm/llvm-project#79614 --- Full diff: https://github.com/llvm/llvm-project/pull/79870.diff 4 Files Affected: - (modified) clang/test/Driver/aarch64-mcpu.c (+3) - (modified) clang/test/Misc/target-invalid-cpu-note.c (+2-2) - (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+2-1) - (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1) ``diff diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c index 511482a420da268..3e07f3597f34081 100644 --- a/clang/test/Driver/aarch64-mcpu.c +++ b/clang/test/Driver/aarch64-mcpu.c @@ -72,6 +72,9 @@ // RUN: %clang --target=aarch64 -mcpu=cortex-r82 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82" +// RUN: %clang --target=aarch64 -mcpu=cobalt-100 -### -c %s 2>&1 | FileCheck -check-prefix=COBALT-100 %s +// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2" + // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s // GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2" diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c index 84aed5c9c36fe47..2f10bfb1fd82fe3 100644 --- a/clang/test/Misc/target-invalid-cpu-note.c +++ b/clang/test/Misc/target-invalid-cpu-note.c @@ -5,11 +5,11 @@ // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64 // AARCH64: error: unknown target CPU 'not-a-cpu' -// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, 
cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}} // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu' -// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, 
apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, a
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)
llvmbot wrote: @llvm/pr-subscribers-clang-driver Author: None (llvmbot) Changes resolves llvm/llvm-project#79614 --- Full diff: https://github.com/llvm/llvm-project/pull/79870.diff 4 Files Affected: - (modified) clang/test/Driver/aarch64-mcpu.c (+3) - (modified) clang/test/Misc/target-invalid-cpu-note.c (+2-2) - (modified) llvm/include/llvm/TargetParser/AArch64TargetParser.h (+2-1) - (modified) llvm/unittests/TargetParser/TargetParserTest.cpp (+1-1) ``diff diff --git a/clang/test/Driver/aarch64-mcpu.c b/clang/test/Driver/aarch64-mcpu.c index 511482a420da268..3e07f3597f34081 100644 --- a/clang/test/Driver/aarch64-mcpu.c +++ b/clang/test/Driver/aarch64-mcpu.c @@ -72,6 +72,9 @@ // RUN: %clang --target=aarch64 -mcpu=cortex-r82 -### -c %s 2>&1 | FileCheck -check-prefix=CORTEXR82 %s // CORTEXR82: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "cortex-r82" +// RUN: %clang --target=aarch64 -mcpu=cobalt-100 -### -c %s 2>&1 | FileCheck -check-prefix=COBALT-100 %s +// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2" + // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s // GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2" diff --git a/clang/test/Misc/target-invalid-cpu-note.c b/clang/test/Misc/target-invalid-cpu-note.c index 84aed5c9c36fe47..2f10bfb1fd82fe3 100644 --- a/clang/test/Misc/target-invalid-cpu-note.c +++ b/clang/test/Misc/target-invalid-cpu-note.c @@ -5,11 +5,11 @@ // RUN: not %clang_cc1 -triple arm64--- -target-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix AARCH64 // AARCH64: error: unknown target CPU 'not-a-cpu' -// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, 
cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, cobalt-100, grace{{$}} // RUN: not %clang_cc1 -triple arm64--- -tune-cpu not-a-cpu -fsyntax-only %s 2>&1 | FileCheck %s --check-prefix TUNE_AARCH64 // TUNE_AARCH64: error: unknown target CPU 'not-a-cpu' -// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, 
apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, apple-m2, apple-m3, apple-s4, apple-s5, exynos-m3, exynos-m4, exynos-m5, falkor, saphira, kryo, thunderx2t99, thunderx3t110, thunderx, thunderxt88, thunderxt81, thunderxt83, tsv110, a64fx, carmel, ampere1, ampere1a, grace{{$}} +// TUNE_AARCH64-NEXT: note: valid target CPU values are: cortex-a34, cortex-a35, cortex-a53, cortex-a55, cortex-a510, cortex-a520, cortex-a57, cortex-a65, cortex-a65ae, cortex-a72, cortex-a73, cortex-a75, cortex-a76, cortex-a76ae, cortex-a77, cortex-a78, cortex-a78c, cortex-a710, cortex-a715, cortex-a720, cortex-r82, cortex-x1, cortex-x1c, cortex-x2, cortex-x3, cortex-x4, neoverse-e1, neoverse-n1, neoverse-n2, neoverse-512tvb, neoverse-v1, neoverse-v2, cyclone, apple-a7, apple-a8, apple-a9, apple-a10, apple-a11, apple-a12, apple-a13, apple-a14, apple-a15, apple-a16, apple-a17, apple-m1, appl
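The driver tests in this patch expect `-mcpu=cobalt-100` to be passed to cc1 as `-target-cpu neoverse-n2`, just as `-mcpu=grace` becomes `neoverse-v2`. A minimal sketch of that alias behaviour follows; `resolveCpuAlias` is a hypothetical helper written for illustration and is not the data structure used in `AArch64TargetParser.h`.

```cpp
#include <string>

// Hypothetical helper (not LLVM's API): maps a CPU alias accepted by -mcpu
// to the canonical CPU name that the driver tests above expect in -target-cpu.
inline std::string resolveCpuAlias(const std::string &name) {
  if (name == "cobalt-100")
    return "neoverse-n2"; // alias exercised by the new COBALT-100 check
  if (name == "grace")
    return "neoverse-v2"; // existing alias covered by the GRACE check
  return name;            // any other name is passed through unchanged
}
```

Both alias names still appear in the "valid target CPU values" note, which is why `target-invalid-cpu-note.c` gains `cobalt-100` in its expected list.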
[llvm-branch-commits] [llvm] [clang] PR for llvm/llvm-project#79614 (PR #79870)
https://github.com/davemgreen approved this pull request. Sounds simple enough to me. LGTM https://github.com/llvm/llvm-project/pull/79870
[llvm-branch-commits] [llvm] [workflows] Fix argument passing in abi-dump jobs (#79658) (PR #79836)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79836
[llvm-branch-commits] [llvm] 2c32141 - [llvm] [cmake] Include httplib in LLVMConfig.cmake (#79305)
Author: Michał Górny Date: 2024-01-29T10:25:40-08:00 New Revision: 2c3214135ffa8e3f9ab61d12521637532126f368 URL: https://github.com/llvm/llvm-project/commit/2c3214135ffa8e3f9ab61d12521637532126f368 DIFF: https://github.com/llvm/llvm-project/commit/2c3214135ffa8e3f9ab61d12521637532126f368.diff LOG: [llvm] [cmake] Include httplib in LLVMConfig.cmake (#79305) Include LLVM_ENABLE_HTTPLIB along with httplib package finding in LLVMConfig.cmake, as this dependency is needed by LLVMDebuginfod that is now used by LLDB. Without it, building LLDB standalone fails with: ``` CMake Error at /usr/lib/llvm/19/lib64/cmake/llvm/LLVMExports.cmake:90 (set_target_properties): The link interface of target "LLVMDebuginfod" contains: httplib::httplib but the target was not found. Possible reasons include: * There is a typo in the target name. * A find_package call is missing for an IMPORTED target. * An ALIAS target is missing. Call Stack (most recent call first): /usr/lib/llvm/19/lib64/cmake/llvm/LLVMConfig.cmake:357 (include) cmake/modules/LLDBStandalone.cmake:9 (find_package) CMakeLists.txt:34 (include) ``` (cherry picked from commit 3c9f34c12450345c6eb524e47cf79664271e4260) Added: Modified: llvm/cmake/modules/LLVMConfig.cmake.in Removed: diff --git a/llvm/cmake/modules/LLVMConfig.cmake.in b/llvm/cmake/modules/LLVMConfig.cmake.in index 74e1c6bf52e2305..770a9caea322e6a 100644 --- a/llvm/cmake/modules/LLVMConfig.cmake.in +++ b/llvm/cmake/modules/LLVMConfig.cmake.in @@ -90,6 +90,11 @@ if(LLVM_ENABLE_CURL) find_package(CURL) endif() +set(LLVM_ENABLE_HTTPLIB @LLVM_ENABLE_HTTPLIB@) +if(LLVM_ENABLE_HTTPLIB) + find_package(httplib) +endif() + set(LLVM_WITH_Z3 @LLVM_WITH_Z3@) set(LLVM_ENABLE_DIA_SDK @LLVM_ENABLE_DIA_SDK@) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79547 (PR #79548)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79548
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79547 (PR #79548)
tstellar wrote: Merged: 2c3214135ffa8e3f9ab61d12521637532126f368 https://github.com/llvm/llvm-project/pull/79548
[llvm-branch-commits] [mlir] 3df71e5 - [mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562)
Author: Andrei Golubev Date: 2024-01-29T10:29:59-08:00 New Revision: 3df71e5a3f5d5fb9436c53c298e5426f729288e2 URL: https://github.com/llvm/llvm-project/commit/3df71e5a3f5d5fb9436c53c298e5426f729288e2 DIFF: https://github.com/llvm/llvm-project/commit/3df71e5a3f5d5fb9436c53c298e5426f729288e2.diff LOG: [mlir][LLVM] Use int32_t to indirectly construct GEPArg (#79562) GEPArg can only be constructed from int32_t and mlir::Value. Explicitly cast other types (e.g. unsigned, size_t) to int32_t to avoid narrowing conversion warnings on MSVC. Some recent examples of such are: ``` mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'size_t' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] mlir\lib\Dialect\LLVMIR\Transforms\TypeConsistency.cpp: error C2398: Element '1': conversion from 'unsigned int' to 'T' requires a narrowing conversion with [ T=mlir::LLVM::GEPArg ] ``` Co-authored-by: Nikita Kudriavtsev (cherry picked from commit 89cd345667a5f8f4c37c621fd8abe8d84e85c050) Added: Modified: mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp Removed: diff --git a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp index ae2bd8e5b5405d..73d418cb841327 100644 --- a/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp +++ b/mlir/lib/Conversion/GPUCommon/GPUOpsLowering.cpp @@ -529,7 +529,8 @@ LogicalResult GPUPrintfOpToVPrintfLowering::matchAndRewrite( /*alignment=*/0); for (auto [index, arg] : llvm::enumerate(args)) { Value ptr = rewriter.create( -loc, ptrType, structType, tempAlloc, ArrayRef{0, index}); +loc, ptrType, structType, tempAlloc, +ArrayRef{0, static_cast(index)}); rewriter.create(loc, arg, ptr); } std::array printfArgs = {stringStart, tempAlloc}; diff --git a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp 
b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp index f853d5c47b623c..78d4e806246872 100644 --- a/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp +++ b/mlir/lib/Conversion/GPUCommon/GPUToLLVMConversion.cpp @@ -1041,13 +1041,14 @@ Value ConvertLaunchFuncOpToGpuRuntimeCallPattern::generateParamsArray( auto arrayPtr = builder.create( loc, llvmPointerType, llvmPointerType, arraySize, /*alignment=*/0); for (const auto &en : llvm::enumerate(arguments)) { +const auto index = static_cast(en.index()); Value fieldPtr = builder.create(loc, llvmPointerType, structType, structPtr, -ArrayRef{0, en.index()}); +ArrayRef{0, index}); builder.create(loc, en.value(), fieldPtr); -auto elementPtr = builder.create( -loc, llvmPointerType, llvmPointerType, arrayPtr, -ArrayRef{en.index()}); +auto elementPtr = +builder.create(loc, llvmPointerType, llvmPointerType, +arrayPtr, ArrayRef{index}); builder.create(loc, fieldPtr, elementPtr); } return arrayPtr; diff --git a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp index 72f9295749a66b..b25c831bc7172a 100644 --- a/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp +++ b/mlir/lib/Dialect/LLVMIR/Transforms/TypeConsistency.cpp @@ -488,7 +488,8 @@ static void splitVectorStore(const DataLayout &dataLayout, Location loc, // Other patterns will turn this into a type-consistent GEP. auto gepOp = rewriter.create( loc, address.getType(), rewriter.getI8Type(), address, -ArrayRef{storeOffset + index * elementSize}); +ArrayRef{ +static_cast(storeOffset + index * elementSize)}); rewriter.create(loc, extractOp, gepOp); } @@ -524,9 +525,9 @@ static void splitIntegerStore(const DataLayout &dataLayout, Location loc, // We create an `i8` indexed GEP here as that is the easiest (offset is // already known). Other patterns turn this into a type-consistent GEP. 
-auto gepOp = -rewriter.create(loc, address.getType(), rewriter.getI8Type(), - address, ArrayRef{currentOffset}); +auto gepOp = rewriter.create( +loc, address.getType(), rewriter.getI8Type(), address, +ArrayRef{static_cast(currentOffset)}); rewriter.create(loc, valueToStore, gepOp); // No need to care about padding here since we already checked previously
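The narrowing problem described in the commit message can be reproduced without MLIR. The sketch below assumes a hypothetical `IndexArg` type standing in for `mlir::LLVM::GEPArg` (constructible from `int32_t` only) and a hypothetical helper `makeStructFieldIndices`. The archived diff has lost its template arguments, so the cast target `int32_t` is taken from the commit title rather than from the diff text.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for mlir::LLVM::GEPArg: the only integer type it
// can be constructed from is int32_t.
struct IndexArg {
  std::int32_t value;
  constexpr IndexArg(std::int32_t v) : value(v) {}
};

// Builds the index pair {0, index} the way the patched code does.
// Writing IndexArg{index} with a size_t would need an implicit
// size_t -> int32_t conversion inside braces, which is a narrowing
// conversion (MSVC reports it as C2398 in the log quoted above).
// The explicit cast makes the truncation intentional and silences it.
inline std::array<IndexArg, 2> makeStructFieldIndices(std::size_t index) {
  return {{IndexArg{0}, IndexArg{static_cast<std::int32_t>(index)}}};
}
```

The cast is only safe when the index is known to fit in 32 bits, which is a reasonable assumption for struct field and argument positions like the ones in this patch.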
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79603
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79600 (PR #79603)
tstellar wrote: Merged: 3df71e5a3f5d5fb9436c53c298e5426f729288e2 https://github.com/llvm/llvm-project/pull/79603
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)
https://github.com/topperc created https://github.com/llvm/llvm-project/pull/79907 Resolves https://github.com/llvm/llvm-project/issues/79479. >From 8fb154776db1627da75e6d67cf468d5b55868e93 Mon Sep 17 00:00:00 2001 From: Craig Topper Date: Thu, 25 Jan 2024 09:14:52 -0800 Subject: [PATCH 1/2] [RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551) This adopts a similar behavior to AArch64 SVE, where bool vectors are represented as a vector of chars with 1/8 the number of elements. This ensures the vector always occupies a power of 2 number of bytes. A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can only be used with a vector length that guarantees at least 8 bits. --- clang/docs/ReleaseNotes.rst | 2 + clang/include/clang/AST/Type.h| 3 + clang/include/clang/Basic/AttrDocs.td | 5 +- clang/lib/AST/ASTContext.cpp | 20 +- clang/lib/AST/ItaniumMangle.cpp | 25 +- clang/lib/AST/JSONNodeDumper.cpp | 3 + clang/lib/AST/TextNodeDumper.cpp | 3 + clang/lib/AST/Type.cpp| 15 +- clang/lib/AST/TypePrinter.cpp | 2 + clang/lib/CodeGen/Targets/RISCV.cpp | 21 +- clang/lib/Sema/SemaExpr.cpp | 6 +- clang/lib/Sema/SemaType.cpp | 21 +- .../attr-riscv-rvv-vector-bits-bitcast.c | 100 ++ .../CodeGen/attr-riscv-rvv-vector-bits-call.c | 74 + .../CodeGen/attr-riscv-rvv-vector-bits-cast.c | 76 - .../attr-riscv-rvv-vector-bits-codegen.c | 172 +++ .../attr-riscv-rvv-vector-bits-globals.c | 107 +++ .../attr-riscv-rvv-vector-bits-types.c| 284 ++ .../riscv-mangle-rvv-fixed-vectors.cpp| 72 + clang/test/Sema/attr-riscv-rvv-vector-bits.c | 88 +- 20 files changed, 1065 insertions(+), 34 deletions(-) diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 060bc7669b72a5..45d1ab34d0f931 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1227,6 +1227,8 @@ RISC-V Support - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f for RV64. 
+- ``__attribute__((rvv_vector_bits(N))) is now supported for RVV vbool*_t types. + CUDA/HIP Language Changes ^ diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index ea425791fc97f0..6384cf9420b82e 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -3495,6 +3495,9 @@ enum class VectorKind { /// is RISC-V RVV fixed-length data vector RVVFixedLengthData, + + /// is RISC-V RVV fixed-length mask vector + RVVFixedLengthMask, }; /// Represents a GCC generic vector type. This type is created using diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 7e633f8e2635a9..e02a1201e2ad79 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536. For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the LMUL of the type before passing to the attribute. -``vbool*_t`` types are not supported at this time. +For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the +number from the type name. For example, ``vbool8_t`` needs to use +``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8, +the type is not supported for that value of ``__riscv_v_fixed_vlen``. }]; } diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 5eb7aa3664569d..ab16ca10395fa8 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate) // Adjust the alignment for fixed-length SVE predicates. Align = 16; -else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData) +else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData || + VT->getVectorKind() == VectorKind::RVVFixedLengthMask) // Adjust the alignment for fixed-length RVV vectors. 
Align = std::min(64, Width); break; @@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType FirstVec, Second->getVectorKind() != VectorKind::SveFixedLengthData && Second->getVectorKind() != VectorKind::SveFixedLengthPredicate && First->getVectorKind() != VectorKind::RVVFixedLengthData && - Second->getVectorKind() != VectorKind::RVVFixedLengthData) + Second->getVectorKind() != VectorKind::RVVFixedLengthData && + First->getVectorKind() != VectorKind::RVVFixedLengthMask && + Second->getVectorKind() != VectorKind::RVVFixedLengthMask) return true; return false; @@ -9522,8 +9525,11 @@ static uint64_t
[llvm-branch-commits] [clang] PR for llvm/llvm-project#79479 (PR #79907)
llvmbot wrote: @llvm/pr-subscribers-clang Author: Craig Topper (topperc) Changes Resolves https://github.com/llvm/llvm-project/issues/79479. --- Patch is 92.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/79907.diff 20 Files Affected: - (modified) clang/docs/ReleaseNotes.rst (+2) - (modified) clang/include/clang/AST/Type.h (+3) - (modified) clang/include/clang/Basic/AttrDocs.td (+4-1) - (modified) clang/lib/AST/ASTContext.cpp (+16-4) - (modified) clang/lib/AST/ItaniumMangle.cpp (+17-8) - (modified) clang/lib/AST/JSONNodeDumper.cpp (+3) - (modified) clang/lib/AST/TextNodeDumper.cpp (+3) - (modified) clang/lib/AST/Type.cpp (+14-1) - (modified) clang/lib/AST/TypePrinter.cpp (+2) - (modified) clang/lib/CodeGen/Targets/RISCV.cpp (+15-6) - (modified) clang/lib/Sema/SemaExpr.cpp (+4-2) - (modified) clang/lib/Sema/SemaType.cpp (+15-6) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-bitcast.c (+100) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-call.c (+74) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-cast.c (+72-4) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-codegen.c (+172) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-globals.c (+107) - (modified) clang/test/CodeGen/attr-riscv-rvv-vector-bits-types.c (+284) - (modified) clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp (+72) - (modified) clang/test/Sema/attr-riscv-rvv-vector-bits.c (+86-2) ``diff diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index 060bc7669b72a5..2f4fe8bf7556e7 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -1227,6 +1227,8 @@ RISC-V Support - Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f for RV64. +- ``__attribute__((rvv_vector_bits(N)))`` is now supported for RVV vbool*_t types. 
+ CUDA/HIP Language Changes ^ diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h index ea425791fc97f0..6384cf9420b82e 100644 --- a/clang/include/clang/AST/Type.h +++ b/clang/include/clang/AST/Type.h @@ -3495,6 +3495,9 @@ enum class VectorKind { /// is RISC-V RVV fixed-length data vector RVVFixedLengthData, + + /// is RISC-V RVV fixed-length mask vector + RVVFixedLengthMask, }; /// Represents a GCC generic vector type. This type is created using diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td index 7e633f8e2635a9..e02a1201e2ad79 100644 --- a/clang/include/clang/Basic/AttrDocs.td +++ b/clang/include/clang/Basic/AttrDocs.td @@ -2424,7 +2424,10 @@ only be a power of 2 between 64 and 65536. For types where LMUL!=1, ``__riscv_v_fixed_vlen`` needs to be scaled by the LMUL of the type before passing to the attribute. -``vbool*_t`` types are not supported at this time. +For ``vbool*_t`` types, ``__riscv_v_fixed_vlen`` needs to be divided by the +number from the type name. For example, ``vbool8_t`` needs to use +``__riscv_v_fixed_vlen`` / 8. If the resulting value is not a multiple of 8, +the type is not supported for that value of ``__riscv_v_fixed_vlen``. }]; } diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp index 5eb7aa3664569d..ab16ca10395fa8 100644 --- a/clang/lib/AST/ASTContext.cpp +++ b/clang/lib/AST/ASTContext.cpp @@ -1945,7 +1945,8 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { else if (VT->getVectorKind() == VectorKind::SveFixedLengthPredicate) // Adjust the alignment for fixed-length SVE predicates. Align = 16; -else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData) +else if (VT->getVectorKind() == VectorKind::RVVFixedLengthData || + VT->getVectorKind() == VectorKind::RVVFixedLengthMask) // Adjust the alignment for fixed-length RVV vectors. 
Align = std::min(64, Width); break; @@ -9416,7 +9417,9 @@ bool ASTContext::areCompatibleVectorTypes(QualType FirstVec, Second->getVectorKind() != VectorKind::SveFixedLengthData && Second->getVectorKind() != VectorKind::SveFixedLengthPredicate && First->getVectorKind() != VectorKind::RVVFixedLengthData && - Second->getVectorKind() != VectorKind::RVVFixedLengthData) + Second->getVectorKind() != VectorKind::RVVFixedLengthData && + First->getVectorKind() != VectorKind::RVVFixedLengthMask && + Second->getVectorKind() != VectorKind::RVVFixedLengthMask) return true; return false; @@ -9522,8 +9525,11 @@ static uint64_t getRVVTypeSize(ASTContext &Context, const BuiltinType *Ty) { ASTContext::BuiltinVectorTypeInfo Info = Context.getBuiltinVectorTypeInfo(Ty); - uint64_t EltSize = Context.getTypeSize(Info.ElementType); - uint64_t MinElts = Info.EC.getKnownMinValue(); + unsigned EltSize = Context.getTypeSize(Info.ElementType); + if (Info.ElementType == Context.BoolTy) +EltSize = 1; + + unsigned Min
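The `AttrDocs.td` text in this patch gives a simple rule for `vbool*_t` types: divide `__riscv_v_fixed_vlen` by the number in the type name, and reject the type if the result is not a multiple of 8. A minimal sketch of that arithmetic follows, assuming a hypothetical helper `vboolVectorBits` that returns 0 for the unsupported case; it illustrates the documented rule and is not the check Clang performs in `SemaType.cpp`.

```cpp
#include <cstdint>

// Hypothetical helper: returns the N to pass to
// __attribute__((riscv_rvv_vector_bits(N))) for vbool<k>_t when
// __riscv_v_fixed_vlen == fixedVlen, or 0 when the documented rule says
// the type is unsupported for that vector length.
inline std::uint32_t vboolVectorBits(std::uint32_t fixedVlen, std::uint32_t k) {
  if (k == 0)
    return 0;                       // not a valid vbool type suffix
  std::uint32_t bits = fixedVlen / k; // e.g. vbool8_t uses VLEN / 8
  if (bits % 8 != 0)
    return 0;                       // mask would not fill whole bytes
  return bits;
}
```

For example, with a vector length of 128, `vbool8_t` gets 16 bits, while `vbool32_t` would get 4 bits and is therefore rejected. This matches the commit message's remark that `vbool64_t`, `vbool32_t`, and `vbool16_t` need a vector length that guarantees at least 8 bits.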
[llvm-branch-commits] [clang] b73cd5e - Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)"
Author: Alexander Kornienko Date: 2024-01-29T14:57:23-08:00 New Revision: b73cd5ec714740283841e0fc1f3ebebe65dd329a URL: https://github.com/llvm/llvm-project/commit/b73cd5ec714740283841e0fc1f3ebebe65dd329a DIFF: https://github.com/llvm/llvm-project/commit/b73cd5ec714740283841e0fc1f3ebebe65dd329a.diff LOG: Revert "[SemaCXX] Implement CWG2137 (list-initialization from objects of the same type) (#77768)" This reverts commit 924701311aa79180e86ad8ce43d253f27d25ec7d. Causes compilation errors on valid code, see https://github.com/llvm/llvm-project/pull/77768#issuecomment-1908062472. (cherry picked from commit 6e4930c67508a90bdfd756f6e45417b5253cd741) Added: Modified: clang/lib/Sema/SemaInit.cpp clang/lib/Sema/SemaOverload.cpp clang/test/CXX/drs/dr14xx.cpp clang/test/CXX/drs/dr21xx.cpp clang/test/CXX/drs/dr23xx.cpp clang/www/cxx_dr_status.html libcxx/test/std/utilities/utility/pairs/pairs.pair/ctor.pair_U_V_move.pass.cpp Removed: diff --git a/clang/lib/Sema/SemaInit.cpp b/clang/lib/Sema/SemaInit.cpp index 91e4cb7b68a24a..457fa377355a97 100644 --- a/clang/lib/Sema/SemaInit.cpp +++ b/clang/lib/Sema/SemaInit.cpp @@ -4200,7 +4200,7 @@ static OverloadingResult ResolveConstructorOverload( /// \param IsListInit Is this list-initialization? /// \param IsInitListCopy Is this non-list-initialization resulting from a /// list-initialization from {x} where x is the same -/// aggregate type as the entity? +/// type as the entity? static void TryConstructorInitialization(Sema &S, const InitializedEntity &Entity, const InitializationKind &Kind, @@ -4230,14 +4230,6 @@ static void TryConstructorInitialization(Sema &S, Entity.getKind() != InitializedEntity::EK_LambdaToBlockConversionBlockElement); - bool CopyElisionPossible = false; - auto ElideConstructor = [&] { -// Convert qualifications if necessary. 
-Sequence.AddQualificationConversionStep(DestType, VK_PRValue); -if (ILE) - Sequence.RewrapReferenceInitList(DestType, ILE); - }; - // C++17 [dcl.init]p17: // - If the initializer expression is a prvalue and the cv-unqualified // version of the source type is the same class as the class of the @@ -4250,17 +4242,11 @@ static void TryConstructorInitialization(Sema &S, if (S.getLangOpts().CPlusPlus17 && !RequireActualConstructor && UnwrappedArgs.size() == 1 && UnwrappedArgs[0]->isPRValue() && S.Context.hasSameUnqualifiedType(UnwrappedArgs[0]->getType(), DestType)) { -if (ILE && !DestType->isAggregateType()) { - // CWG2311: T{ prvalue_of_type_T } is not eligible for copy elision - // Make this an elision if this won't call an initializer-list - // constructor. (Always on an aggregate type or check constructors first.) - assert(!IsInitListCopy && - "IsInitListCopy only possible with aggregate types"); - CopyElisionPossible = true; -} else { - ElideConstructor(); - return; -} +// Convert qualifications if necessary. 
+Sequence.AddQualificationConversionStep(DestType, VK_PRValue); +if (ILE) + Sequence.RewrapReferenceInitList(DestType, ILE); +return; } const RecordType *DestRecordType = DestType->getAs(); @@ -4305,12 +4291,6 @@ static void TryConstructorInitialization(Sema &S, S, Kind.getLocation(), Args, CandidateSet, DestType, Ctors, Best, CopyInitialization, AllowExplicit, /*OnlyListConstructors=*/true, IsListInit, RequireActualConstructor); - -if (CopyElisionPossible && Result == OR_No_Viable_Function) { - // No initializer list candidate - ElideConstructor(); - return; -} } // C++11 [over.match.list]p1: @@ -4592,9 +4572,9 @@ static void TryListInitialization(Sema &S, return; } - // C++11 [dcl.init.list]p3, per DR1467 and DR2137: - // - If T is an aggregate class and the initializer list has a single element - // of type cv U, where U is T or a class derived from T, the object is + // C++11 [dcl.init.list]p3, per DR1467: + // - If T is a class type and the initializer list has a single element of + // type cv U, where U is T or a class derived from T, the object is // initialized from that element (by copy-initialization for // copy-list-initialization, or by direct-initialization for // direct-list-initialization). @@ -4605,7 +4585,7 @@ static void TryListInitialization(Sema &S, // - Otherwise, if T is an aggregate, [...] (continue below). if (S.getLangOpts().CPlusPlus11 && InitList->getNumInits() == 1 && !IsDesignatedInit) { -if (DestType->isRecordType() && DestType->isAggregateType()) { +if (DestType->isRecordType()) { QualType InitType = InitList->get
[llvm-branch-commits] [libcxx] [clang] PR for llvm/llvm-project#79762 (PR #79763)
tstellar wrote: Merged: b73cd5ec714740283841e0fc1f3ebebe65dd329a https://github.com/llvm/llvm-project/pull/79763
[llvm-branch-commits] [clang] [libcxx] PR for llvm/llvm-project#79762 (PR #79763)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79763
[llvm-branch-commits] [llvm] 4c8cf4a - [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325)
Author: Jay Foad Date: 2024-01-29T15:00:47-08:00 New Revision: 4c8cf4a1c29da834f1999a1c56c7e637c6886825 URL: https://github.com/llvm/llvm-project/commit/4c8cf4a1c29da834f1999a1c56c7e637c6886825 DIFF: https://github.com/llvm/llvm-project/commit/4c8cf4a1c29da834f1999a1c56c7e637c6886825.diff LOG: [AMDGPU] New llvm.amdgcn.wave.id intrinsic (#79325) This is only valid on targets with architected SGPRs. Added: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wave.id.ll Modified: llvm/include/llvm/IR/IntrinsicsAMDGPU.td llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h llvm/lib/Target/AMDGPU/SIISelLowering.cpp llvm/lib/Target/AMDGPU/SIISelLowering.h Removed: diff --git a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td index 9eb1ac8e27bef..c5f43d17d1c14 100644 --- a/llvm/include/llvm/IR/IntrinsicsAMDGPU.td +++ b/llvm/include/llvm/IR/IntrinsicsAMDGPU.td @@ -2777,6 +2777,10 @@ class AMDGPULoadTr: def int_amdgcn_global_load_tr : AMDGPULoadTr; +// i32 @llvm.amdgcn.wave.id() +def int_amdgcn_wave_id : + DefaultAttrsIntrinsic<[llvm_i32_ty], [], [IntrNoMem, IntrSpeculatable]>; + //===--===// // Deep learning intrinsics. //===--===// diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp index 615685822f91e..e98ede88a7e2d 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp @@ -6883,6 +6883,23 @@ bool AMDGPULegalizerInfo::legalizeStackSave(MachineInstr &MI, return true; } +bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI, + MachineIRBuilder &B) const { + // With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. 
+ if (!ST.hasArchitectedSGPRs()) +return false; + LLT S32 = LLT::scalar(32); + Register DstReg = MI.getOperand(0).getReg(); + Register TTMP8 = + getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8, + AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32); + auto LSB = B.buildConstant(S32, 25); + auto Width = B.buildConstant(S32, 5); + B.buildUbfx(DstReg, TTMP8, LSB, Width); + MI.eraseFromParent(); + return true; +} + bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, MachineInstr &MI) const { MachineIRBuilder &B = Helper.MIRBuilder; @@ -7005,6 +7022,8 @@ bool AMDGPULegalizerInfo::legalizeIntrinsic(LegalizerHelper &Helper, case Intrinsic::amdgcn_workgroup_id_z: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::WORKGROUP_ID_Z); + case Intrinsic::amdgcn_wave_id: +return legalizeWaveID(MI, B); case Intrinsic::amdgcn_lds_kernel_id: return legalizePreloadedArgIntrin(MI, MRI, B, AMDGPUFunctionArgInfo::LDS_KERNEL_ID); diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h index 56aabd4f6ab71..ecbe42681c669 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h +++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h @@ -212,6 +212,7 @@ class AMDGPULegalizerInfo final : public LegalizerInfo { bool legalizeFPTruncRound(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeStackSave(MachineInstr &MI, MachineIRBuilder &B) const; + bool legalizeWaveID(MachineInstr &MI, MachineIRBuilder &B) const; bool legalizeImageIntrinsic( MachineInstr &MI, MachineIRBuilder &B, diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp index d60f511302613..c5ad9da88ec2b 100644 --- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp @@ -7920,6 +7920,18 @@ SDValue SITargetLowering::lowerSBuffer(EVT VT, SDLoc DL, SDValue Rsrc, return Loads[0]; } +SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const { + // 
With architected SGPRs, waveIDinGroup is in TTMP8[29:25]. + if (!Subtarget->hasArchitectedSGPRs()) +return {}; + SDLoc SL(Op); + MVT VT = MVT::i32; + SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass, + AMDGPU::TTMP8, VT, SL); + return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8, + DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT)); +} + SDValue SITargetLowering::lowerWorkitemID(SelectionDAG &DAG, SDValue Op, unsigned Dim, const ArgDescriptor &Arg) const { @@ -8090,6 +8102,8 @@ SDValue SITargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op, case
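Both lowering paths in this patch extract the wave ID as an unsigned bitfield of TTMP8 with LSB 25 and width 5, that is, bits [29:25]. The same arithmetic can be sketched on a plain integer; `ubfx` and `waveIdInGroup` are hypothetical names used for illustration, and the sketch says nothing about how the register is actually read on hardware.

```cpp
#include <cstdint>

// Unsigned bitfield extract: returns `width` bits of `value` starting at
// bit `lsb`. Assumes lsb < 32; a width of 32 or more keeps all remaining bits.
constexpr std::uint32_t ubfx(std::uint32_t value, unsigned lsb, unsigned width) {
  return (value >> lsb) & ((width >= 32 ? 0u : (1u << width)) - 1u);
}

// Wave ID within the workgroup, taken from bits [29:25] of a TTMP8 value,
// using the same constants (LSB 25, width 5) as the patch.
constexpr std::uint32_t waveIdInGroup(std::uint32_t ttmp8) {
  return ubfx(ttmp8, 25, 5);
}
```

A 5-bit field means the result is always in the range 0 to 31, regardless of what the other bits of the register contain.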
[llvm-branch-commits] [llvm] 824a3e5 - [AMDGPU] Do not bother adding reserved registers to liveins (#79436)
Author: Jay Foad
Date: 2024-01-29T15:00:47-08:00
New Revision: 824a3e5dec3aabc91428f009c1f439a75f577469

URL: https://github.com/llvm/llvm-project/commit/824a3e5dec3aabc91428f009c1f439a75f577469
DIFF: https://github.com/llvm/llvm-project/commit/824a3e5dec3aabc91428f009c1f439a75f577469.diff

LOG: [AMDGPU] Do not bother adding reserved registers to liveins (#79436)

Tweak the implementation of llvm.amdgcn.wave.id to not add TTMP8 to the
function liveins.

Added:

Modified:
    llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
    llvm/lib/Target/AMDGPU/SIISelLowering.cpp

Removed:

diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index e98ede88a7e2d..17ffb7ec988f0 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -6890,9 +6890,7 @@ bool AMDGPULegalizerInfo::legalizeWaveID(MachineInstr &MI,
     return false;
   LLT S32 = LLT::scalar(32);
   Register DstReg = MI.getOperand(0).getReg();
-  Register TTMP8 =
-      getFunctionLiveInPhysReg(B.getMF(), B.getTII(), AMDGPU::TTMP8,
-                               AMDGPU::SReg_32RegClass, B.getDebugLoc(), S32);
+  auto TTMP8 = B.buildCopy(S32, Register(AMDGPU::TTMP8));
   auto LSB = B.buildConstant(S32, 25);
   auto Width = B.buildConstant(S32, 5);
   B.buildUbfx(DstReg, TTMP8, LSB, Width);

diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index c5ad9da88ec2b..d6bf0d8cb2efa 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -7926,8 +7926,7 @@ SDValue SITargetLowering::lowerWaveID(SelectionDAG &DAG, SDValue Op) const {
     return {};
   SDLoc SL(Op);
   MVT VT = MVT::i32;
-  SDValue TTMP8 = CreateLiveInRegister(DAG, &AMDGPU::SReg_32RegClass,
-                                       AMDGPU::TTMP8, VT, SL);
+  SDValue TTMP8 = DAG.getCopyFromReg(DAG.getEntryNode(), SL, AMDGPU::TTMP8, VT);
   return DAG.getNode(AMDGPUISD::BFE_U32, SL, VT, TTMP8,
                      DAG.getConstant(25, SL, VT), DAG.getConstant(5, SL, VT));
 }
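Both lowering paths in this patch end in an unsigned bitfield extract with LSB 25 and width 5 applied to the value copied from TTMP8. The following is a minimal host-side sketch of that arithmetic only, not of the LLVM lowering itself; the names `extractBits` and `waveIdInGroup` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: unsigned bitfield extract of `width` bits starting at
// bit `lsb`, mirroring what the ubfx / BFE_U32 node computes in the diff.
inline uint32_t extractBits(uint32_t value, unsigned lsb, unsigned width) {
  // This sketch only handles fields that fit strictly inside 32 bits.
  assert(width > 0 && width < 32 && lsb + width <= 32);
  return (value >> lsb) & ((1u << width) - 1u);
}

// Hypothetical name: reads the 5-bit field at bits [29:25] of a TTMP8 value,
// using the same constants (25 and 5) that the patch passes to the extract.
inline uint32_t waveIdInGroup(uint32_t ttmp8) {
  return extractBits(ttmp8, /*lsb=*/25, /*width=*/5);
}
```

Bits outside [29:25] do not affect the result, so a value such as `0xC1FFFFFF` yields 0 while `5u << 25` yields 5.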
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [llvm] Backport 45d2d7757feb386186f69af6ef57bde7b5adc2db to release/18.x (PR #79839)
tstellar wrote: Merged: 824a3e5dec3aabc91428f009c1f439a75f577469 4c8cf4a1c29da834f1999a1c56c7e637c6886825 https://github.com/llvm/llvm-project/pull/79839
[llvm-branch-commits] [llvm] [BPF] expand cttz, ctlz for i32, i64 (PR #73668)
eddyz87 wrote:

> @eddyz87 Could you please take a look? This has been stalled for a while :)

Hello,

I tried this with a simple C test:

```c
unsigned int test(unsigned int v)
{
  return __builtin_ctz(v);
  //return __builtin_clz(v);
}
```

The clz part compiles fine, but when ctz is used I still get an assertion, although a different one:

```
$ clang --target=bpf -S -O2 test-clz.c -o -
...
LLVM ERROR: Cannot select: t15: i64 = ConstantPool<[32 x i8] c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09"> 0
In function: test
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: llc -debug-only=isel --asm-show-inst -mtriple=bpf -mcpu=v3 -filetype=obj -o - test-clz.ll
1. Running pass 'Function Pass Manager' on module 'test-clz.ll'.
2. Running pass 'BPF DAG->DAG Pattern Instruction Selection' on function '@test'
 #0 0x55d4977b1db8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x55d4977afeb0 llvm::sys::RunSignalHandlers() /home/eddy/work/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x55d4977b2588 SignalHandler(int) /home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x7fdbaa03f190 __restore_rt (/lib64/libc.so.6+0x3f190)
 #4 0x7fdbaa091dec __pthread_kill_implementation (/lib64/libc.so.6+0x91dec)
 #5 0x7fdbaa03f0c6 gsignal (/lib64/libc.so.6+0x3f0c6)
 #6 0x7fdbaa0268d7 abort (/lib64/libc.so.6+0x268d7)
 #7 0x55d497735145 llvm::report_fatal_error(llvm::Twine const&, bool) /home/eddy/work/llvm-project/llvm/lib/Support/ErrorHandling.cpp:125:5
 #8 0x55d4975f4b6d llvm::SDNode::getValueType(unsigned int) const /home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1007:5
 #9 0x55d4975f4b6d llvm::SDValue::getValueType() const /home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1162:16
#10 0x55d4975f4b6d llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) /home/eddy/work/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:4232:43
```

Looking at the debug output from llc, it looks like cttz_zero_undef is expanded using some kind of lookup table:

```
$ llc -debug-only=isel --asm-show-inst -mtriple=bpf -mcpu=v3 -filetype=asm -o - test-clz.ll
...
Type-legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 7 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t3: i32 = cttz_zero_undef t2
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t3
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1

Legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 18 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t8: i32 = sub Constant:i32<0>, t2
  t9: i32 = and t2, t8
  t11: i32 = mul t9, Constant:i32<125613361>
  t13: i32 = srl t11, Constant:i32<27>
  t14: i64 = sign_extend t13
  t16: i64 = add ConstantPool:i64<[32 x i8] c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09"> 0, t14
  t18: i32,ch = load<(load (s8) from constant-pool), zext from i8> t0, t16, undef:i64
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t18
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1
```

If there is no way to convince lowering to use some other strategy, and you don't want to spend time on implementing translation for `ConstantPool`, I think it would be fine to leave `ctz` as-is, or just adjust error reporting, so that it clearly says that `ctz` is not supported w/o showing a stack-trace. Wdyt?

https://github.com/llvm/llvm-project/pull/73668
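The legalized DAG in this message corresponds to the well-known de Bruijn multiplication technique for counting trailing zeros: isolate the lowest set bit, multiply by a constant, shift right by 27, and use the result to index a 32-entry byte table. Below is a minimal C++ sketch of that sequence, assuming the node-to-statement mapping given in the comments; the names `kCttzTable` and `cttzViaTable` are hypothetical, while the constant and table bytes are copied from the dump.

```cpp
#include <cassert>
#include <cstdint>

// Table bytes copied from the ConstantPool node in the DAG dump.
static const uint8_t kCttzTable[32] = {
    0x00, 0x01, 0x1C, 0x02, 0x1D, 0x0E, 0x18, 0x03,
    0x1E, 0x16, 0x14, 0x0F, 0x19, 0x11, 0x04, 0x08,
    0x1F, 0x1B, 0x0D, 0x17, 0x15, 0x13, 0x10, 0x07,
    0x1A, 0x0C, 0x12, 0x06, 0x0B, 0x05, 0x0A, 0x09};

// Hypothetical helper reproducing the expanded sequence on the host.
inline unsigned cttzViaTable(uint32_t v) {
  // Like cttz_zero_undef, the result is not meaningful for a zero input.
  assert(v != 0);
  uint32_t lowestBit = v & (0u - v);        // t8 = sub 0, t2; t9 = and t2, t8
  uint32_t hashed = lowestBit * 125613361u; // t11 = mul t9, 125613361
  return kCttzTable[hashed >> 27];          // t13 = srl t11, 27; t18 = load
}
```

The table load is the step that produces the `ConstantPool` node the BPF backend cannot select, which is why only the ctz path fails.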
[llvm-branch-commits] [llvm] bdaf16d - [LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034)
Author: David Sherwood Date: 2024-01-29T15:05:18-08:00 New Revision: bdaf16d59f4a64529371cbe056245f6cc035d7cf URL: https://github.com/llvm/llvm-project/commit/bdaf16d59f4a64529371cbe056245f6cc035d7cf DIFF: https://github.com/llvm/llvm-project/commit/bdaf16d59f4a64529371cbe056245f6cc035d7cf.diff LOG: [LoopVectorize] Refine runtime memory check costs when there is an outer loop (#76034) When we generate runtime memory checks for an inner loop it's possible that these checks are invariant in the outer loop and so will get hoisted out. In such cases, the effective cost of the checks should reduce to reflect the outer loop trip count. This fixes a 25% performance regression introduced by commit 49b0e6dcc296792b577ae8f0f674e61a0929b99d when building the SPEC2017 x264 benchmark with PGO, where we decided the inner loop trip count wasn't high enough to warrant the (incorrect) high cost of the runtime checks. Also, when runtime memory checks consist entirely of diff checks these are likely to be outer loop invariant. (cherry picked from commit 962fbafecf4730ba84a3b9fd7a662a5c30bb2c7c) Added: llvm/test/Transforms/LoopVectorize/AArch64/low_trip_memcheck_cost.ll Modified: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp Removed: diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp index 6ca93e15719fb..dd596c567cd48 100644 --- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1957,6 +1957,8 @@ class GeneratedRTChecks { bool CostTooHigh = false; const bool AddBranchWeights; + Loop *OuterLoop = nullptr; + public: GeneratedRTChecks(ScalarEvolution &SE, DominatorTree *DT, LoopInfo *LI, TargetTransformInfo *TTI, const DataLayout &DL, @@ -2053,6 +2055,9 @@ class GeneratedRTChecks { DT->eraseNode(SCEVCheckBlock); LI->removeBlock(SCEVCheckBlock); } + +// Outer loop is used as part of the later cost calculations. 
+OuterLoop = L->getParentLoop(); } InstructionCost getCost() { @@ -2076,16 +2081,61 @@ class GeneratedRTChecks { LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n"); RTCheckCost += C; } -if (MemCheckBlock) +if (MemCheckBlock) { + InstructionCost MemCheckCost = 0; for (Instruction &I : *MemCheckBlock) { if (MemCheckBlock->getTerminator() == &I) continue; InstructionCost C = TTI->getInstructionCost(&I, TTI::TCK_RecipThroughput); LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n"); -RTCheckCost += C; +MemCheckCost += C; } + // If the runtime memory checks are being created inside an outer loop + // we should find out if these checks are outer loop invariant. If so, + // the checks will likely be hoisted out and so the effective cost will + // reduce according to the outer loop trip count. + if (OuterLoop) { +ScalarEvolution *SE = MemCheckExp.getSE(); +// TODO: If profitable, we could refine this further by analysing every +// individual memory check, since there could be a mixture of loop +// variant and invariant checks that mean the final condition is +// variant. +const SCEV *Cond = SE->getSCEV(MemRuntimeCheckCond); +if (SE->isLoopInvariant(Cond, OuterLoop)) { + // It seems reasonable to assume that we can reduce the effective + // cost of the checks even when we know nothing about the trip + // count. Assume that the outer loop executes at least twice. + unsigned BestTripCount = 2; + + // If exact trip count is known use that. + if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop)) +BestTripCount = SmallTC; + else if (LoopVectorizeWithBlockFrequency) { +// Else use profile data if available. +if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop)) + BestTripCount = *EstimatedTC; + } + + InstructionCost NewMemCheckCost = MemCheckCost / BestTripCount; + + // Let's ensure the cost is always at least 1. 
+ NewMemCheckCost = std::max(*NewMemCheckCost.getValue(), + (InstructionCost::CostType)1); + + LLVM_DEBUG(dbgs() + << "We expect runtime memory checks to be hoisted " + << "out of the outer loop. Cost reduced from " + << MemCheckCost << " to " << NewMemCheckCost << '\n'); + + MemCheckCost = NewMemCheckCost; +} + } + + RTCheckCost += MemCheckCost; +} + if (SCEVCheckBlock || MemCheckBlock) LLVM_DEBUG(dbgs() << "Total cost of runtime checks: " << RTCheckCost << "\n"); @@ -2144,8 +2194,8 @@ clas
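The cost adjustment in this patch can be summarised as: if the memory check condition is invariant in the outer loop, divide the check cost by the best available outer trip count (exact, else profile-based, else an assumed 2) and clamp the result to at least 1. The sketch below restates that rule with plain integers instead of `InstructionCost` and `ScalarEvolution`; the function name and parameters are hypothetical, and the extra guard against a zero profiled trip count is an addition of this sketch, not something the patch states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the logic added to GeneratedRTChecks::getCost().
// exactTripCount == 0 means "not known", matching the convention of
// getSmallConstantTripCount in the diff.
inline int64_t effectiveMemCheckCost(int64_t memCheckCost,
                                     bool invariantInOuterLoop,
                                     unsigned exactTripCount,
                                     std::optional<unsigned> profiledTripCount) {
  if (!invariantInOuterLoop)
    return memCheckCost; // checks are not expected to be hoisted

  // Assume the outer loop executes at least twice when nothing is known.
  unsigned bestTripCount = 2;
  if (exactTripCount != 0)
    bestTripCount = exactTripCount;
  else if (profiledTripCount && *profiledTripCount != 0) // guard is an assumption
    bestTripCount = *profiledTripCount;

  // Keep the cost at 1 or more so the checks never look free.
  return std::max<int64_t>(memCheckCost / bestTripCount, 1);
}
```

For example, a check cost of 40 becomes 20 with no trip count information, 5 with an exact outer trip count of 8, and 1 with a profiled trip count of 100.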
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
tstellar wrote: Merged: bdaf16d59f4a64529371cbe056245f6cc035d7cf https://github.com/llvm/llvm-project/pull/79813
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79800 (PR #79813)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79813
[llvm-branch-commits] [mlir] 0680e84 - [mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() (#79582)
Author: Andrei Golubev
Date: 2024-01-29T15:10:47-08:00
New Revision: 0680e84a3f2a366a860bd0491f490a2fba800313

URL: https://github.com/llvm/llvm-project/commit/0680e84a3f2a366a860bd0491f490a2fba800313
DIFF: https://github.com/llvm/llvm-project/commit/0680e84a3f2a366a860bd0491f490a2fba800313.diff

LOG: [mlir] Revert to old fold logic in IR::Dialect::add{Types, Attributes}() (#79582)

Fold expressions on Clang are limited to 256 elements. This causes
compilation errors in cases when the amount of elements added exceeds
this limit. Side-step the issue by restoring the original trick that
would use the std::initializer_list. For the record, in our downstream
Clang 16 gives:

mlir/include/mlir/IR/Dialect.h:269:23: fatal error: instantiating fold
expression with 688 arguments exceeded expression nesting limit of 256
    (addType<Args>(), ...);

Partially reverts 26d811b3ecd2fa1ca3d9b41e17fb42b8c7ad03d6.

Co-authored-by: Nikita Kudriavtsev
(cherry picked from commit e3a38a75ddc6ff00301ec19a0e2488d00f2cc297)

Added:

Modified:
    mlir/include/mlir/IR/Dialect.h

Removed:

diff --git a/mlir/include/mlir/IR/Dialect.h b/mlir/include/mlir/IR/Dialect.h
index 45f29f37dd3b97..50f6f6de5c2897 100644
--- a/mlir/include/mlir/IR/Dialect.h
+++ b/mlir/include/mlir/IR/Dialect.h
@@ -281,7 +281,11 @@ class Dialect {
   /// Register a set of type classes with this dialect.
   template <typename... Args>
   void addTypes() {
-    (addType<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addType<Args>(), 0)...};
   }

   /// Register a type instance with this dialect.
@@ -292,7 +296,11 @@ class Dialect {
   /// Register a set of attribute classes with this dialect.
   template <typename... Args>
   void addAttributes() {
-    (addAttribute<Args>(), ...);
+    // This initializer_list argument pack expansion is essentially equal to
+    // using a fold expression with a comma operator. Clang however, refuses
+    // to compile a fold expression with a depth of more than 256 by default.
+    // There seem to be no such limitations for initializer_list.
+    (void)std::initializer_list<int>{0, (addAttribute<Args>(), 0)...};
   }

   /// Register an attribute instance with this dialect.
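To illustrate the two pack-expansion idioms the commit message contrasts, here is a small self-contained sketch. The `Registry` type and the `A`, `B`, `C` classes are hypothetical stand-ins and are not MLIR's `Dialect` API; the sketch only shows that a comma fold and the `std::initializer_list` trick call the per-type function once per pack element, in order.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical registry standing in for a dialect that registers types.
struct Registry {
  std::vector<std::string> names;

  template <typename T>
  void addOne() { names.push_back(T::name()); }

  // C++17 fold expression over the comma operator. The commit message reports
  // that Clang rejects this form once the pack exceeds its nesting limit.
  template <typename... Ts>
  void addAllFold() { (addOne<Ts>(), ...); }

  // Braced-initializer idiom: each pack element becomes one list element whose
  // side effect is the call; the leading 0 keeps the list non-empty.
  template <typename... Ts>
  void addAllInitList() {
    (void)std::initializer_list<int>{0, (addOne<Ts>(), 0)...};
  }
};

struct A { static const char *name() { return "A"; } };
struct B { static const char *name() { return "B"; } };
struct C { static const char *name() { return "C"; } };
```

Elements of a braced initializer list are evaluated left to right, so both forms register `A`, `B`, `C` in the same order.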
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [mlir] PR for llvm/llvm-project#79797 (PR #79863)
tstellar wrote: Merged: 0680e84a3f2a366a860bd0491f490a2fba800313 https://github.com/llvm/llvm-project/pull/79863
[llvm-branch-commits] [llvm] bab01ae - Revert "[AArch64] merge index address with large offset into base address"
Author: David Green Date: 2024-01-29T15:17:53-08:00 New Revision: bab01aead7d7a34436bc8e1639b90227374f079e URL: https://github.com/llvm/llvm-project/commit/bab01aead7d7a34436bc8e1639b90227374f079e DIFF: https://github.com/llvm/llvm-project/commit/bab01aead7d7a34436bc8e1639b90227374f079e.diff LOG: Revert "[AArch64] merge index address with large offset into base address" This reverts commit 32878c2065c8005b3ea30c79e16dfd7eed55d645 due to #79756 and #76202. (cherry picked from commit 915c3d9e5a2d1314afe64cd6116a3b6c9809ec90) Added: Modified: llvm/lib/Target/AArch64/AArch64InstrInfo.cpp llvm/lib/Target/AArch64/AArch64InstrInfo.h llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp llvm/test/CodeGen/AArch64/arm64-addrmode.ll llvm/test/CodeGen/AArch64/large-offset-ldr-merge.mir Removed: diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp index 2e8d8c63d6bec..13e9d9725cc2e 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp @@ -4098,16 +4098,6 @@ AArch64InstrInfo::getLdStOffsetOp(const MachineInstr &MI) { return MI.getOperand(Idx); } -const MachineOperand & -AArch64InstrInfo::getLdStAmountOp(const MachineInstr &MI) { - switch (MI.getOpcode()) { - default: -llvm_unreachable("Unexpected opcode"); - case AArch64::LDRBBroX: -return MI.getOperand(4); - } -} - static const TargetRegisterClass *getRegClass(const MachineInstr &MI, Register Reg) { if (MI.getParent() == nullptr) diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h index db24a19fe5f8e..6526f6740747a 100644 --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h @@ -111,9 +111,6 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo { /// Returns the immediate offset operator of a load/store. 
static const MachineOperand &getLdStOffsetOp(const MachineInstr &MI); - /// Returns the shift amount operator of a load/store. - static const MachineOperand &getLdStAmountOp(const MachineInstr &MI); - /// Returns whether the instruction is FP or NEON. static bool isFpOrNEON(const MachineInstr &MI); diff --git a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp index e90b8a8ca7ace..926a89466255c 100644 --- a/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp @@ -62,8 +62,6 @@ STATISTIC(NumUnscaledPairCreated, "Number of load/store from unscaled generated"); STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted"); STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted"); -STATISTIC(NumConstOffsetFolded, - "Number of const offset of index address folded"); DEBUG_COUNTER(RegRenamingCounter, DEBUG_TYPE "-reg-renaming", "Controls which pairs are considered for renaming"); @@ -77,11 +75,6 @@ static cl::opt LdStLimit("aarch64-load-store-scan-limit", static cl::opt UpdateLimit("aarch64-update-scan-limit", cl::init(100), cl::Hidden); -// The LdStConstLimit limits how far we search for const offset instructions -// when we form index address load/store instructions. -static cl::opt LdStConstLimit("aarch64-load-store-const-scan-limit", -cl::init(10), cl::Hidden); - // Enable register renaming to find additional store pairing opportunities. static cl::opt EnableRenaming("aarch64-load-store-renaming", cl::init(true), cl::Hidden); @@ -178,13 +171,6 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { findMatchingUpdateInsnForward(MachineBasicBlock::iterator I, int UnscaledOffset, unsigned Limit); - // Scan the instruction list to find a register assigned with a const - // value that can be combined with the current instruction (a load or store) - // using base addressing with writeback. Scan forwards. 
- MachineBasicBlock::iterator - findMatchingConstOffsetBackward(MachineBasicBlock::iterator I, unsigned Limit, - unsigned &Offset); - // Scan the instruction list to find a base register update that can // be combined with the current instruction (a load or store) using // pre or post indexed addressing with writeback. Scan backwards. @@ -196,19 +182,11 @@ struct AArch64LoadStoreOpt : public MachineFunctionPass { bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI, unsigned BaseReg, int Offset); - bool isMatchingMovConstInsn(MachineInstr &MemMI, MachineInstr &MI, - unsigned IndexR
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)
tstellar wrote: Merged: bab01aead7d7a34436bc8e1639b90227374f079e https://github.com/llvm/llvm-project/pull/79814
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79756 (PR #79814)
https://github.com/tstellar closed https://github.com/llvm/llvm-project/pull/79814
[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)
https://github.com/lukel97 milestoned https://github.com/llvm/llvm-project/pull/79931
[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)
https://github.com/lukel97 created https://github.com/llvm/llvm-project/pull/79931 This cherry picks a fix 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 for a miscompile (only with the -mrvv-vector-bits=zvl configuration or similar) introduced in bb8a8770e203ba027d141cd1200e93809ea66c8f, which is present in the 18.x release branch. It also includes a commit that adds a test d407e6ca61a422f25841674d8f0b5ea0dbec85f8 >From 5b3331f29489446d7d723a33310b7fec37153976 Mon Sep 17 00:00:00 2001 From: Luke Lau Date: Fri, 26 Jan 2024 20:16:21 +0700 Subject: [PATCH 1/2] [RISCV] Add test to showcase miscompile from #79072 --- .../rvv/fixed-vectors-shuffle-exact-vlen.ll| 18 -- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll index f53b51e05c572..c0b02f62444ef 100644 --- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll +++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll @@ -138,8 +138,8 @@ define <4 x i64> @m2_splat_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range ret <4 x i64> %res } -define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) { -; CHECK-LABEL: m2_splat_into_identity_two_source: +define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) { +; CHECK-LABEL: m2_splat_into_identity_two_source_v2_hi: ; CHECK: # %bb.0: ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma ; CHECK-NEXT:vrgather.vi v10, v8, 0 @@ -149,6 +149,20 @@ define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> %v2 ret <4 x i64> %res } +; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2 +; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10) +define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) { +; CHECK-LABEL: 
m2_splat_into_slide_two_source_v2_lo: +; CHECK: # %bb.0: +; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma +; CHECK-NEXT:vrgather.vi v10, v8, 0 +; CHECK-NEXT:vmv1r.v v11, v8 +; CHECK-NEXT:vmv2r.v v8, v10 +; CHECK-NEXT:ret + %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32> + ret <4 x i64> %res +} + define <4 x i64> @m2_splat_into_slide_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) { ; CHECK-LABEL: m2_splat_into_slide_two_source: ; CHECK: # %bb.0: >From 60341586c8bd46b1094663749ac6467058b7efe8 Mon Sep 17 00:00:00 2001 From: Luke Lau Date: Fri, 26 Jan 2024 20:18:08 +0700 Subject: [PATCH 2/2] [RISCV] Fix M1 shuffle on wrong SrcVec in lowerShuffleViaVRegSplitting This fixes a miscompile from #79072 where we were taking the wrong SrcVec to do the M1 shuffle. E.g. if the SrcVecIdx was 2 and we had 2 VRegsPerSrc, we ended up taking it from V1 instead of V2. --- llvm/lib/Target/RISCV/RISCVISelLowering.cpp | 2 +- .../CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll | 8 +++- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp index 47c6cd6e5487b..7895d74f06d12 100644 --- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp @@ -4718,7 +4718,7 @@ static SDValue lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN, if (SrcVecIdx == -1) continue; unsigned ExtractIdx = (SrcVecIdx % VRegsPerSrc) * NumOpElts; -SDValue SrcVec = (unsigned)SrcVecIdx > VRegsPerSrc ? V2 : V1; +SDValue SrcVec = (unsigned)SrcVecIdx >= VRegsPerSrc ? 
V2 : V1; SDValue SubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, M1VT, SrcVec, DAG.getVectorIdxConstant(ExtractIdx, DL)); SubVec = convertFromScalableVector(OneRegVT, SubVec, DAG, Subtarget); diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll index c0b02f62444ef..3f0bdb9d5e316 100644 --- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll +++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll @@ -149,15 +149,13 @@ define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x i6 ret <4 x i64> %res } -; FIXME: This is a miscompile, we're clobbering the lower reg group of %v2 -; (v10), and the vmv1r.v is moving from the wrong reg group (should be v10) define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) { ; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo: ; CHECK: # %bb.0: ; CHECK-NEXT:vsetivli zero, 2, e64, m1, ta, ma -; CHECK-NEXT:vrgather.vi v10, v8, 0 -; CHECK-NEXT:vmv1r.v v11, v8 -; CHECK-NEXT:vmv2r.v v8, v10 +; CHECK-NEXT:vrgather.vi v12, v8, 0 +; CHECK-NEXT:vm
[llvm-branch-commits] [llvm] [RISCV] Backport 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 to release/18.x (PR #79931)
llvmbot wrote:

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

This cherry picks a fix 5cf9f2cd9888feea23a624c1de3cc37ce8ce8112 for a miscompile (only with the -mrvv-vector-bits=zvl configuration or similar) introduced in bb8a8770e203ba027d141cd1200e93809ea66c8f, which is present in the 18.x release branch.

It also includes a commit that adds a test d407e6ca61a422f25841674d8f0b5ea0dbec85f8

---
Full diff: https://github.com/llvm/llvm-project/pull/79931.diff

2 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+1-1)
- (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll (+14-2)

```diff
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 47c6cd6e5487b..7895d74f06d12 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4718,7 +4718,7 @@ static SDValue lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
     if (SrcVecIdx == -1)
       continue;
     unsigned ExtractIdx = (SrcVecIdx % VRegsPerSrc) * NumOpElts;
-    SDValue SrcVec = (unsigned)SrcVecIdx > VRegsPerSrc ? V2 : V1;
+    SDValue SrcVec = (unsigned)SrcVecIdx >= VRegsPerSrc ? V2 : V1;
     SDValue SubVec = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, M1VT, SrcVec,
                                  DAG.getVectorIdxConstant(ExtractIdx, DL));
     SubVec = convertFromScalableVector(OneRegVT, SubVec, DAG, Subtarget);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index f53b51e05c572..3f0bdb9d5e316 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -138,8 +138,8 @@ define <4 x i64> @m2_splat_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range
   ret <4 x i64> %res
 }

-define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) {
-; CHECK-LABEL: m2_splat_into_identity_two_source:
+define <4 x i64> @m2_splat_into_identity_two_source_v2_hi(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_identity_two_source_v2_hi:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
 ; CHECK-NEXT:    vrgather.vi v10, v8, 0
@@ -149,6 +149,18 @@ define <4 x i64> @m2_splat_into_identity_two_source(<4 x i64> %v1, <4 x i64> %v2
   ret <4 x i64> %res
 }

+define <4 x i64> @m2_splat_into_slide_two_source_v2_lo(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) {
+; CHECK-LABEL: m2_splat_into_slide_two_source_v2_lo:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
+; CHECK-NEXT:    vrgather.vi v12, v8, 0
+; CHECK-NEXT:    vmv1r.v v13, v10
+; CHECK-NEXT:    vmv2r.v v8, v12
+; CHECK-NEXT:    ret
+  %res = shufflevector <4 x i64> %v1, <4 x i64> %v2, <4 x i32>
+  ret <4 x i64> %res
+}
+
 define <4 x i64> @m2_splat_into_slide_two_source(<4 x i64> %v1, <4 x i64> %v2) vscale_range(2,2) {
 ; CHECK-LABEL: m2_splat_into_slide_two_source:
 ; CHECK:       # %bb.0:
```

https://github.com/llvm/llvm-project/pull/79931
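The one-character fix in this PR is an off-by-one in choosing which shuffle source a register index belongs to. The sketch below isolates that comparison, assuming (as the PR description implies) that indices 0 to VRegsPerSrc-1 belong to V1 and the rest to V2; the enum and function names are hypothetical and not taken from RISCVISelLowering.cpp.

```cpp
#include <cassert>

// Hypothetical model of the two shuffle operands.
enum class Source { V1, V2 };

// Pre-fix selection: an index equal to vregsPerSrc is wrongly mapped to V1.
inline Source pickSourceBuggy(unsigned srcVecIdx, unsigned vregsPerSrc) {
  return srcVecIdx > vregsPerSrc ? Source::V2 : Source::V1;
}

// Post-fix selection: the first register of the second source maps to V2.
inline Source pickSourceFixed(unsigned srcVecIdx, unsigned vregsPerSrc) {
  return srcVecIdx >= vregsPerSrc ? Source::V2 : Source::V1;
}
```

With two registers per source, index 2 is the only value on which the two versions disagree, which matches the case cited in the patch description (SrcVecIdx of 2 with 2 VRegsPerSrc).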
[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)
MaskRay wrote: LGTM https://github.com/llvm/llvm-project/pull/79841
[llvm-branch-commits] [compiler-rt] PR for llvm/llvm-project#79838 (PR #79841)
https://github.com/MaskRay approved this pull request. https://github.com/llvm/llvm-project/pull/79841