[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)
https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/111072 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/MaskRay approved this pull request.

https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/DianQK created https://github.com/llvm/llvm-project/pull/112136

Backport #111945.

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)

From f8cab50362e2b4f4523818e844fca0c622339985 Mon Sep 17 00:00:00 2001
From: Fangrui Song
Date: Fri, 11 Oct 2024 08:47:07 -0700
Subject: [PATCH] [ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined

Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate
sections that would be discarded by GC.

Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return
false (when Defined) and then return true (been demoted to Undefined).

```
addScriptReferencedSymbolsToSymTable
  shouldAddProvideSym(f1): false
    // the RHS (bar) is not added to `referencedSymbols` and may be GCed
declareSymbols
  shouldAddProvideSym(f1): false
markLive
demoteSymbolsAndComputeIsPreemptible
  // demoted f1 to Undefined
processSymbolAssignments
  addSymbol
    shouldAddProvideSym(f1): true
```

The inconsistency can cause `cmd->expression()` in `addSymbol` to be
evaluated, leading to `symbol not found: bar` errors (since `bar` in the
RHS is not in `referencedSymbols` and is GCed) (#111478).

Fix this by adding a `sym->isUsedInRegularObj` condition, making
`shouldAddProvideSym(f1)` values consistent. In addition, we need a
`sym->exportDynamic` condition to keep provide-shared.s working.

Fixes: ebb326a51fec37b5a47e5702e8ea157cd4f835cd

Pull Request: https://github.com/llvm/llvm-project/pull/111945

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)
---
 lld/ELF/LinkerScript.cpp                    |  9 +-
 lld/test/ELF/linkerscript/provide-defined.s | 36 +
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 lld/test/ELF/linkerscript/provide-defined.s

diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index 055fa21d44ca6e..d95c5573935ec4 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1718,6 +1718,13 @@ void LinkerScript::addScriptReferencedSymbolsToSymTable() {
 }

 bool LinkerScript::shouldAddProvideSym(StringRef symName) {
+  // This function is called before and after garbage collection. To prevent
+  // undefined references from the RHS, the result of this function for a
+  // symbol must be the same for each call. We use isUsedInRegularObj to not
+  // change the return value of a demoted symbol. The exportDynamic condition,
+  // while not so accurate, allows PROVIDE to define a symbol referenced by a
+  // DSO.
   Symbol *sym = symtab.find(symName);
-  return sym && !sym->isDefined() && !sym->isCommon();
+  return sym && !sym->isDefined() && !sym->isCommon() &&
+         (sym->isUsedInRegularObj || sym->exportDynamic);
 }
diff --git a/lld/test/ELF/linkerscript/provide-defined.s b/lld/test/ELF/linkerscript/provide-defined.s
new file mode 100644
index 00..1d44bef3d4068d
--- /dev/null
+++ b/lld/test/ELF/linkerscript/provide-defined.s
@@ -0,0 +1,36 @@
+# REQUIRES: x86
+## Test the GC behavior when the PROVIDE symbol is defined by a relocatable file.
+
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 b.s -o b.o
+# RUN: ld.lld -T a.t --gc-sections a.o b.o -o a
+# RUN: llvm-readelf -s a | FileCheck %s
+
+# CHECK: 1: {{.*}} 0 NOTYPE GLOBAL DEFAULT 1 _start
+# CHECK-NEXT:2: {{.*}} 0 NOTYPE GLOBAL DEFAULT 2 f3
+# CHECK-NOT: {{.}}
+
+#--- a.s
+.global _start, f1, f2, f3, bar
+_start:
+  call f3
+
+.section .text.f1,"ax"; f1:
+.section .text.f2,"ax"; f2: # referenced by another relocatable file
+.section .text.f3,"ax"; f3: # live
+.section .text.bar,"ax"; bar:
+
+.comm comm,4,4
+
+#--- b.s
+  call f2
+
+#--- a.t
+SECTIONS {
+  . = . + SIZEOF_HEADERS;
+  PROVIDE(f1 = bar+1);
+  PROVIDE(f2 = bar+2);
+  PROVIDE(f3 = bar+3);
+  PROVIDE(f4 = comm+4);
+}
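The consistency argument in the commit message can be illustrated with a small standalone model. This is only a sketch, not lld code: the `Symbol` struct, the `SymbolTable` alias, and the `demote` helper are hypothetical stand-ins, and the model assumes that demoting a Defined symbol also clears its `isUsedInRegularObj` flag, which the message implies but does not state outright.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for lld's Symbol. Only the properties
// read by the predicate are modelled.
struct Symbol {
  bool defined = false;
  bool common = false;
  bool isUsedInRegularObj = false;
  bool exportDynamic = false;
};

// Hypothetical symbol table keyed by name.
using SymbolTable = std::unordered_map<std::string, Symbol>;

// Predicate as it was before the patch: the answer flips from false to true
// once a Defined symbol has been demoted to Undefined.
bool shouldAddProvideSymOld(const SymbolTable &symtab,
                            const std::string &name) {
  auto it = symtab.find(name);
  return it != symtab.end() && !it->second.defined && !it->second.common;
}

// Predicate with the extra condition from the patch.
bool shouldAddProvideSymNew(const SymbolTable &symtab,
                            const std::string &name) {
  auto it = symtab.find(name);
  if (it == symtab.end())
    return false;
  const Symbol &sym = it->second;
  return !sym.defined && !sym.common &&
         (sym.isUsedInRegularObj || sym.exportDynamic);
}

// Models demotion after GC discarded the defining section. Clearing
// isUsedInRegularObj here is an assumption of this sketch.
void demote(Symbol &sym) {
  sym.defined = false;
  sym.isUsedInRegularObj = false;
}
```

Under these assumptions, a symbol such as `f1` yields false from the new predicate both before and after demotion, while a genuinely undefined symbol referenced from a regular object still yields true.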
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
llvmbot wrote:

@llvm/pr-subscribers-lld

Author: DianQK (DianQK)

Changes

Backport #111945.

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)

---
Full diff: https://github.com/llvm/llvm-project/pull/112136.diff

2 Files Affected:

- (modified) lld/ELF/LinkerScript.cpp (+8-1)
- (added) lld/test/ELF/linkerscript/provide-defined.s (+36)

https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)
https://github.com/DianQK milestoned https://github.com/llvm/llvm-project/pull/112136
[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)
https://github.com/maksfb approved this pull request.

LGTM. Please add the description of the problem this PR fixes and link any related issue(s).

https://github.com/llvm/llvm-project/pull/111072
[llvm-branch-commits] [llvm] [NewPM][CodeGen] Port LiveRegMatrix to NPM (PR #109938)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/109938

From d4cc049c53df27919103625417730595fc2183d7 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 09:07:04 +
Subject: [PATCH 1/4] [NewPM][CodeGen] Port LiveRegMatrix to NPM

---
 llvm/include/llvm/CodeGen/LiveRegMatrix.h     | 50 ---
 llvm/include/llvm/InitializePasses.h          |  2 +-
 .../llvm/Passes/MachinePassRegistry.def       |  4 +-
 llvm/lib/CodeGen/LiveRegMatrix.cpp            | 38 ++
 llvm/lib/CodeGen/RegAllocBasic.cpp            |  8 +--
 llvm/lib/CodeGen/RegAllocGreedy.cpp           |  8 +--
 llvm/lib/Passes/PassBuilder.cpp               |  1 +
 llvm/lib/Target/AMDGPU/GCNNSAReassign.cpp     |  6 +--
 .../Target/AMDGPU/SIPreAllocateWWMRegs.cpp    |  6 +--
 9 files changed, 88 insertions(+), 35 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/LiveRegMatrix.h b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
index 2b32308c7c075e..c024ca9c1dc38d 100644
--- a/llvm/include/llvm/CodeGen/LiveRegMatrix.h
+++ b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
@@ -37,7 +37,9 @@ class MachineFunction;
 class TargetRegisterInfo;
 class VirtRegMap;

-class LiveRegMatrix : public MachineFunctionPass {
+class LiveRegMatrix {
+  friend class LiveRegMatrixWrapperPass;
+  friend class LiveRegMatrixAnalysis;
   const TargetRegisterInfo *TRI = nullptr;
   LiveIntervals *LIS = nullptr;
   VirtRegMap *VRM = nullptr;
@@ -57,15 +59,21 @@ class LiveRegMatrix : public MachineFunctionPass {
   unsigned RegMaskVirtReg = 0;
   BitVector RegMaskUsable;

-  // MachineFunctionPass boilerplate.
-  void getAnalysisUsage(AnalysisUsage &) const override;
-  bool runOnMachineFunction(MachineFunction &) override;
-  void releaseMemory() override;
+  LiveRegMatrix() = default;
+  void releaseMemory();

 public:
-  static char ID;
-
-  LiveRegMatrix();
+  LiveRegMatrix(LiveRegMatrix &&Other)
+      : TRI(Other.TRI), LIS(Other.LIS), VRM(Other.VRM), UserTag(Other.UserTag),
+        Matrix(std::move(Other.Matrix)), Queries(std::move(Other.Queries)),
+        RegMaskTag(Other.RegMaskTag), RegMaskVirtReg(Other.RegMaskVirtReg),
+        RegMaskUsable(std::move(Other.RegMaskUsable)) {
+    Other.TRI = nullptr;
+    Other.LIS = nullptr;
+    Other.VRM = nullptr;
+  }
+
+  void init(MachineFunction &MF, LiveIntervals *LIS, VirtRegMap *VRM);

   //======//
   // High-level interface.
@@ -159,6 +167,32 @@ class LiveRegMatrix : public MachineFunctionPass {
   Register getOneVReg(unsigned PhysReg) const;
 };

+class LiveRegMatrixWrapperPass : public MachineFunctionPass {
+  LiveRegMatrix LRM;
+
+public:
+  static char ID;
+
+  LiveRegMatrixWrapperPass() : MachineFunctionPass(ID) {}
+
+  LiveRegMatrix &getLRM() { return LRM; }
+  const LiveRegMatrix &getLRM() const { return LRM; }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override;
+  bool runOnMachineFunction(MachineFunction &MF) override;
+  void releaseMemory() override;
+};
+
+class LiveRegMatrixAnalysis : public AnalysisInfoMixin<LiveRegMatrixAnalysis> {
+  friend AnalysisInfoMixin<LiveRegMatrixAnalysis>;
+  static AnalysisKey Key;
+
+public:
+  using Result = LiveRegMatrix;
+
+  LiveRegMatrix run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
 } // end namespace llvm

 #endif // LLVM_CODEGEN_LIVEREGMATRIX_H
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index d89a5538b46975..3fee8c40a6607e 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -156,7 +156,7 @@ void initializeLiveDebugValuesPass(PassRegistry &);
 void initializeLiveDebugVariablesPass(PassRegistry &);
 void initializeLiveIntervalsWrapperPassPass(PassRegistry &);
 void initializeLiveRangeShrinkPass(PassRegistry &);
-void initializeLiveRegMatrixPass(PassRegistry &);
+void initializeLiveRegMatrixWrapperPassPass(PassRegistry &);
 void initializeLiveStacksPass(PassRegistry &);
 void initializeLiveVariablesWrapperPassPass(PassRegistry &);
 void initializeLoadStoreOptPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def
index bdc56ca03f392a..4497c1fce0db69 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -97,6 +97,7 @@ LOOP_PASS("loop-term-fold", LoopTermFoldPass())
 // preferably fix the scavenger to not depend on them).
 MACHINE_FUNCTION_ANALYSIS("live-intervals", LiveIntervalsAnalysis())
 MACHINE_FUNCTION_ANALYSIS("live-vars", LiveVariablesAnalysis())
+MACHINE_FUNCTION_ANALYSIS("live-reg-matrix", LiveRegMatrixAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-block-freq", MachineBlockFrequencyAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-branch-prob",
                           MachineBranchProbabilityAnalysis())
@@ -122,8 +123,7 @@ MACHINE_FUNCTION_ANALYSIS("virtregmap", VirtRegMapAnalysis())
 // MachineRegionInf
[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/109939

From af1a1f15867edef93e69c43037a19ab69e8ec2e3 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 11:41:18 +
Subject: [PATCH 1/2] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM

---
 llvm/lib/Target/AMDGPU/AMDGPU.h               |  6 +-
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def |  1 +
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  7 ++-
 .../Target/AMDGPU/SIPreAllocateWWMRegs.cpp    | 60 ---
 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h | 25
 .../AMDGPU/si-pre-allocate-wwm-regs.mir       | 20 +++
 6 files changed, 92 insertions(+), 27 deletions(-)
 create mode 100644 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 342d55e828bca5..95d0ad0f9dc96a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -49,7 +49,7 @@ FunctionPass *createSIFixSGPRCopiesLegacyPass();
 FunctionPass *createLowerWWMCopiesPass();
 FunctionPass *createSIMemoryLegalizerPass();
 FunctionPass *createSIInsertWaitcntsPass();
-FunctionPass *createSIPreAllocateWWMRegsPass();
+FunctionPass *createSIPreAllocateWWMRegsLegacyPass();
 FunctionPass *createSIFormMemoryClausesPass();
 FunctionPass *createSIPostRABundlerPass();

@@ -212,8 +212,8 @@ extern char &SILateBranchLoweringPassID;
 void initializeSIOptimizeExecMaskingPass(PassRegistry &);
 extern char &SIOptimizeExecMaskingID;

-void initializeSIPreAllocateWWMRegsPass(PassRegistry &);
-extern char &SIPreAllocateWWMRegsID;
+void initializeSIPreAllocateWWMRegsLegacyPass(PassRegistry &);
+extern char &SIPreAllocateWWMRegsLegacyID;

 void initializeAMDGPUImageIntrinsicOptimizerPass(PassRegistry &);
 extern char &AMDGPUImageIntrinsicOptimizerID;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 0ebf34c901c142..174a90f0aa419d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -102,5 +102,6 @@ MACHINE_FUNCTION_PASS("gcn-dpp-combine", GCNDPPCombinePass())
 MACHINE_FUNCTION_PASS("si-load-store-opt", SILoadStoreOptimizerPass())
 MACHINE_FUNCTION_PASS("si-lower-sgpr-spills", SILowerSGPRSpillsPass())
 MACHINE_FUNCTION_PASS("si-peephole-sdwa", SIPeepholeSDWAPass())
+MACHINE_FUNCTION_PASS("si-pre-allocate-wwm-regs", SIPreAllocateWWMRegsPass())
 MACHINE_FUNCTION_PASS("si-shrink-instructions", SIShrinkInstructionsPass())
 #undef MACHINE_FUNCTION_PASS
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 23ee0c3e896eb3..f367b5fbea45af 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -41,6 +41,7 @@
 #include "SIMachineFunctionInfo.h"
 #include "SIMachineScheduler.h"
 #include "SIPeepholeSDWA.h"
+#include "SIPreAllocateWWMRegs.h"
 #include "SIShrinkInstructions.h"
 #include "TargetInfo/AMDGPUTargetInfo.h"
 #include "Utils/AMDGPUBaseInfo.h"
@@ -508,7 +509,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
   initializeSILateBranchLoweringPass(*PR);
   initializeSIMemoryLegalizerPass(*PR);
   initializeSIOptimizeExecMaskingPass(*PR);
-  initializeSIPreAllocateWWMRegsPass(*PR);
+  initializeSIPreAllocateWWMRegsLegacyPass(*PR);
   initializeSIFormMemoryClausesPass(*PR);
   initializeSIPostRABundlerPass(*PR);
   initializeGCNCreateVOPDPass(*PR);
@@ -1506,7 +1507,7 @@ bool GCNPassConfig::addRegAssignAndRewriteFast() {
   addPass(&SILowerSGPRSpillsLegacyID);

   // To Allocate wwm registers used in whole quad mode operations (for shaders).
-  addPass(&SIPreAllocateWWMRegsID);
+  addPass(&SIPreAllocateWWMRegsLegacyID);

   // For allocating other wwm register operands.
   addPass(createWWMRegAllocPass(false));
@@ -1543,7 +1544,7 @@ bool GCNPassConfig::addRegAssignAndRewriteOptimized() {
   addPass(&SILowerSGPRSpillsLegacyID);

   // To Allocate wwm registers used in whole quad mode operations (for shaders).
-  addPass(&SIPreAllocateWWMRegsID);
+  addPass(&SIPreAllocateWWMRegsLegacyID);

   // For allocating other whole wave mode registers.
   addPass(createWWMRegAllocPass(true));
diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 07303e2aa726c5..f9109c01c8085b 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -11,6 +11,7 @@
 //
 //===--===//

+#include "SIPreAllocateWWMRegs.h"
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
@@ -34,7 +35,7 @@ static cl::opt

 namespace {

-class SIPreAllocateWWMRegs : public MachineFunctionPass {
+class SIPreAllocateWWMRegs {
 private:
   const SIInstrInfo *TII;
   const SIRegisterInfo *TRI;
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/109963

From 2cefaf6d479b6c7ae6bc8a2267f8e4fee274923c Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Wed, 25 Sep 2024 11:21:04 +
Subject: [PATCH 1/2] [AMDGPU] Add tests for SIPreAllocateWWMRegs

---
 .../AMDGPU/si-pre-allocate-wwm-regs.mir       | 26 +++
 .../si-pre-allocate-wwm-sgpr-spills.mir       | 21 +++
 2 files changed, 47 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir

diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
new file mode 100644
index 00..f2db299f575f5e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -0,0 +1,26 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_regs_strict
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $sgpr1
+    ; CHECK-LABEL: name: pre_allocate_wwm_regs_strict
+    ; CHECK: liveins: $sgpr1
+    ; CHECK-NEXT: {{ $}}
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+    ; CHECK-NEXT: renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+    ; CHECK-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+    ; CHECK-NEXT: dead $vgpr0 = V_MOV_B32_dpp $vgpr0, [[DEF]], 323, 12, 15, 0, implicit $exec
+    ; CHECK-NEXT: $exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+    ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+    %0:vgpr_32 = IMPLICIT_DEF
+    renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+    %24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    %25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+    $exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+    %2:vgpr_32 = COPY %0:vgpr_32
+...
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
new file mode 100644
index 00..f0efe74878d831
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
@@ -0,0 +1,21 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_spill_to_vgpr
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $sgpr1
+    ; CHECK-LABEL: name: pre_allocate_wwm_spill_to_vgpr
+    ; CHECK: liveins: $sgpr1
+    ; CHECK-NEXT: {{ $}}
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+    ; CHECK-NEXT: dead $vgpr0 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, [[DEF]]
+    ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+    %0:vgpr_32 = IMPLICIT_DEF
+    %23:vgpr_32 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, %0:vgpr_32
+    %2:vgpr_32 = COPY %0:vgpr_32
+...
+

From 9bddae336227b80ba45be7d7f16ddc4f49fd0a15 Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Mon, 7 Oct 2024 09:13:04 +
Subject: [PATCH 2/2] Keep tests in one file

---
 .../AMDGPU/si-pre-allocate-wwm-regs.mir       | 24 ---
 .../si-pre-allocate-wwm-sgpr-spills.mir       | 21
 2 files changed, 21 insertions(+), 24 deletions(-)
 delete mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir

diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
index f2db299f575f5e..74a221084dce24 100644
--- a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
 # RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2

 ---

@@ -19,8 +20,25 @@ body: |
     ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
     %0:vgpr_32 = IMPLICIT_DEF
     renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
-    %24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
-    %25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+    %1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    %2:vgpr_32 = V_MOV_B32_dpp %1, %0, 323, 12, 15, 0, implicit $exec
     $exec = EXIT_STRICT_
[llvm-branch-commits] [llvm] [CodeGen] LiveIntervalUnions::Array Implement move constructor (PR #111357)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/111357

From dbc51871aab3d4b5d7d64ef78f2df7833359b17f Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Mon, 7 Oct 2024 08:42:24 +
Subject: [PATCH] [CodeGen] LiveIntervalUnions::Array Implement move constructor

---
 llvm/include/llvm/CodeGen/LiveIntervalUnion.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
index 81003455da4241..cc0f2a45bb182c 100644
--- a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
+++ b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
@@ -176,6 +176,13 @@ class LiveIntervalUnion {
     Array() = default;
     ~Array() { clear(); }

+    Array(Array &&Other) : Size(Other.Size), LIUs(Other.LIUs) {
+      Other.Size = 0;
+      Other.LIUs = nullptr;
+    }
+
+    Array(const Array &) = delete;
+
     // Initialize the array to have Size entries.
     // Reuse an existing allocation if the size matches.
     void init(LiveIntervalUnion::Allocator&, unsigned Size);
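The ownership pattern in this patch (steal the pointer on move, null out the source, delete the copy constructor) can be shown on a standalone type. This is a sketch only: `OwningArray`, its `int` element type, and the `calloc`-based `init` are hypothetical and do not reflect how `LiveIntervalUnion::Array` actually allocates.

```cpp
#include <cstdlib>
#include <utility>

// Hypothetical, simplified stand-in for an array that owns a heap buffer
// and releases it in its destructor.
class OwningArray {
  unsigned Size = 0;
  int *Elems = nullptr;

public:
  OwningArray() = default;
  ~OwningArray() { clear(); }

  // Move: take over the buffer and leave the source empty, so the source's
  // destructor does not free the same memory a second time.
  OwningArray(OwningArray &&Other) : Size(Other.Size), Elems(Other.Elems) {
    Other.Size = 0;
    Other.Elems = nullptr;
  }

  // A copy would leave two owners of one buffer, so copying is forbidden.
  OwningArray(const OwningArray &) = delete;
  OwningArray &operator=(const OwningArray &) = delete;

  // Allocate NewSize zero-initialized entries, dropping any old buffer.
  void init(unsigned NewSize) {
    clear();
    Elems = static_cast<int *>(std::calloc(NewSize, sizeof(int)));
    Size = Elems ? NewSize : 0;
  }

  // Release the buffer; safe to call on an empty or moved-from object.
  void clear() {
    std::free(Elems);
    Elems = nullptr;
    Size = 0;
  }

  unsigned size() const { return Size; }
  bool empty() const { return Elems == nullptr; }
};
```

Without the explicit move constructor, a type with a user-declared destructor gets no implicit move, and a defaulted copy would duplicate the raw pointer. That is presumably why the patch both adds the move and deletes the copy.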
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2

arsenm wrote:

Move the -mcpu together with -mtriple

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Add fixme to check the MachineFunctionInfo reserved register information

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
https://github.com/arsenm approved this pull request.

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)
https://github.com/arsenm approved this pull request.

https://github.com/llvm/llvm-project/pull/109939
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Yes

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

optimisan wrote:

Is it for WWM reserved registers?

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/112116
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
@@ -1825,7 +1836,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Verify that the wait is actually needed.
   ScoreBrackets.simplifyWaitcnt(Wait);
-  if (ForceEmitZeroFlag)
+  // When forcing emit, we need to skip non-first terminators of a MBB because
+  // that would break the terminators of the MBB.
+  if (ForceEmitZeroFlag && !checkIfMBBNonFirstTerminator(MI))

arsenm wrote:

You're scanning the terminators for every instruction. Can you adjust the outer iterator logic to skip the terminators in the first place?

https://github.com/llvm/llvm-project/pull/112116
[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first termintors when forcing emit zero flag (PR #112116)
@@ -1600,6 +1600,17 @@ static bool callWaitsOnFunctionReturn(const MachineInstr &MI) {
   return true;
 }

+/// \returns true if \p MI is not the first terminator of its associated MBB.
+static bool checkIfMBBNonFirstTerminator(const MachineInstr &MI) {
+  const auto &MBB = MI.getParent();
+  if (MBB->getFirstTerminator() == MI)
+    return false;
+  for (const auto &I : MBB->terminators())
+    if (&I == &MI)
+      return true;

arsenm wrote:

This iterator logic is clumsy (you're effectively using getFirstTerminator twice)

https://github.com/llvm/llvm-project/pull/112116
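One way to read the review feedback on this PR is that the first terminator should be located once per block, with the outer loop stopping there, rather than rescanning the terminators for every instruction. The standalone model below sketches that idea only. `Instr`, `firstTerminatorIndex`, and `forcedWaitCandidates` are hypothetical names, the block is a plain vector rather than a `MachineBasicBlock`, and it assumes terminators form a contiguous run at the end of the block.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a machine instruction; only the terminator flag
// matters for this sketch.
struct Instr {
  bool IsTerminator = false;
};

// Index of the first terminator, or Block.size() if the block has none.
// Walks backwards over the trailing run of terminators exactly once.
std::size_t firstTerminatorIndex(const std::vector<Instr> &Block) {
  std::size_t I = Block.size();
  while (I > 0 && Block[I - 1].IsTerminator)
    --I;
  return I;
}

// Indices of instructions that may receive a forced wait: every instruction
// up to and including the first terminator. Later terminators are skipped by
// the loop bound instead of by a per-instruction check.
std::vector<std::size_t>
forcedWaitCandidates(const std::vector<Instr> &Block) {
  const std::size_t FirstTerm = firstTerminatorIndex(Block);
  std::vector<std::size_t> Result;
  for (std::size_t I = 0; I < Block.size() && I <= FirstTerm; ++I)
    Result.push_back(I);
  return Result;
}
```

The per-block cost here is a single backward walk over the terminators, whereas the helper under review repeats a scan of the terminator range for each instruction it is asked about.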
[llvm-branch-commits] [llvm] Update correct dependency (PR #109937)
https://github.com/optimisan updated https://github.com/llvm/llvm-project/pull/109937

From ca685074a7f8bfc75e40dd8172ce9e731e991f4d Mon Sep 17 00:00:00 2001
From: Akshat Oke
Date: Tue, 24 Sep 2024 06:35:43 +
Subject: [PATCH] Update correct dependency

Replace unused analysis (VirtRegMap) dependency with the used one (SlotIndexes)
---
 llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
index 4afefa3d9b245c..d8697aa2ffe1cd 100644
--- a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
@@ -95,8 +95,8 @@ char SILowerSGPRSpillsLegacy::ID = 0;
 INITIALIZE_PASS_BEGIN(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
                       "SI lower SGPR spill instructions", false, false)
 INITIALIZE_PASS_DEPENDENCY(LiveIntervalsWrapperPass)
-INITIALIZE_PASS_DEPENDENCY(VirtRegMapWrapperLegacy)
 INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(SlotIndexesWrapperPass)
 INITIALIZE_PASS_END(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
                     "SI lower SGPR spill instructions", false, false)
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Then you need to write manual checks for it. Update_mir_test_checks doesn't currently support the function level properties

https://github.com/llvm/llvm-project/pull/109963
[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)
@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

optimisan wrote:

But we have that already(?)

```
wwmReservedRegs:
  - '$vgpr0'
```

https://github.com/llvm/llvm-project/pull/109963