[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)

2024-10-13 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited 
https://github.com/llvm/llvm-project/pull/111072
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)

2024-10-13 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay approved this pull request.


https://github.com/llvm/llvm-project/pull/112136


[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)

2024-10-13 Thread via llvm-branch-commits

https://github.com/DianQK created 
https://github.com/llvm/llvm-project/pull/112136

Backport #111945.

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)

From f8cab50362e2b4f4523818e844fca0c622339985 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Fri, 11 Oct 2024 08:47:07 -0700
Subject: [PATCH] [ELF] Make shouldAddProvideSym return values consistent when
 demoted to Undefined

Case: `PROVIDE(f1 = bar);` when both `f1` and `bar` are in separate
sections that would be discarded by GC.

Due to `demoteDefined`, `shouldAddProvideSym(f1)` may initially return
false (while `f1` is Defined) and later return true (after `f1` has been
demoted to Undefined).

```
addScriptReferencedSymbolsToSymTable
  shouldAddProvideSym(f1): false
  // the RHS (bar) is not added to `referencedSymbols` and may be GCed
declareSymbols
  shouldAddProvideSym(f1): false
markLive
demoteSymbolsAndComputeIsPreemptible
  // demoted f1 to Undefined
processSymbolAssignments
  addSymbol
    shouldAddProvideSym(f1): true
```

The inconsistency can cause `cmd->expression()` in `addSymbol` to be
evaluated, leading to `symbol not found: bar` errors (since `bar` in the
RHS is not in `referencedSymbols` and is GCed) (#111478).

Fix this by adding a `sym->isUsedInRegularObj` condition, making
`shouldAddProvideSym(f1)` values consistent. In addition, we need a
`sym->exportDynamic` condition to keep provide-shared.s working.

Fixes: ebb326a51fec37b5a47e5702e8ea157cd4f835cd

Pull Request: https://github.com/llvm/llvm-project/pull/111945

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)
---
 lld/ELF/LinkerScript.cpp|  9 +-
 lld/test/ELF/linkerscript/provide-defined.s | 36 +
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 lld/test/ELF/linkerscript/provide-defined.s

diff --git a/lld/ELF/LinkerScript.cpp b/lld/ELF/LinkerScript.cpp
index 055fa21d44ca6e..d95c5573935ec4 100644
--- a/lld/ELF/LinkerScript.cpp
+++ b/lld/ELF/LinkerScript.cpp
@@ -1718,6 +1718,13 @@ void LinkerScript::addScriptReferencedSymbolsToSymTable() {
 }
 
 bool LinkerScript::shouldAddProvideSym(StringRef symName) {
+  // This function is called before and after garbage collection. To prevent
+  // undefined references from the RHS, the result of this function for a
+  // symbol must be the same for each call. We use isUsedInRegularObj to not
+  // change the return value of a demoted symbol. The exportDynamic condition,
+  // while not so accurate, allows PROVIDE to define a symbol referenced by a
+  // DSO.
   Symbol *sym = symtab.find(symName);
-  return sym && !sym->isDefined() && !sym->isCommon();
+  return sym && !sym->isDefined() && !sym->isCommon() &&
+         (sym->isUsedInRegularObj || sym->exportDynamic);
 }
diff --git a/lld/test/ELF/linkerscript/provide-defined.s b/lld/test/ELF/linkerscript/provide-defined.s
new file mode 100644
index 00..1d44bef3d4068d
--- /dev/null
+++ b/lld/test/ELF/linkerscript/provide-defined.s
@@ -0,0 +1,36 @@
+# REQUIRES: x86
+## Test the GC behavior when the PROVIDE symbol is defined by a relocatable file.
+
+# RUN: rm -rf %t && split-file %s %t && cd %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64 a.s -o a.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64 b.s -o b.o
+# RUN: ld.lld -T a.t --gc-sections a.o b.o -o a
+# RUN: llvm-readelf -s a | FileCheck %s
+
+# CHECK:      1: {{.*}}   0 NOTYPE  GLOBAL DEFAULT 1 _start
+# CHECK-NEXT: 2: {{.*}}   0 NOTYPE  GLOBAL DEFAULT 2 f3
+# CHECK-NOT:  {{.}}
+
+#--- a.s
+.global _start, f1, f2, f3, bar
+_start:
+  call f3
+
+.section .text.f1,"ax"; f1:
+.section .text.f2,"ax"; f2: # referenced by another relocatable file
+.section .text.f3,"ax"; f3: # live
+.section .text.bar,"ax"; bar:
+
+.comm comm,4,4
+
+#--- b.s
+  call f2
+
+#--- a.t
+SECTIONS {
+  . = . + SIZEOF_HEADERS;
+  PROVIDE(f1 = bar+1);
+  PROVIDE(f2 = bar+2);
+  PROVIDE(f3 = bar+3);
+  PROVIDE(f4 = comm+4);
+}
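
The inconsistency described in the commit message can be modeled without lld. The sketch below is illustrative only: `ToySymbol`, `Kind`, `shouldAddOld`, `shouldAddNew`, and `demote` are hypothetical names, and it assumes that demotion clears `isUsedInRegularObj`, which is the behavior the patch comment relies on.

```cpp
#include <cassert>

// Toy stand-in for a linker symbol; only the fields the predicate reads.
enum class Kind { Undefined, Defined, Common };

struct ToySymbol {
  Kind kind;
  bool isUsedInRegularObj;
  bool exportDynamic;
};

// Predicate before the patch: depends only on the current symbol kind.
bool shouldAddOld(const ToySymbol *sym) {
  return sym && sym->kind != Kind::Defined && sym->kind != Kind::Common;
}

// Predicate after the patch: additionally requires a reference from a
// regular object or a dynamic export.
bool shouldAddNew(const ToySymbol *sym) {
  return sym && sym->kind != Kind::Defined && sym->kind != Kind::Common &&
         (sym->isUsedInRegularObj || sym->exportDynamic);
}

// Models demotion of a garbage-collected definition. Assumption: the demoted
// symbol becomes Undefined and its isUsedInRegularObj flag is cleared.
ToySymbol demote(ToySymbol sym) {
  sym.kind = Kind::Undefined;
  sym.isUsedInRegularObj = false;
  return sym;
}
```

With a Defined `f1`, `shouldAddOld` answers false before demotion and true afterwards, while `shouldAddNew` answers false both times, so the RHS is never evaluated after `bar` has been discarded.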



[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)

2024-10-13 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lld

Author: DianQK (DianQK)


Changes

Backport #111945.

(cherry picked from commit 1c6688ae3449da9c8fee1e1c12c892223496fb4c)

---
Full diff: https://github.com/llvm/llvm-project/pull/112136.diff


2 Files Affected:

- (modified) lld/ELF/LinkerScript.cpp (+8-1) 
- (added) lld/test/ELF/linkerscript/provide-defined.s (+36) 






https://github.com/llvm/llvm-project/pull/112136


[llvm-branch-commits] [lld] Backport "[ELF] Make shouldAddProvideSym return values consistent when demoted to Undefined" (PR #112136)

2024-10-13 Thread via llvm-branch-commits

https://github.com/DianQK milestoned 
https://github.com/llvm/llvm-project/pull/112136


[llvm-branch-commits] [BOLT] Support perf2bolt-N in the driver (PR #111072)

2024-10-13 Thread Maksim Panchenko via llvm-branch-commits

https://github.com/maksfb approved this pull request.

LGTM. Please add a description of the problem this PR fixes and link any related issue(s).

https://github.com/llvm/llvm-project/pull/111072


[llvm-branch-commits] [llvm] [NewPM][CodeGen] Port LiveRegMatrix to NPM (PR #109938)

2024-10-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/109938

From d4cc049c53df27919103625417730595fc2183d7 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Tue, 24 Sep 2024 09:07:04 +
Subject: [PATCH 1/4] [NewPM][CodeGen] Port LiveRegMatrix to NPM

---
 llvm/include/llvm/CodeGen/LiveRegMatrix.h | 50 ---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/LiveRegMatrix.cpp| 38 ++
 llvm/lib/CodeGen/RegAllocBasic.cpp|  8 +--
 llvm/lib/CodeGen/RegAllocGreedy.cpp   |  8 +--
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 llvm/lib/Target/AMDGPU/GCNNSAReassign.cpp |  6 +--
 .../Target/AMDGPU/SIPreAllocateWWMRegs.cpp|  6 +--
 9 files changed, 88 insertions(+), 35 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/LiveRegMatrix.h b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
index 2b32308c7c075e..c024ca9c1dc38d 100644
--- a/llvm/include/llvm/CodeGen/LiveRegMatrix.h
+++ b/llvm/include/llvm/CodeGen/LiveRegMatrix.h
@@ -37,7 +37,9 @@ class MachineFunction;
 class TargetRegisterInfo;
 class VirtRegMap;
 
-class LiveRegMatrix : public MachineFunctionPass {
+class LiveRegMatrix {
+  friend class LiveRegMatrixWrapperPass;
+  friend class LiveRegMatrixAnalysis;
   const TargetRegisterInfo *TRI = nullptr;
   LiveIntervals *LIS = nullptr;
   VirtRegMap *VRM = nullptr;
@@ -57,15 +59,21 @@ class LiveRegMatrix : public MachineFunctionPass {
   unsigned RegMaskVirtReg = 0;
   BitVector RegMaskUsable;
 
-  // MachineFunctionPass boilerplate.
-  void getAnalysisUsage(AnalysisUsage &) const override;
-  bool runOnMachineFunction(MachineFunction &) override;
-  void releaseMemory() override;
+  LiveRegMatrix() = default;
+  void releaseMemory();
 
 public:
-  static char ID;
-
-  LiveRegMatrix();
+  LiveRegMatrix(LiveRegMatrix &&Other)
+  : TRI(Other.TRI), LIS(Other.LIS), VRM(Other.VRM), UserTag(Other.UserTag),
+Matrix(std::move(Other.Matrix)), Queries(std::move(Other.Queries)),
+RegMaskTag(Other.RegMaskTag), RegMaskVirtReg(Other.RegMaskVirtReg),
+RegMaskUsable(std::move(Other.RegMaskUsable)) {
+Other.TRI = nullptr;
+Other.LIS = nullptr;
+Other.VRM = nullptr;
+  }
+
+  void init(MachineFunction &MF, LiveIntervals *LIS, VirtRegMap *VRM);
 
   //===--------------------------------------------------------------------===//
   // High-level interface.
@@ -159,6 +167,32 @@ class LiveRegMatrix : public MachineFunctionPass {
   Register getOneVReg(unsigned PhysReg) const;
 };
 
+class LiveRegMatrixWrapperPass : public MachineFunctionPass {
+  LiveRegMatrix LRM;
+
+public:
+  static char ID;
+
+  LiveRegMatrixWrapperPass() : MachineFunctionPass(ID) {}
+
+  LiveRegMatrix &getLRM() { return LRM; }
+  const LiveRegMatrix &getLRM() const { return LRM; }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override;
+  bool runOnMachineFunction(MachineFunction &MF) override;
+  void releaseMemory() override;
+};
+
+class LiveRegMatrixAnalysis : public AnalysisInfoMixin<LiveRegMatrixAnalysis> {
+  friend AnalysisInfoMixin<LiveRegMatrixAnalysis>;
+  static AnalysisKey Key;
+
+public:
+  using Result = LiveRegMatrix;
+
+  LiveRegMatrix run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_LIVEREGMATRIX_H
diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h
index d89a5538b46975..3fee8c40a6607e 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -156,7 +156,7 @@ void initializeLiveDebugValuesPass(PassRegistry &);
 void initializeLiveDebugVariablesPass(PassRegistry &);
 void initializeLiveIntervalsWrapperPassPass(PassRegistry &);
 void initializeLiveRangeShrinkPass(PassRegistry &);
-void initializeLiveRegMatrixPass(PassRegistry &);
+void initializeLiveRegMatrixWrapperPassPass(PassRegistry &);
 void initializeLiveStacksPass(PassRegistry &);
 void initializeLiveVariablesWrapperPassPass(PassRegistry &);
 void initializeLoadStoreOptPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def
index bdc56ca03f392a..4497c1fce0db69 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -97,6 +97,7 @@ LOOP_PASS("loop-term-fold", LoopTermFoldPass())
 // preferably fix the scavenger to not depend on them).
 MACHINE_FUNCTION_ANALYSIS("live-intervals", LiveIntervalsAnalysis())
 MACHINE_FUNCTION_ANALYSIS("live-vars", LiveVariablesAnalysis())
+MACHINE_FUNCTION_ANALYSIS("live-reg-matrix", LiveRegMatrixAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-block-freq", MachineBlockFrequencyAnalysis())
 MACHINE_FUNCTION_ANALYSIS("machine-branch-prob",
   MachineBranchProbabilityAnalysis())
@@ -122,8 +123,7 @@ MACHINE_FUNCTION_ANALYSIS("virtregmap", VirtRegMapAnalysis())
 // MachineRegionInf
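
The header change above splits one class into a plain result object, a legacy wrapper pass, and a new-pass-manager analysis. The sketch below shows that shape with toy types and no LLVM dependency; `ToyFunction`, `ToyMatrix`, `ToyMatrixWrapperPass`, and `ToyMatrixAnalysis` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the unit being analyzed.
struct ToyFunction {
  std::size_t NumRegs;
};

// Plain result object: no pass-manager base class, movable so that an
// analysis can return it by value.
class ToyMatrix {
  std::vector<int> Rows;

public:
  ToyMatrix() = default;
  ToyMatrix(ToyMatrix &&) = default;

  void init(const ToyFunction &F) { Rows.assign(F.NumRegs, 0); }
  std::size_t size() const { return Rows.size(); }
};

// Legacy-style wrapper: owns the result and hands it out by reference.
class ToyMatrixWrapperPass {
  ToyMatrix M;

public:
  bool runOnFunction(const ToyFunction &F) {
    M.init(F);
    return false; // An analysis does not modify the function.
  }
  ToyMatrix &getMatrix() { return M; }
};

// New-style analysis: computes the result and returns it by value.
class ToyMatrixAnalysis {
public:
  using Result = ToyMatrix;

  Result run(const ToyFunction &F) {
    ToyMatrix M;
    M.init(F);
    return M;
  }
};
```

Both entry points share `ToyMatrix::init`, so the computation lives in one place and only the ownership model differs between the two pass managers.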

[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)

2024-10-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/109939

From af1a1f15867edef93e69c43037a19ab69e8ec2e3 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Tue, 24 Sep 2024 11:41:18 +
Subject: [PATCH 1/2] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM

---
 llvm/lib/Target/AMDGPU/AMDGPU.h   |  6 +-
 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def |  1 +
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  7 ++-
 .../Target/AMDGPU/SIPreAllocateWWMRegs.cpp| 60 ---
 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h | 25 
 .../AMDGPU/si-pre-allocate-wwm-regs.mir   | 20 +++
 6 files changed, 92 insertions(+), 27 deletions(-)
 create mode 100644 llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.h

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 342d55e828bca5..95d0ad0f9dc96a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -49,7 +49,7 @@ FunctionPass *createSIFixSGPRCopiesLegacyPass();
 FunctionPass *createLowerWWMCopiesPass();
 FunctionPass *createSIMemoryLegalizerPass();
 FunctionPass *createSIInsertWaitcntsPass();
-FunctionPass *createSIPreAllocateWWMRegsPass();
+FunctionPass *createSIPreAllocateWWMRegsLegacyPass();
 FunctionPass *createSIFormMemoryClausesPass();
 
 FunctionPass *createSIPostRABundlerPass();
@@ -212,8 +212,8 @@ extern char &SILateBranchLoweringPassID;
 void initializeSIOptimizeExecMaskingPass(PassRegistry &);
 extern char &SIOptimizeExecMaskingID;
 
-void initializeSIPreAllocateWWMRegsPass(PassRegistry &);
-extern char &SIPreAllocateWWMRegsID;
+void initializeSIPreAllocateWWMRegsLegacyPass(PassRegistry &);
+extern char &SIPreAllocateWWMRegsLegacyID;
 
 void initializeAMDGPUImageIntrinsicOptimizerPass(PassRegistry &);
 extern char &AMDGPUImageIntrinsicOptimizerID;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 0ebf34c901c142..174a90f0aa419d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -102,5 +102,6 @@ MACHINE_FUNCTION_PASS("gcn-dpp-combine", GCNDPPCombinePass())
 MACHINE_FUNCTION_PASS("si-load-store-opt", SILoadStoreOptimizerPass())
 MACHINE_FUNCTION_PASS("si-lower-sgpr-spills", SILowerSGPRSpillsPass())
 MACHINE_FUNCTION_PASS("si-peephole-sdwa", SIPeepholeSDWAPass())
+MACHINE_FUNCTION_PASS("si-pre-allocate-wwm-regs", SIPreAllocateWWMRegsPass())
 MACHINE_FUNCTION_PASS("si-shrink-instructions", SIShrinkInstructionsPass())
 #undef MACHINE_FUNCTION_PASS
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 23ee0c3e896eb3..f367b5fbea45af 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -41,6 +41,7 @@
 #include "SIMachineFunctionInfo.h"
 #include "SIMachineScheduler.h"
 #include "SIPeepholeSDWA.h"
+#include "SIPreAllocateWWMRegs.h"
 #include "SIShrinkInstructions.h"
 #include "TargetInfo/AMDGPUTargetInfo.h"
 #include "Utils/AMDGPUBaseInfo.h"
@@ -508,7 +509,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
   initializeSILateBranchLoweringPass(*PR);
   initializeSIMemoryLegalizerPass(*PR);
   initializeSIOptimizeExecMaskingPass(*PR);
-  initializeSIPreAllocateWWMRegsPass(*PR);
+  initializeSIPreAllocateWWMRegsLegacyPass(*PR);
   initializeSIFormMemoryClausesPass(*PR);
   initializeSIPostRABundlerPass(*PR);
   initializeGCNCreateVOPDPass(*PR);
@@ -1506,7 +1507,7 @@ bool GCNPassConfig::addRegAssignAndRewriteFast() {
   addPass(&SILowerSGPRSpillsLegacyID);
 
   // To Allocate wwm registers used in whole quad mode operations (for shaders).
-  addPass(&SIPreAllocateWWMRegsID);
+  addPass(&SIPreAllocateWWMRegsLegacyID);
 
   // For allocating other wwm register operands.
   addPass(createWWMRegAllocPass(false));
@@ -1543,7 +1544,7 @@ bool GCNPassConfig::addRegAssignAndRewriteOptimized() {
   addPass(&SILowerSGPRSpillsLegacyID);
 
   // To Allocate wwm registers used in whole quad mode operations (for shaders).
-  addPass(&SIPreAllocateWWMRegsID);
+  addPass(&SIPreAllocateWWMRegsLegacyID);
 
   // For allocating other whole wave mode registers.
   addPass(createWWMRegAllocPass(true));
diff --git a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
index 07303e2aa726c5..f9109c01c8085b 100644
--- a/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
+++ b/llvm/lib/Target/AMDGPU/SIPreAllocateWWMRegs.cpp
@@ -11,6 +11,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "SIPreAllocateWWMRegs.h"
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
@@ -34,7 +35,7 @@ static cl::opt
 
 namespace {
 
-class SIPreAllocateWWMRegs : public MachineFunctionPass {
+class SIPreAllocateWWMRegs {
 private:
   const SIInstrInfo *TII;
   const SIRegisterInfo *TRI;

[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/109963

From 2cefaf6d479b6c7ae6bc8a2267f8e4fee274923c Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Wed, 25 Sep 2024 11:21:04 +
Subject: [PATCH 1/2] [AMDGPU] Add tests for SIPreAllocateWWMRegs

---
 .../AMDGPU/si-pre-allocate-wwm-regs.mir   | 26 +++
 .../si-pre-allocate-wwm-sgpr-spills.mir   | 21 +++
 2 files changed, 47 insertions(+)
 create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
 create mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir

diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
new file mode 100644
index 00..f2db299f575f5e
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -0,0 +1,26 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_regs_strict
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $sgpr1
+    ; CHECK-LABEL: name: pre_allocate_wwm_regs_strict
+    ; CHECK: liveins: $sgpr1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+    ; CHECK-NEXT: renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+    ; CHECK-NEXT: $vgpr0 = V_MOV_B32_e32 0, implicit $exec
+    ; CHECK-NEXT: dead $vgpr0 = V_MOV_B32_dpp $vgpr0, [[DEF]], 323, 12, 15, 0, implicit $exec
+    ; CHECK-NEXT: $exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+    ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+    %0:vgpr_32 = IMPLICIT_DEF
+    renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
+    %24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    %25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+    $exec = EXIT_STRICT_WWM killed renamable $sgpr4_sgpr5
+    %2:vgpr_32 = COPY %0:vgpr_32
+...
diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
new file mode 100644
index 00..f0efe74878d831
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir
@@ -0,0 +1,21 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s
+
+---
+
+name: pre_allocate_wwm_spill_to_vgpr
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $sgpr1
+    ; CHECK-LABEL: name: pre_allocate_wwm_spill_to_vgpr
+    ; CHECK: liveins: $sgpr1
+    ; CHECK-NEXT: {{  $}}
+    ; CHECK-NEXT: [[DEF:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+    ; CHECK-NEXT: dead $vgpr0 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, [[DEF]]
+    ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
+    %0:vgpr_32 = IMPLICIT_DEF
+    %23:vgpr_32 = SI_SPILL_S32_TO_VGPR $sgpr1, 0, %0:vgpr_32
+    %2:vgpr_32 = COPY %0:vgpr_32
+...
+

From 9bddae336227b80ba45be7d7f16ddc4f49fd0a15 Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Mon, 7 Oct 2024 09:13:04 +
Subject: [PATCH 2/2] Keep tests in one file

---
 .../AMDGPU/si-pre-allocate-wwm-regs.mir   | 24 ---
 .../si-pre-allocate-wwm-sgpr-spills.mir   | 21 
 2 files changed, 21 insertions(+), 24 deletions(-)
 delete mode 100644 llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-sgpr-spills.mir

diff --git a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
index f2db299f575f5e..74a221084dce24 100644
--- a/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
+++ b/llvm/test/CodeGen/AMDGPU/si-pre-allocate-wwm-regs.mir
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
 # RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
 
 ---
 
@@ -19,8 +20,25 @@ body: |
     ; CHECK-NEXT: dead [[COPY:%[0-9]+]]:vgpr_32 = COPY [[DEF]]
     %0:vgpr_32 = IMPLICIT_DEF
     renamable $sgpr4_sgpr5 = ENTER_STRICT_WWM -1, implicit-def $exec, implicit-def $scc, implicit $exec
-    %24:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
-    %25:vgpr_32 = V_MOV_B32_dpp %24:vgpr_32(tied-def 0), %0:vgpr_32, 323, 12, 15, 0, implicit $exec
+    %1:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
+    %2:vgpr_32 = V_MOV_B32_dpp %1, %0, 323, 12, 15, 0, implicit $exec
     $exec = EXIT_STRICT_

[llvm-branch-commits] [llvm] [CodeGen] LiveIntervalUnions::Array Implement move constructor (PR #111357)

2024-10-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/111357

From dbc51871aab3d4b5d7d64ef78f2df7833359b17f Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Mon, 7 Oct 2024 08:42:24 +
Subject: [PATCH] [CodeGen] LiveIntervalUnions::Array  Implement move
 constructor

---
 llvm/include/llvm/CodeGen/LiveIntervalUnion.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
index 81003455da4241..cc0f2a45bb182c 100644
--- a/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
+++ b/llvm/include/llvm/CodeGen/LiveIntervalUnion.h
@@ -176,6 +176,13 @@ class LiveIntervalUnion {
     Array() = default;
     ~Array() { clear(); }
 
+    Array(Array &&Other) : Size(Other.Size), LIUs(Other.LIUs) {
+      Other.Size = 0;
+      Other.LIUs = nullptr;
+    }
+
+    Array(const Array &) = delete;
+
     // Initialize the array to have Size entries.
     // Reuse an existing allocation if the size matches.
     void init(LiveIntervalUnion::Allocator&, unsigned Size);
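
The move constructor in this patch follows the usual pattern for a type that frees a raw allocation in its destructor: take over the pointer, then reset the source so its destructor has nothing left to release. A minimal sketch of that pattern is below; `OwningArray` and its members are hypothetical and only mirror the shape of the change, not the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical owning array that releases its buffer in the destructor.
class OwningArray {
  std::size_t Size = 0;
  int *Data = nullptr;

public:
  OwningArray() = default;
  ~OwningArray() { clear(); }

  // Steal the buffer and leave the source empty, so that the source's
  // destructor does not free memory now owned by *this.
  OwningArray(OwningArray &&Other) : Size(Other.Size), Data(Other.Data) {
    Other.Size = 0;
    Other.Data = nullptr;
  }

  // Copying would create two owners of one buffer, so it is disabled.
  OwningArray(const OwningArray &) = delete;

  // Allocate N zero-initialized entries, releasing any previous buffer.
  void init(std::size_t N) {
    clear();
    Data = static_cast<int *>(std::calloc(N, sizeof(int)));
    Size = Data ? N : 0;
  }

  // Free the buffer; freeing a null pointer is a no-op.
  void clear() {
    std::free(Data);
    Data = nullptr;
    Size = 0;
  }

  std::size_t size() const { return Size; }
};
```

After `OwningArray B(std::move(A));` the buffer belongs to `B` alone and `A.size()` is 0, which is what makes it safe for an enclosing class to be returned by value.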



[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2

arsenm wrote:

Move the -mcpu together with -mtriple 

https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Add a FIXME to check the MachineFunctionInfo reserved register information.

https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [NewPM][AMDGPU] Port SIPreAllocateWWMRegs to NPM (PR #109939)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/109939


[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Yes

https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Akshat Oke via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

optimisan wrote:

Is it for WWM reserved registers? 

https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first terminators when forcing emit zero flag (PR #112116)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm edited 
https://github.com/llvm/llvm-project/pull/112116


[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first terminators when forcing emit zero flag (PR #112116)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -1825,7 +1836,9 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Verify that the wait is actually needed.
   ScoreBrackets.simplifyWaitcnt(Wait);
 
-  if (ForceEmitZeroFlag)
+  // When forcing emit, we need to skip non-first terminators of a MBB because
+  // that would break the terminators of the MBB.
+  if (ForceEmitZeroFlag && !checkIfMBBNonFirstTerminator(MI))

arsenm wrote:

You're scanning the terminators for every instruction. Can you adjust the outer iterator logic to skip the terminators in the first place?

https://github.com/llvm/llvm-project/pull/112116


[llvm-branch-commits] [llvm] [AMDGPU] Skip non-first terminators when forcing emit zero flag (PR #112116)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -1600,6 +1600,17 @@ static bool callWaitsOnFunctionReturn(const MachineInstr &MI) {
   return true;
 }
 
+/// \returns true if \p MI is not the first terminator of its associated MBB.
+static bool checkIfMBBNonFirstTerminator(const MachineInstr &MI) {
+  const auto &MBB = MI.getParent();
+  if (MBB->getFirstTerminator() == MI)
+    return false;
+  for (const auto &I : MBB->terminators())
+    if (&I == &MI)
+      return true;

arsenm wrote:

This iterator logic is clumsy (you're effectively using getFirstTerminator twice).

https://github.com/llvm/llvm-project/pull/112116
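
One way to read the two review comments above is to locate the first terminator once per block and let the outer loop stop there, instead of rescanning the terminators for every instruction. The sketch below illustrates that idea on a toy block model; `ToyInstr`, `firstTerminatorIndex`, and `forcedEmitPoints` are hypothetical names, and it assumes terminators form a contiguous suffix of the block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction: only records whether it is a terminator.
struct ToyInstr {
  bool IsTerminator;
};

// Index of the first terminator, or Block.size() if the block has none.
// Assumes terminators appear only as a contiguous run at the end.
std::size_t firstTerminatorIndex(const std::vector<ToyInstr> &Block) {
  std::size_t I = Block.size();
  while (I > 0 && Block[I - 1].IsTerminator)
    --I;
  return I;
}

// Indices before which a forced wait may be inserted: every non-terminator
// plus the first terminator. Later terminators are never visited.
std::vector<std::size_t> forcedEmitPoints(const std::vector<ToyInstr> &Block) {
  std::vector<std::size_t> Points;
  const std::size_t FirstTerm = firstTerminatorIndex(Block);
  for (std::size_t I = 0; I < Block.size() && I <= FirstTerm; ++I)
    Points.push_back(I);
  return Points;
}
```

The terminator scan runs once per block, and the per-instruction check disappears because the loop bound already excludes the non-first terminators.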


[llvm-branch-commits] [llvm] Update correct dependency (PR #109937)

2024-10-13 Thread Akshat Oke via llvm-branch-commits

https://github.com/optimisan updated 
https://github.com/llvm/llvm-project/pull/109937

From ca685074a7f8bfc75e40dd8172ce9e731e991f4d Mon Sep 17 00:00:00 2001
From: Akshat Oke 
Date: Tue, 24 Sep 2024 06:35:43 +
Subject: [PATCH] Update correct dependency

Replace unused analysis (VirtRegMap) dependency with the used one (SlotIndexes)
---
 llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
index 4afefa3d9b245c..d8697aa2ffe1cd 100644
--- a/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
@@ -95,8 +95,8 @@ char SILowerSGPRSpillsLegacy::ID = 0;
 INITIALIZE_PASS_BEGIN(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
                       "SI lower SGPR spill instructions", false, false)
 INITIALIZE_PASS_DEPENDENCY(LiveIntervalsWrapperPass)
-INITIALIZE_PASS_DEPENDENCY(VirtRegMapWrapperLegacy)
 INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(SlotIndexesWrapperPass)
 INITIALIZE_PASS_END(SILowerSGPRSpillsLegacy, DEBUG_TYPE,
                     "SI lower SGPR spill instructions", false, false)
 



[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

arsenm wrote:

Then you need to write manual checks for it. update_mir_test_checks.py doesn't currently support function-level properties.

https://github.com/llvm/llvm-project/pull/109963


[llvm-branch-commits] [llvm] [AMDGPU] Add tests for SIPreAllocateWWMRegs (PR #109963)

2024-10-13 Thread Akshat Oke via llvm-branch-commits


@@ -0,0 +1,44 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s  | FileCheck %s
+# RUN: llc -mtriple=amdgcn -verify-machineinstrs -amdgpu-prealloc-sgpr-spill-vgprs -run-pass=si-pre-allocate-wwm-regs -o - -mcpu=tahiti %s | FileCheck %s --check-prefix=CHECK2
+
+---

optimisan wrote:

But we have that already(?)
```
wwmReservedRegs:
  - '$vgpr0'
```

https://github.com/llvm/llvm-project/pull/109963