from:"Christudasan Devadasan via llvm\-branch\-commits"

[llvm-branch-commits] [clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)

2024-06-13 Thread Christudasan Devadasan via llvm-branch-commits



@@ -2331,40 +2337,74 @@ static Value *upgradeARMIntrinsicCall(StringRef Name, 
CallBase *CI, Function *F,
   llvm_unreachable("Unknown function for ARM CallBase upgrade.");
 }
 
+// These are expected to have have the arguments:

cdevadas wrote:

```suggestion
// These are expected to have the arguments:
```

https://github.com/llvm/llvm-project/pull/95396
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-20 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#96163** https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#96162** https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#96161** https://app.graphite.dev/github/pr/llvm/llvm-project/96161?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-20 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#96163** https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#96162** https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#96161** https://app.graphite.dev/github/pr/llvm/llvm-project/96161?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-20 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-20 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-20 Thread Christudasan Devadasan via llvm-branch-commits

cdevadas wrote:

> This looks like it is affecting codegen even when xnack is disabled? That 
> should not happen.

It shouldn't. I put the xnack replay subtarget check before using *_ec 
equivalents. See the code here: 
https://github.com/llvm/llvm-project/commit/65eb44327cf32a83dbbf13eb70f9d8c03f3efaef#diff-35f4d1b6c4c17815f6989f86abbac2e606ca760f9d93f501ff503449048bf760R1735

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-21 Thread Christudasan Devadasan via llvm-branch-commits



@@ -867,13 +867,104 @@ def SMRDBufferImm   : ComplexPattern;
 def SMRDBufferImm32 : ComplexPattern;
 def SMRDBufferSgprImm : ComplexPattern;
 
+class SMRDAlignedLoadPat : PatFrag <(ops node:$ptr), (Op 
node:$ptr), [{
+  // Returns true if it is a naturally aligned multi-dword load.
+  LoadSDNode *Ld = cast(N);
+  unsigned Size = Ld->getMemoryVT().getStoreSize();
+  return (Size <= 4) || (Ld->getAlign().value() >= PowerOf2Ceil(Size));

cdevadas wrote:

For the DWORDX3 case. Even though the size is 12 Bytes, its natural alignment 
is 16 Bytes. 

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-21 Thread Christudasan Devadasan via llvm-branch-commits



@@ -886,26 +977,17 @@ multiclass SMRD_Pattern  {
   def : GCNPat <
 (smrd_load (SMRDSgpr i64:$sbase, i32:$soffset)),
 (vt (!cast(Instr#"_SGPR") $sbase, $soffset, 0))> {
-let OtherPredicates = [isNotGFX9Plus];
-  }
-  def : GCNPat <
-(smrd_load (SMRDSgpr i64:$sbase, i32:$soffset)),
-(vt (!cast(Instr#"_SGPR_IMM") $sbase, $soffset, 0, 0))> {
-let OtherPredicates = [isGFX9Plus];
+let OtherPredicates = [isGFX6GFX7];
   }
 
-  // 4. SGPR+IMM offset
+  // 4. No offset
   def : GCNPat <
-(smrd_load (SMRDSgprImm i64:$sbase, i32:$soffset, i32:$offset)),
-(vt (!cast(Instr#"_SGPR_IMM") $sbase, $soffset, $offset, 0))> {
-let OtherPredicates = [isGFX9Plus];
+(vt (smrd_load (i64 SReg_64:$sbase))),
+(vt (!cast(Instr#"_IMM") i64:$sbase, 0, 0))> {
+let OtherPredicates = [isGFX6GFX7];
   }
 
-  // 5. No offset
-  def : GCNPat <
-(vt (smrd_load (i64 SReg_64:$sbase))),
-(vt (!cast(Instr#"_IMM") i64:$sbase, 0, 0))
-  >;
+  defm : SMRD_Align_Pattern;

cdevadas wrote:

I was using the predicate for gfx8+ which has the xnack replay support enabled. 
I should instead check if the xnack is on. Will change it.

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-22 Thread Christudasan Devadasan via llvm-branch-commits



@@ -867,13 +867,104 @@ def SMRDBufferImm   : ComplexPattern;
 def SMRDBufferImm32 : ComplexPattern;
 def SMRDBufferSgprImm : ComplexPattern;
 
+class SMRDAlignedLoadPat : PatFrag <(ops node:$ptr), (Op 
node:$ptr), [{
+  // Returns true if it is a naturally aligned multi-dword load.
+  LoadSDNode *Ld = cast(N);
+  unsigned Size = Ld->getMemoryVT().getStoreSize();
+  return (Size <= 4) || (Ld->getAlign().value() >= PowerOf2Ceil(Size));

cdevadas wrote:

Isn't it 12 >= 16 or 16 >=16?
The PowerOf2Ceil intends to catch the first case where the specified Align is 
smaller than its natural alignment.

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-23 Thread Christudasan Devadasan via llvm-branch-commits



@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const 
CombineInfo &CI,
   return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM;
 }
   case S_LOAD_IMM:
-switch (Width) {
-default:
-  return 0;
-case 2:
-  return AMDGPU::S_LOAD_DWORDX2_IMM;
-case 3:
-  return AMDGPU::S_LOAD_DWORDX3_IMM;
-case 4:
-  return AMDGPU::S_LOAD_DWORDX4_IMM;
-case 8:
-  return AMDGPU::S_LOAD_DWORDX8_IMM;
+// For targets that support XNACK replay, use the constrained load opcode.
+if (STI && STI->hasXnackReplay()) {
+  switch (Width) {

cdevadas wrote:

I guess currently the merged load is always under-aligned. While combining the 
MMOs, currently the alignment is picked from the first MMO and that'd 
definitely be smaller than the natural align requirement for the new load. 
That's one reason I conservatively want to emit _ec equivalent when XNACK is 
enabled.

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-24 Thread Christudasan Devadasan via llvm-branch-commits



@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const 
CombineInfo &CI,
   return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM;
 }
   case S_LOAD_IMM:
-switch (Width) {
-default:
-  return 0;
-case 2:
-  return AMDGPU::S_LOAD_DWORDX2_IMM;
-case 3:
-  return AMDGPU::S_LOAD_DWORDX3_IMM;
-case 4:
-  return AMDGPU::S_LOAD_DWORDX4_IMM;
-case 8:
-  return AMDGPU::S_LOAD_DWORDX8_IMM;
+// For targets that support XNACK replay, use the constrained load opcode.
+if (STI && STI->hasXnackReplay()) {
+  switch (Width) {

cdevadas wrote:

> > currently the alignment is picked from the first MMO and that'd definitely 
> > be smaller than the natural align requirement for the new load
> 
> You don't know that - the alignment in the first MMO will be whatever 
> alignment the compiler could deduce, which could be large, e.g. if the 
> pointer used for the first load was known to have a large alignment.

Are you suggesting to check the alignment in the first MMO and see if it is 
still the preferred alignment for the merge-load? 
Use the _ec if the alignment is found to be smaller than the expected value.

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96934)

2024-06-27 Thread Christudasan Devadasan via llvm-branch-commits



@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) {
   return Changed;
 }
 
+static bool shouldOptimizeForType(Type *Ty) {
+  switch (Ty->getTypeID()) {
+  case Type::FloatTyID:
+  case Type::DoubleTyID:
+return true;
+  case Type::IntegerTyID: {
+if (Ty->getIntegerBitWidth() == 32 || Ty->getIntegerBitWidth() == 64)

cdevadas wrote:

Get Ty->getIntegerBitWidth() just once outside?

https://github.com/llvm/llvm-project/pull/96934
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-03 Thread Christudasan Devadasan via llvm-branch-commits



@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr 
addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:   ; %bb.0:
-; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24

cdevadas wrote:

Earlier I wrongly used the dword size (Width) in the the alignment check here 
as Jay pointed out. Now, I fixed it to use Byte size while comparing it with 
the existing alignment of the first load.
https://github.com/llvm/llvm-project/pull/96162/commits/e7e6cbc4abd476a038fd7836e5078565e73d1fe9#diff-35f4d1b6c4c17815f6989f86abbac2e606ca760f9d93f501ff503449048bf760R1730

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-03 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-03 Thread Christudasan Devadasan via llvm-branch-commits



@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr 
addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:   ; %bb.0:
-; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24

cdevadas wrote:

Unfortunately, that's not happening. The IR load-store-vectorizer doesn't 
combine the two loads.
I still see the two loads after the IR vectorizer and they become two loads in 
the selected code. Can this happen because the alignment for the two loads 
differ and the IR vectorizer safely ignores them?

*** IR Dump before Selection ***
define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) #0 {
  %local_atomic_fadd_v2bf16_noret.kernarg.segment = call nonnull align 16 
dereferenceable(44) ptr addrspace(4) @llvm.amdgcn.kernarg.segment.ptr()
  %ptr.kernarg.offset = getelementptr inbounds i8, ptr addrspace(4) 
%local_atomic_fadd_v2bf16_noret.kernarg.segment, i64 36, !amdgpu.uniform !0
  **%ptr.load = load ptr addrspace(3), ptr addrspace(4) %ptr.kernarg.offset**, 
align 4, !invariant.load !0
  %data.kernarg.offset = getelementptr inbounds i8, ptr addrspace(4) 
%local_atomic_fadd_v2bf16_noret.kernarg.segment, i64 40, !amdgpu.uniform !0
  **%data.load = load <2 x i16>, ptr addrspace(4) %data.kernarg.offset**, align 
8, !invariant.load !0
  %ret = call <2 x i16> @llvm.amdgcn.ds.fadd.v2bf16(ptr addrspace(3) %ptr.load, 
<2 x i16> %data.load)
  ret void
}
# *** IR Dump After selection ***:
# Machine code for function local_atomic_fadd_v2bf16_noret: IsSSA, 
TracksLiveness
Function Live Ins: $sgpr0_sgpr1 in %1

bb.0 (%ir-block.0):
  liveins: $sgpr0_sgpr1
  %1:sgpr_64(p4) = COPY $sgpr0_sgpr1
  %3:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %1:sgpr_64(p4), 36, 0 :: 
(dereferenceable invariant load (s32) from %ir.ptr.kernarg.offset, addrspace 4)
  %4:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %1:sgpr_64(p4), 40, 0 :: 
(dereferenceable invariant load (s32) from %ir.data.kernarg.offset, align 8, 
addrspace 4)
  %5:vgpr_32 = COPY %3:sreg_32_xm0_xexec
  %6:vgpr_32 = COPY %4:sreg_32_xm0_xexec
  DS_PK_ADD_BF16 killed %5:vgpr_32, killed %6:vgpr_32, 0, 0, implicit $m0, 
implicit $exec
  S_ENDPGM 0


https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-03 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-03 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-04 Thread Christudasan Devadasan via llvm-branch-commits



@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr 
addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) 
%ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:   ; %bb.0:
-; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24

cdevadas wrote:

Opened https://github.com/llvm/llvm-project/issues/97715.

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-10 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

Ping

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-07-10 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

Ping

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-10 Thread Christudasan Devadasan via llvm-branch-commits



@@ -658,17 +658,17 @@ define amdgpu_kernel void 
@image_bvh_intersect_ray_nsa_reassign(ptr %p_node_ptr,
 ;
 ; GFX1013-LABEL: image_bvh_intersect_ray_nsa_reassign:
 ; GFX1013:   ; %bb.0:
-; GFX1013-NEXT:s_load_dwordx8 s[0:7], s[0:1], 0x24
+; GFX1013-NEXT:s_load_dwordx8 s[4:11], s[0:1], 0x24

cdevadas wrote:

> I guess this code changes because xnack is enabled by default for GFX10.1?
Yes.
> Is there anything we could do to add known alignment info here, to avoid the 
> code pessimization?
I'm not sure what can be done for it.



https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-10 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-17 Thread Christudasan Devadasan via llvm-branch-commits

cdevadas wrote:

> I still think it is terribly surprising all of the test diff shows up in this 
> commit, and not the selection case

Because the selection support is done in the next PR of the review stack, 
https://github.com/llvm/llvm-project/pull/96162. This patch takes care of 
choosing the right opcode while merging the loads.

https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-07-22 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

The latest patch optimizes the PatFrag and the patterns written further by 
using OtherPredicates. The lit test changes in the latest patch are a missed 
optimization I incorrectly introduced earlier in this PR for GFX7. It is now 
fixed and matches the default behavior with the current compiler.

https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-07-23 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Jul 23, 4:02 AM EDT**: @cdevadas started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96162).


https://github.com/llvm/llvm-project/pull/96162
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-07-23 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Jul 23, 4:02 AM EDT**: @cdevadas started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96163).


https://github.com/llvm/llvm-project/pull/96163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-01 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/101619

Use the constrained buffer load opcodes while combining under-aligned
load for XNACK enabled subtargets.

>From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Tue, 30 Jul 2024 14:46:36 +0530
Subject: [PATCH] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer
 load variants

Use the constrained buffer load opcodes while combining under-aligned
load for XNACK enabled subtargets.
---
 .../Target/AMDGPU/SILoadStoreOptimizer.cpp|  75 ++-
 .../AMDGPU/llvm.amdgcn.s.buffer.load.ll   |  56 +-
 .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 --
 3 files changed, 613 insertions(+), 82 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp 
b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
index ae537b194f50c..7553c370f694f 100644
--- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
@@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 1;
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX2:
@@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 2;
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX3_IMM:
   case AMDGPU::S_LOAD_DWORDX3_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX3:
@@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 3;
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX4_IMM:
   case AMDGPU::S_LOAD_DWORDX4_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX4:
@@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 4;
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX8_IMM:
   case AMDGPU::S_LOAD_DWORDX8_IMM_ec:
 return 8;
@@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return S_BUFFER_LOAD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return S_BUFFER_LOAD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo 
&TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-01 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/101619?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#101619** https://app.graphite.dev/github/pr/llvm/llvm-project/101619?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#101618** https://app.graphite.dev/github/pr/llvm/llvm-project/101618?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/101619
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-01 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/101619
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-02 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/101619

>From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Tue, 30 Jul 2024 14:46:36 +0530
Subject: [PATCH 1/3] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer
 load variants

Use the constrained buffer load opcodes while combining under-aligned
load for XNACK enabled subtargets.
---
 .../Target/AMDGPU/SILoadStoreOptimizer.cpp|  75 ++-
 .../AMDGPU/llvm.amdgcn.s.buffer.load.ll   |  56 +-
 .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 --
 3 files changed, 613 insertions(+), 82 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp 
b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
index ae537b194f50c..7553c370f694f 100644
--- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
@@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 1;
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX2:
@@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 2;
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX3_IMM:
   case AMDGPU::S_LOAD_DWORDX3_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX3:
@@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 3;
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX4_IMM:
   case AMDGPU::S_LOAD_DWORDX4_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX4:
@@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 4;
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX8_IMM:
   case AMDGPU::S_LOAD_DWORDX8_IMM_ec:
 return 8;
@@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return S_BUFFER_LOAD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return S_BUFFER_LOAD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo 
&TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-02 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/101619

>From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Tue, 30 Jul 2024 14:46:36 +0530
Subject: [PATCH 1/4] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer
 load variants

Use the constrained buffer load opcodes while combining under-aligned
load for XNACK enabled subtargets.
---
 .../Target/AMDGPU/SILoadStoreOptimizer.cpp|  75 ++-
 .../AMDGPU/llvm.amdgcn.s.buffer.load.ll   |  56 +-
 .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 --
 3 files changed, 613 insertions(+), 82 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp 
b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
index ae537b194f50c..7553c370f694f 100644
--- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
+++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
@@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 1;
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX2:
@@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 2;
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX3_IMM:
   case AMDGPU::S_LOAD_DWORDX3_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX3:
@@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 3;
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX4_IMM:
   case AMDGPU::S_LOAD_DWORDX4_IMM_ec:
   case AMDGPU::GLOBAL_LOAD_DWORDX4:
@@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, 
const SIInstrInfo &TII) {
 return 4;
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
   case AMDGPU::S_LOAD_DWORDX8_IMM:
   case AMDGPU::S_LOAD_DWORDX8_IMM_ec:
 return 8;
@@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return S_BUFFER_LOAD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return S_BUFFER_LOAD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const 
SIInstrInfo &TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_IMM;
   case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec:
 return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM;
   case AMDGPU::S_LOAD_DWORD_IMM:
   case AMDGPU::S_LOAD_DWORDX2_IMM:
@@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo 
&TII) {
   case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM:
   case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec:
+  case AMDGPU::S_BUFFER_LOAD_

[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)

2024-08-05 Thread Christudasan Devadasan via llvm-branch-commits



@@ -0,0 +1,930 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW32 %s
+
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -verify-machineinstrs 
-run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+
+---
+name: s_add_i32__inline_imm__fi_offset0
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead 
$scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def 
dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead 
$scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def 
dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead 
$scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead 
$scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 12, %stack.0, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__fi_offset0__inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead 
$scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def 
dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead 
$scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def 
dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead 
$scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead 
$scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 %stack.0, 12, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__inline_imm___fi_offset_inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 16, alignment: 16 }
+  - { id: 1, size: 24, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead 
$scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead 
$scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm___

[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)

2024-08-05 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/101694
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-05 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Aug 6, 1:43 AM EDT**: @cdevadas started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/101619).


https://github.com/llvm/llvm-project/pull/101619
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/103721

None

>From cc30faa32e7828d74826421cdae50464ede38e0b Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Wed, 14 Aug 2024 14:18:59 +0530
Subject: [PATCH] [AMDGPU][R600] Move R600TargetMachine into
 R600CodeGenPassBuilder(NFC).

---
 .../AMDGPU/AMDGPUCodeGenPassBuilder.cpp   |   2 +-
 llvm/lib/Target/AMDGPU/CMakeLists.txt |   1 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.cpp  | 149 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.h|  38 -
 llvm/lib/Target/AMDGPU/R600ISelLowering.cpp   |   3 +-
 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp  | 154 --
 llvm/lib/Target/AMDGPU/R600TargetMachine.h|  60 ---
 7 files changed, 187 insertions(+), 220 deletions(-)
 delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
 delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.h

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp
index 0d7233432fc2b5..8fbd672c2ad62d 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp
@@ -32,8 +32,8 @@
 #include "GCNSchedStrategy.h"
 #include "GCNVOPDUtils.h"
 #include "R600.h"
+#include "R600CodeGenPassBuilder.h"
 #include "R600MachineFunctionInfo.h"
-#include "R600TargetMachine.h"
 #include "SIFixSGPRCopies.h"
 #include "SIMachineFunctionInfo.h"
 #include "SIMachineScheduler.h"
diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt 
b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index e21b2cf62f0c8f..9cb84614fefd36 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen
   R600Packetizer.cpp
   R600RegisterInfo.cpp
   R600Subtarget.cpp
-  R600TargetMachine.cpp
   R600TargetTransformInfo.cpp
   SIAnnotateControlFlow.cpp
   SIFixSGPRCopies.cpp
diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp 
b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
index a57b3aa0adb158..1b182e17add9c0 100644
--- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
+++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
@@ -5,12 +5,159 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 
//===--===//
+//
+/// \file
+/// This file contains both AMDGPU-R600 target machine and the CodeGen pass
+/// builder. The target machine contains all of the hardware specific
+/// information needed to emit code for R600 GPUs and the CodeGen pass builder
+/// handles the same for new pass manager infrastructure.
+//
+//===--===//
 
 #include "R600CodeGenPassBuilder.h"
-#include "R600TargetMachine.h"
+#include "R600.h"
+#include "R600MachineScheduler.h"
+#include "R600TargetTransformInfo.h"
+#include "llvm/Transforms/Scalar.h"
+#include 
 
 using namespace llvm;
 
+static cl::opt
+EnableR600StructurizeCFG("r600-ir-structurize",
+ cl::desc("Use StructurizeCFG IR pass"),
+ cl::init(true));
+
+static cl::opt EnableR600IfConvert("r600-if-convert",
+ cl::desc("Use if conversion pass"),
+ cl::ReallyHidden, cl::init(true));
+
+static cl::opt EnableAMDGPUFunctionCallsOpt(
+"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"),
+cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true),
+cl::Hidden);
+
+static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) {
+  return new ScheduleDAGMILive(C, std::make_unique());
+}
+
+static MachineSchedRegistry R600SchedRegistry("r600",
+  "Run R600's custom scheduler",
+  createR600MachineScheduler);
+
+//===--===//
+// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface.
+//===--===//
+
+R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT,
+ StringRef CPU, StringRef FS,
+ const TargetOptions &Options,
+ std::optional RM,
+ std::optional CM,
+ CodeGenOptLevel OL, bool JIT)
+: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) {
+  setRequiresStructuredCFG(true);
+
+  // Override the default since calls aren't supported for r600.
+  if (EnableFunctionCalls &&
+  EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0)
+EnableFunctionCalls = false;
+}
+
+const TargetSubtargetInfo *
+R600TargetMachine::getSubtarget

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#103721** https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#103720** https://app.graphite.dev/github/pr/llvm/llvm-project/103720?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/103721
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/103721
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/103721

>From a10910597e6ee30e87dd09a4f77fcfa1729873f0 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Wed, 14 Aug 2024 14:18:59 +0530
Subject: [PATCH 1/2] [AMDGPU][R600] Move R600TargetMachine into
 R600CodeGenPassBuilder(NFC).

---
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |   2 +-
 llvm/lib/Target/AMDGPU/CMakeLists.txt |   1 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.cpp  | 149 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.h|  38 -
 llvm/lib/Target/AMDGPU/R600ISelLowering.cpp   |   3 +-
 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp  | 154 --
 6 files changed, 187 insertions(+), 160 deletions(-)
 delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index bcedc3623d3ed7..13024f45151899 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -32,8 +32,8 @@
 #include "GCNSchedStrategy.h"
 #include "GCNVOPDUtils.h"
 #include "R600.h"
+#include "R600CodeGenPassBuilder.h"
 #include "R600MachineFunctionInfo.h"
-#include "R600TargetMachine.h"
 #include "SIFixSGPRCopies.h"
 #include "SIMachineFunctionInfo.h"
 #include "SIMachineScheduler.h"
diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt 
b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index f493076f5bb8a3..16186f1f1bbed0 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen
   R600Packetizer.cpp
   R600RegisterInfo.cpp
   R600Subtarget.cpp
-  R600TargetMachine.cpp
   R600TargetTransformInfo.cpp
   SIAnnotateControlFlow.cpp
   SIFixSGPRCopies.cpp
diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp 
b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
index a57b3aa0adb158..1b182e17add9c0 100644
--- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
+++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
@@ -5,12 +5,159 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 
//===--===//
+//
+/// \file
+/// This file contains both AMDGPU-R600 target machine and the CodeGen pass
+/// builder. The target machine contains all of the hardware specific
+/// information needed to emit code for R600 GPUs and the CodeGen pass builder
+/// handles the same for new pass manager infrastructure.
+//
+//===--===//
 
 #include "R600CodeGenPassBuilder.h"
-#include "R600TargetMachine.h"
+#include "R600.h"
+#include "R600MachineScheduler.h"
+#include "R600TargetTransformInfo.h"
+#include "llvm/Transforms/Scalar.h"
+#include 
 
 using namespace llvm;
 
+static cl::opt
+EnableR600StructurizeCFG("r600-ir-structurize",
+ cl::desc("Use StructurizeCFG IR pass"),
+ cl::init(true));
+
+static cl::opt EnableR600IfConvert("r600-if-convert",
+ cl::desc("Use if conversion pass"),
+ cl::ReallyHidden, cl::init(true));
+
+static cl::opt EnableAMDGPUFunctionCallsOpt(
+"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"),
+cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true),
+cl::Hidden);
+
+static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) {
+  return new ScheduleDAGMILive(C, std::make_unique());
+}
+
+static MachineSchedRegistry R600SchedRegistry("r600",
+  "Run R600's custom scheduler",
+  createR600MachineScheduler);
+
+//===--===//
+// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface.
+//===--===//
+
+R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT,
+ StringRef CPU, StringRef FS,
+ const TargetOptions &Options,
+ std::optional RM,
+ std::optional CM,
+ CodeGenOptLevel OL, bool JIT)
+: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) {
+  setRequiresStructuredCFG(true);
+
+  // Override the default since calls aren't supported for r600.
+  if (EnableFunctionCalls &&
+  EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0)
+EnableFunctionCalls = false;
+}
+
+const TargetSubtargetInfo *
+R600TargetMachine::getSubtargetImpl(const Function &F) const {
+  StringRef GPU = getGPUName(F);
+  StringRef FS = getFeatureString(F);
+
+  SmallString<128> SubtargetKey(GPU);

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/103721
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/104038

This definition shouldn't be in AMDGPU TM.

>From 6ab7119cd8b4e63dad2625b0aa9416e2898a3bbc Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Wed, 14 Aug 2024 20:50:30 +0530
Subject: [PATCH] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM.

This definition shouldn't be in AMDGPU TM.
---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 8 
 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp   | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index bcedc3623d3ed..f5da459a43c59 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -32,7 +32,6 @@
 #include "GCNSchedStrategy.h"
 #include "GCNVOPDUtils.h"
 #include "R600.h"
-#include "R600MachineFunctionInfo.h"
 #include "R600TargetMachine.h"
 #include "SIFixSGPRCopies.h"
 #include "SIMachineFunctionInfo.h"
@@ -1193,13 +1192,6 @@ 
AMDGPUPassConfig::createMachineScheduler(MachineSchedContext *C) const {
   return DAG;
 }
 
-MachineFunctionInfo *R600TargetMachine::createMachineFunctionInfo(
-BumpPtrAllocator &Allocator, const Function &F,
-const TargetSubtargetInfo *STI) const {
-  return R600MachineFunctionInfo::create(
-  Allocator, F, static_cast(STI));
-}
-
 
//===--===//
 // GCN Legacy Pass Setup
 
//===--===//
diff --git a/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
index 06ae5f7289360..a1a60b8bdfa9e 100644
--- a/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp
@@ -16,6 +16,7 @@
 
 #include "R600TargetMachine.h"
 #include "R600.h"
+#include "R600MachineFunctionInfo.h"
 #include "R600MachineScheduler.h"
 #include "R600TargetTransformInfo.h"
 #include "llvm/Transforms/Scalar.h"
@@ -154,6 +155,13 @@ Error R600TargetMachine::buildCodeGenPipeline(
   return CGPB.buildPipeline(MPM, Out, DwoOut, FileType);
 }
 
+MachineFunctionInfo *R600TargetMachine::createMachineFunctionInfo(
+BumpPtrAllocator &Allocator, const Function &F,
+const TargetSubtargetInfo *STI) const {
+  return R600MachineFunctionInfo::create(
+  Allocator, F, static_cast(STI));
+}
+
 
//===--===//
 // R600 CodeGen Pass Builder interface.
 
//===--===//

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/104038?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#104038** https://app.graphite.dev/github/pr/llvm/llvm-project/104038?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#103721** https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#103720** https://app.graphite.dev/github/pr/llvm/llvm-project/103720?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/104038
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)

2024-08-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/104038
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)

2024-08-19 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/103721

>From f2095f23eaa5c3876bf7f8d5706881e404c5aa1b Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Wed, 14 Aug 2024 14:18:59 +0530
Subject: [PATCH 1/3] [AMDGPU][R600] Move R600TargetMachine into
 R600CodeGenPassBuilder(NFC).

---
 llvm/lib/Target/AMDGPU/CMakeLists.txt |   1 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.cpp  | 149 -
 .../Target/AMDGPU/R600CodeGenPassBuilder.h|  38 -
 llvm/lib/Target/AMDGPU/R600ISelLowering.cpp   |   3 +-
 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp  | 154 --
 5 files changed, 186 insertions(+), 159 deletions(-)
 delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp

diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt 
b/llvm/lib/Target/AMDGPU/CMakeLists.txt
index f493076f5bb8a3..16186f1f1bbed0 100644
--- a/llvm/lib/Target/AMDGPU/CMakeLists.txt
+++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt
@@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen
   R600Packetizer.cpp
   R600RegisterInfo.cpp
   R600Subtarget.cpp
-  R600TargetMachine.cpp
   R600TargetTransformInfo.cpp
   SIAnnotateControlFlow.cpp
   SIFixSGPRCopies.cpp
diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp 
b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
index a57b3aa0adb158..1b182e17add9c0 100644
--- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
+++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp
@@ -5,12 +5,159 @@
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 
//===--===//
+//
+/// \file
+/// This file contains both AMDGPU-R600 target machine and the CodeGen pass
+/// builder. The target machine contains all of the hardware specific
+/// information needed to emit code for R600 GPUs and the CodeGen pass builder
+/// handles the same for new pass manager infrastructure.
+//
+//===--===//
 
 #include "R600CodeGenPassBuilder.h"
-#include "R600TargetMachine.h"
+#include "R600.h"
+#include "R600MachineScheduler.h"
+#include "R600TargetTransformInfo.h"
+#include "llvm/Transforms/Scalar.h"
+#include 
 
 using namespace llvm;
 
+static cl::opt
+EnableR600StructurizeCFG("r600-ir-structurize",
+ cl::desc("Use StructurizeCFG IR pass"),
+ cl::init(true));
+
+static cl::opt EnableR600IfConvert("r600-if-convert",
+ cl::desc("Use if conversion pass"),
+ cl::ReallyHidden, cl::init(true));
+
+static cl::opt EnableAMDGPUFunctionCallsOpt(
+"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"),
+cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true),
+cl::Hidden);
+
+static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) {
+  return new ScheduleDAGMILive(C, std::make_unique());
+}
+
+static MachineSchedRegistry R600SchedRegistry("r600",
+  "Run R600's custom scheduler",
+  createR600MachineScheduler);
+
+//===--===//
+// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface.
+//===--===//
+
+R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT,
+ StringRef CPU, StringRef FS,
+ const TargetOptions &Options,
+ std::optional RM,
+ std::optional CM,
+ CodeGenOptLevel OL, bool JIT)
+: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) {
+  setRequiresStructuredCFG(true);
+
+  // Override the default since calls aren't supported for r600.
+  if (EnableFunctionCalls &&
+  EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0)
+EnableFunctionCalls = false;
+}
+
+const TargetSubtargetInfo *
+R600TargetMachine::getSubtargetImpl(const Function &F) const {
+  StringRef GPU = getGPUName(F);
+  StringRef FS = getFeatureString(F);
+
+  SmallString<128> SubtargetKey(GPU);
+  SubtargetKey.append(FS);
+
+  auto &I = SubtargetMap[SubtargetKey];
+  if (!I) {
+// This needs to be done before we create a new subtarget since any
+// creation will depend on the TM and the code generation flags on the
+// function that reside in TargetOptions.
+resetTargetOptions(F);
+I = std::make_unique(TargetTriple, GPU, FS, *this);
+  }
+
+  return I.get();
+}
+
+TargetTransformInfo
+R600TargetMachine::getTargetTransformInfo(const Function &F) const {
+  return TargetTransformInfo(R600TTIImpl(this, F));
+}
+
+namespace {
+class R600PassConfig final : public AMDGPUPassConfig {
+pu

[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)

2024-08-19 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Aug 19, 10:55 AM EDT**: @cdevadas started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/103721).


https://github.com/llvm/llvm-project/pull/103721
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/106605

None

>From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 29 Aug 2024 22:21:22 +0530
Subject: [PATCH] [CodeGen][NewPM] Port MachineCSE pass to new pass manager.

---
 llvm/include/llvm/CodeGen/MachineCSE.h|  29 ++
 llvm/include/llvm/CodeGen/Passes.h|   2 +-
 llvm/include/llvm/InitializePasses.h  |   2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |   1 +
 .../llvm/Passes/MachinePassRegistry.def   |   2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |   2 +-
 llvm/lib/CodeGen/MachineCSE.cpp   | 283 ++
 llvm/lib/CodeGen/TargetPassConfig.cpp |   6 +-
 llvm/lib/Passes/PassBuilder.cpp   |   1 +
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |   2 +-
 llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp  |   2 +-
 .../GlobalISel/machine-cse-mid-pipeline.mir   |   1 +
 .../AArch64/sve-pfalse-machine-cse.mir|   1 +
 .../no-cse-nonlocal-convergent-instrs.mir |   1 +
 .../copyprop_regsequence_with_undef.mir   |   1 +
 llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir  |   8 +-
 .../CodeGen/PowerPC/machine-cse-rm-pre.mir|   1 +
 .../CodeGen/Thumb/machine-cse-deadreg.mir |   1 +
 .../CodeGen/Thumb/machine-cse-physreg.mir |   1 +
 llvm/test/CodeGen/X86/cse-two-preds.mir   |   1 +
 llvm/test/DebugInfo/MIR/X86/machine-cse.mir   |   1 +
 21 files changed, 215 insertions(+), 134 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h

diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h 
b/llvm/include/llvm/CodeGen/MachineCSE.h
new file mode 100644
index 00..7440068e3e6f46
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/MachineCSE.h
@@ -0,0 +1,29 @@
+//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_MACHINECSE_H
+#define LLVM_CODEGEN_MACHINECSE_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class MachineCSEPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+
+  MachineFunctionProperties getRequiredProperties() {
+return MachineFunctionProperties().set(
+MachineFunctionProperties::Property::IsSSA);
+  }
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_MACHINECSE_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index dbdd110b0600e5..ddb2012cd2bffc 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -330,7 +330,7 @@ namespace llvm {
   extern char &GCMachineCodeAnalysisID;
 
   /// MachineCSE - This pass performs global CSE on machine instructions.
-  extern char &MachineCSEID;
+  extern char &MachineCSELegacyID;
 
   /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs
   /// according to the semantics of the instruction as well as hoists
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 47a1ca15fc0d1f..6605c6fde92510 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &);
 void initializeMachineBlockPlacementStatsPass(PassRegistry &);
 void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &);
 void initializeMachineCFGPrinterPass(PassRegistry &);
-void initializeMachineCSEPass(PassRegistry &);
+void initializeMachineCSELegacyPass(PassRegistry &);
 void initializeMachineCombinerPass(PassRegistry &);
 void initializeMachineCopyPropagationPass(PassRegistry &);
 void initializeMachineCycleInfoPrinterPassPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index eb15beb835b535..6c34747b9da406 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/LocalStackSlotAllocation.h"
 #include "llvm/CodeGen/LowerEmuTLS.h"
 #include "llvm/CodeGen/MIRPrinter.h"
+#include "llvm/CodeGen/MachineCSE.h"
 #include "llvm/CodeGen/MachineFunctionAnalysis.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachinePassManager.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index b710b1c46f643f..cb781532e266e6 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes"

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/106605?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#106605** https://app.graphite.dev/github/pr/llvm/llvm-project/106605?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#106604** https://app.graphite.dev/github/pr/llvm/llvm-project/106604?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/106605

>From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 29 Aug 2024 22:21:22 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port MachineCSE pass to new pass
 manager.

---
 llvm/include/llvm/CodeGen/MachineCSE.h|  29 ++
 llvm/include/llvm/CodeGen/Passes.h|   2 +-
 llvm/include/llvm/InitializePasses.h  |   2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |   1 +
 .../llvm/Passes/MachinePassRegistry.def   |   2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |   2 +-
 llvm/lib/CodeGen/MachineCSE.cpp   | 283 ++
 llvm/lib/CodeGen/TargetPassConfig.cpp |   6 +-
 llvm/lib/Passes/PassBuilder.cpp   |   1 +
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |   2 +-
 llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp  |   2 +-
 .../GlobalISel/machine-cse-mid-pipeline.mir   |   1 +
 .../AArch64/sve-pfalse-machine-cse.mir|   1 +
 .../no-cse-nonlocal-convergent-instrs.mir |   1 +
 .../copyprop_regsequence_with_undef.mir   |   1 +
 llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir  |   8 +-
 .../CodeGen/PowerPC/machine-cse-rm-pre.mir|   1 +
 .../CodeGen/Thumb/machine-cse-deadreg.mir |   1 +
 .../CodeGen/Thumb/machine-cse-physreg.mir |   1 +
 llvm/test/CodeGen/X86/cse-two-preds.mir   |   1 +
 llvm/test/DebugInfo/MIR/X86/machine-cse.mir   |   1 +
 21 files changed, 215 insertions(+), 134 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h

diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h 
b/llvm/include/llvm/CodeGen/MachineCSE.h
new file mode 100644
index 00..7440068e3e6f46
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/MachineCSE.h
@@ -0,0 +1,29 @@
+//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_MACHINECSE_H
+#define LLVM_CODEGEN_MACHINECSE_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class MachineCSEPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+
+  MachineFunctionProperties getRequiredProperties() {
+return MachineFunctionProperties().set(
+MachineFunctionProperties::Property::IsSSA);
+  }
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_MACHINECSE_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index dbdd110b0600e5..ddb2012cd2bffc 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -330,7 +330,7 @@ namespace llvm {
   extern char &GCMachineCodeAnalysisID;
 
   /// MachineCSE - This pass performs global CSE on machine instructions.
-  extern char &MachineCSEID;
+  extern char &MachineCSELegacyID;
 
   /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs
   /// according to the semantics of the instruction as well as hoists
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 47a1ca15fc0d1f..6605c6fde92510 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &);
 void initializeMachineBlockPlacementStatsPass(PassRegistry &);
 void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &);
 void initializeMachineCFGPrinterPass(PassRegistry &);
-void initializeMachineCSEPass(PassRegistry &);
+void initializeMachineCSELegacyPass(PassRegistry &);
 void initializeMachineCombinerPass(PassRegistry &);
 void initializeMachineCopyPropagationPass(PassRegistry &);
 void initializeMachineCycleInfoPrinterPassPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index eb15beb835b535..6c34747b9da406 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/LocalStackSlotAllocation.h"
 #include "llvm/CodeGen/LowerEmuTLS.h"
 #include "llvm/CodeGen/MIRPrinter.h"
+#include "llvm/CodeGen/MachineCSE.h"
 #include "llvm/CodeGen/MachineFunctionAnalysis.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachinePassManager.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index b710b1c46f643f..cb781532e266e6 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes",

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits



@@ -932,18 +935,55 @@ bool 
MachineCSE::isProfitableToHoistInto(MachineBasicBlock *CandidateBB,
  MBFI->getBlockFreq(MBB) + MBFI->getBlockFreq(MBB1);
 }
 
-bool MachineCSE::runOnMachineFunction(MachineFunction &MF) {
-  if (skipFunction(MF.getFunction()))
-return false;
+void MachineCSEImpl::releaseMemory() {
+  ScopeMap.clear();
+  PREMap.clear();
+  Exps.clear();
+}
 
+bool MachineCSEImpl::run(MachineFunction &MF) {
   TII = MF.getSubtarget().getInstrInfo();
   TRI = MF.getSubtarget().getRegisterInfo();
   MRI = &MF.getRegInfo();
-  DT = &getAnalysis().getDomTree();
-  MBFI = &getAnalysis().getMBFI();
   LookAheadLimit = TII->getMachineCSELookAheadLimit();
   bool ChangedPRE, ChangedCSE;
   ChangedPRE = PerformSimplePRE(DT);
   ChangedCSE = PerformCSE(DT->getRootNode());
+  releaseMemory();
   return ChangedPRE || ChangedCSE;
 }
+
+PreservedAnalyses MachineCSEPass::run(MachineFunction &MF,
+  MachineFunctionAnalysisManager &MFAM) {
+  MFPropsModifier _(*this, MF);
+
+  if (MF.getFunction().hasOptNone())
+return PreservedAnalyses::all();
+
+  MachineDominatorTree &MDT = MFAM.getResult(MF);
+  MachineBlockFrequencyInfo &MBFI =
+  MFAM.getResult(MF);
+  MachineCSEImpl Impl(&MDT, &MBFI);
+  bool Changed = Impl.run(MF);
+  if (!Changed)
+return PreservedAnalyses::all();
+
+  auto PA = getMachineFunctionPassPreservedAnalyses();
+  PA.preserve();
+  PA.preserve();

cdevadas wrote:

Not sure about that. I have seen similar instances (DT being preserved along 
with the preserveSet) in other backend passes ported to NPM.

https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/106605

>From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 29 Aug 2024 22:21:22 +0530
Subject: [PATCH 1/3] [CodeGen][NewPM] Port MachineCSE pass to new pass
 manager.

---
 llvm/include/llvm/CodeGen/MachineCSE.h|  29 ++
 llvm/include/llvm/CodeGen/Passes.h|   2 +-
 llvm/include/llvm/InitializePasses.h  |   2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |   1 +
 .../llvm/Passes/MachinePassRegistry.def   |   2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |   2 +-
 llvm/lib/CodeGen/MachineCSE.cpp   | 283 ++
 llvm/lib/CodeGen/TargetPassConfig.cpp |   6 +-
 llvm/lib/Passes/PassBuilder.cpp   |   1 +
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |   2 +-
 llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp  |   2 +-
 .../GlobalISel/machine-cse-mid-pipeline.mir   |   1 +
 .../AArch64/sve-pfalse-machine-cse.mir|   1 +
 .../no-cse-nonlocal-convergent-instrs.mir |   1 +
 .../copyprop_regsequence_with_undef.mir   |   1 +
 llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir  |   8 +-
 .../CodeGen/PowerPC/machine-cse-rm-pre.mir|   1 +
 .../CodeGen/Thumb/machine-cse-deadreg.mir |   1 +
 .../CodeGen/Thumb/machine-cse-physreg.mir |   1 +
 llvm/test/CodeGen/X86/cse-two-preds.mir   |   1 +
 llvm/test/DebugInfo/MIR/X86/machine-cse.mir   |   1 +
 21 files changed, 215 insertions(+), 134 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h

diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h 
b/llvm/include/llvm/CodeGen/MachineCSE.h
new file mode 100644
index 00..7440068e3e6f46
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/MachineCSE.h
@@ -0,0 +1,29 @@
+//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_MACHINECSE_H
+#define LLVM_CODEGEN_MACHINECSE_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class MachineCSEPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+
+  MachineFunctionProperties getRequiredProperties() {
+return MachineFunctionProperties().set(
+MachineFunctionProperties::Property::IsSSA);
+  }
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_MACHINECSE_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index dbdd110b0600e5..ddb2012cd2bffc 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -330,7 +330,7 @@ namespace llvm {
   extern char &GCMachineCodeAnalysisID;
 
   /// MachineCSE - This pass performs global CSE on machine instructions.
-  extern char &MachineCSEID;
+  extern char &MachineCSELegacyID;
 
   /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs
   /// according to the semantics of the instruction as well as hoists
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 47a1ca15fc0d1f..6605c6fde92510 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &);
 void initializeMachineBlockPlacementStatsPass(PassRegistry &);
 void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &);
 void initializeMachineCFGPrinterPass(PassRegistry &);
-void initializeMachineCSEPass(PassRegistry &);
+void initializeMachineCSELegacyPass(PassRegistry &);
 void initializeMachineCombinerPass(PassRegistry &);
 void initializeMachineCopyPropagationPass(PassRegistry &);
 void initializeMachineCycleInfoPrinterPassPass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index eb15beb835b535..6c34747b9da406 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -42,6 +42,7 @@
 #include "llvm/CodeGen/LocalStackSlotAllocation.h"
 #include "llvm/CodeGen/LowerEmuTLS.h"
 #include "llvm/CodeGen/MIRPrinter.h"
+#include "llvm/CodeGen/MachineCSE.h"
 #include "llvm/CodeGen/MachineFunctionAnalysis.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachinePassManager.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index b710b1c46f643f..cb781532e266e6 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes",

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-08-29 Thread Christudasan Devadasan via llvm-branch-commits



@@ -932,18 +935,55 @@ bool 
MachineCSE::isProfitableToHoistInto(MachineBasicBlock *CandidateBB,
  MBFI->getBlockFreq(MBB) + MBFI->getBlockFreq(MBB1);
 }
 
-bool MachineCSE::runOnMachineFunction(MachineFunction &MF) {
-  if (skipFunction(MF.getFunction()))
-return false;
+void MachineCSEImpl::releaseMemory() {
+  ScopeMap.clear();
+  PREMap.clear();
+  Exps.clear();
+}
 
+bool MachineCSEImpl::run(MachineFunction &MF) {
   TII = MF.getSubtarget().getInstrInfo();
   TRI = MF.getSubtarget().getRegisterInfo();
   MRI = &MF.getRegInfo();
-  DT = &getAnalysis().getDomTree();
-  MBFI = &getAnalysis().getMBFI();
   LookAheadLimit = TII->getMachineCSELookAheadLimit();
   bool ChangedPRE, ChangedCSE;
   ChangedPRE = PerformSimplePRE(DT);
   ChangedCSE = PerformCSE(DT->getRootNode());
+  releaseMemory();
   return ChangedPRE || ChangedCSE;
 }
+
+PreservedAnalyses MachineCSEPass::run(MachineFunction &MF,
+  MachineFunctionAnalysisManager &MFAM) {
+  MFPropsModifier _(*this, MF);
+
+  if (MF.getFunction().hasOptNone())
+return PreservedAnalyses::all();
+
+  MachineDominatorTree &MDT = MFAM.getResult(MF);
+  MachineBlockFrequencyInfo &MBFI =
+  MFAM.getResult(MF);
+  MachineCSEImpl Impl(&MDT, &MBFI);
+  bool Changed = Impl.run(MF);
+  if (!Changed)
+return PreservedAnalyses::all();
+
+  auto PA = getMachineFunctionPassPreservedAnalyses();
+  PA.preserve();
+  PA.preserve();

cdevadas wrote:

I guess, we won't be able to figure it out until we have the whole pipeline 
running in NPM.

https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-09-03 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

Ping

https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)

2024-09-04 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Sep 4, 8:51 AM EDT**: @cdevadas started a stack merge that includes this 
pull request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/106605).


https://github.com/llvm/llvm-project/pull/106605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/108507

None

>From b56e90fa59c0f5db905b94aa74c771e3f72cd81d Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 12 Sep 2024 23:38:09 +0530
Subject: [PATCH] [CodeGen][NewPM] Port machine trace metrics analysis to new
 pass manager.

---
 .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp|  8 +--
 llvm/lib/CodeGen/MachineCombiner.cpp  |  8 +--
 llvm/lib/CodeGen/MachineTraceMetrics.cpp  | 62 +++
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../AArch64/AArch64ConditionalCompares.cpp|  8 +--
 .../AArch64/AArch64StorePairSuppress.cpp  |  6 +-
 9 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h 
b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
index a5e78d47724d82..ca5a1197911d0a 100644
--- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
+++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
@@ -46,12 +46,13 @@
 #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H
 #define LLVM_CODEGEN_MACHINETRACEMETRICS_H
 
-#include "llvm/ADT/SparseSet.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/TargetSchedule.h"
 
 namespace llvm {
@@ -93,7 +94,7 @@ enum class MachineTraceStrategy {
   TS_NumStrategies
 };
 
-class MachineTraceMetrics : public MachineFunctionPass {
+class MachineTraceMetrics {
   const MachineFunction *MF = nullptr;
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }
 
-  MachineTraceMetrics();
+  explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI);
+  ~MachineTraceMetrics();
 
-  void getAnalysisUsage(AnalysisUsage&) const override;
-  bool runOnMachineFunction(MachineFunction&) override;
-  void releaseMemory() override;
-  void verifyAnalysis() const override;
+  void init(MachineFunction &Func, const MachineLoopInfo &LI);
+  void clear();
 
   /// Per-basic block information that doesn't depend on the trace through the
   /// block.
@@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass {
   /// Call Ensemble::getTrace() again to update any trace handles.
   void invalidate(const MachineBasicBlock *MBB);
 
+  /// Handle invalidation explicitly.
+  bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
+  MachineFunctionAnalysisManager::Invalidator &);
+
+  void verifyAnalysis() const;
+
 private:
   // One entry per basic block, indexed by block number.
   SmallVector BlockInfo;
@@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS,
   return OS;
 }
 
+class MachineTraceMetricsAnalysis
+: public AnalysisInfoMixin {
+  friend AnalysisInfoMixin;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineTraceMetrics;
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+/// Verifier pass for \c MachineTraceMetrics.
+struct MachineTraceMetricsVerifierPass
+: PassInfoMixin {
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
+};
+
+class MachineTraceMetricsWrapperPass : public MachineFunctionPass {
+public:
+  static char ID;
+  MachineTraceMetrics MTM;
+
+  MachineTraceMetricsWrapperPass();
+
+  void getAnalysisUsage(AnalysisUsage &) const override;
+  bool runOnMachineFunction(MachineFunction &) override;
+  void releaseMemory() override { MTM.clear(); }
+  void verifyAnalysis() const override { MTM.verifyAnalysis(); }
+  MachineTraceMetrics &getMTM() { return MTM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 4352099d6dbb99..3fa6fabaeccd64 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &);
 void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &);
 void initializeMachineSchedulerPass(PassRegistry &);
 void initializeMachineSinkingPass(PassRegistry &);
-void initializeMachineTraceMetricsPass(PassRegistry &

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/108508

None

>From 8c819329488c087fce339d4fd65761bc986ed80e Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index ddb2012cd2bffc..5d042d8daa5630 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 3fa6fabaeccd64..9d98e80a4fd365 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index a99fed86d168d1..5f005707fe3cc0 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index e92d6dd97c655a..949936e55a0f06 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/108508
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/108507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/108507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-09-12 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/108508
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-09-13 Thread Christudasan Devadasan via llvm-branch-commits



@@ -39,41 +39,68 @@ using namespace llvm;
 
 #define DEBUG_TYPE "machine-trace-metrics"
 
-char MachineTraceMetrics::ID = 0;
+AnalysisKey MachineTraceMetricsAnalysis::Key;
 
-char &llvm::MachineTraceMetricsID = MachineTraceMetrics::ID;
+MachineTraceMetricsAnalysis::Result
+MachineTraceMetricsAnalysis::run(MachineFunction &MF,
+ MachineFunctionAnalysisManager &MFAM) {
+  return Result(MF, MFAM.getResult(MF));
+}
+
+PreservedAnalyses
+MachineTraceMetricsVerifierPass::run(MachineFunction &MF,
+ MachineFunctionAnalysisManager &MFAM) {
+  MFAM.getResult(MF).verifyAnalysis();

cdevadas wrote:

In the NPM infrastructure verifyAnalysis is done separately. There are many 
such instances in the opt pipeline. This is the first attempt to port CodeGen 
analysis with verify. 

https://github.com/llvm/llvm-project/pull/108507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-09-13 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108508

>From 8c819329488c087fce339d4fd65761bc986ed80e Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index ddb2012cd2bffc..5d042d8daa5630 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 3fa6fabaeccd64..9d98e80a4fd365 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index a99fed86d168d1..5f005707fe3cc0 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index e92d6dd97c655a..949936e55a0f06 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION_P

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-09-13 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/108514

None

>From 2941048b558ae43ff0c96a1cc301976435c95a7f Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 13:53:01 +0530
Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts.

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 8c45e6b5e589c2..6cc9ff6a981fee 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1898,6 +1898,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass 
&addPass) const {
   addPass(RequireAnalysisPass());
 }
 
+void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const {
+  if (EnableEarlyIfConversion)
+addPass(EarlyIfConverterPass());
+
+  Base::addILPOpts(addPass);
+}
+
 void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass,
  CreateMCStreamer) const {
   // TODO: Add AsmPrinter.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
index 5b7257ddb36f1e..96b414f294ee70 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
@@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder
   void addIRPasses(AddIRPass &) const;
   void addCodeGenPrepare(AddIRPass &) const;
   void addPreISel(AddIRPass &addPass) const;
+  void addILPOpts(AddMachinePass &) const;
   void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const;
   Error addInstSelector(AddMachinePass &) const;
 

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-09-13 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/108514?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#108514** https://app.graphite.dev/github/pr/llvm/llvm-project/108514?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈
* **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`

This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about 
stacking.


 Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="11px" height="11px"/> Graphite
  

https://github.com/llvm/llvm-project/pull/108514
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-09-13 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/108514
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] c971bcd - [AMDGPU] Test clean up (NFC)

2021-01-22 Thread Christudasan Devadasan via llvm-branch-commits


Author: Christudasan Devadasan
Date: 2021-01-22T13:38:52+05:30
New Revision: c971bcd2102b905e6469463fb8309ab3f7b2b8f2

URL: 
https://github.com/llvm/llvm-project/commit/c971bcd2102b905e6469463fb8309ab3f7b2b8f2
DIFF: 
https://github.com/llvm/llvm-project/commit/c971bcd2102b905e6469463fb8309ab3f7b2b8f2.diff

LOG: [AMDGPU] Test clean up (NFC)

Added: 


Modified: 
llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll

Removed: 




diff  --git a/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll 
b/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
index 4fb2625a5221..5e4b5f70de0b 100644
--- a/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
+++ b/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll
@@ -16,7 +16,7 @@
 ; so eliminateFrameIndex would not adjust the access to use the
 ; correct FP offset.
 
-define amdgpu_kernel void @local_stack_offset_uses_sp(i64 addrspace(1)* %out, 
i8 addrspace(1)* %in) {
+define amdgpu_kernel void @local_stack_offset_uses_sp(i64 addrspace(1)* %out) {
 ; MUBUF-LABEL: local_stack_offset_uses_sp:
 ; MUBUF:   ; %bb.0: ; %entry
 ; MUBUF-NEXT:s_load_dwordx2 s[4:5], s[4:5], 0x0
@@ -106,7 +106,7 @@ entry:
   ret void
 }
 
-define void @func_local_stack_offset_uses_sp(i64 addrspace(1)* %out, i8 
addrspace(1)* %in) {
+define void @func_local_stack_offset_uses_sp(i64 addrspace(1)* %out) {
 ; MUBUF-LABEL: func_local_stack_offset_uses_sp:
 ; MUBUF:   ; %bb.0: ; %entry
 ; MUBUF-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] ff8a1ca - [AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses.

2021-01-22 Thread Christudasan Devadasan via llvm-branch-commits


Author: Christudasan Devadasan
Date: 2021-01-22T14:20:59+05:30
New Revision: ff8a1cae181438b97937848060da1efb67117ea4

URL: 
https://github.com/llvm/llvm-project/commit/ff8a1cae181438b97937848060da1efb67117ea4
DIFF: 
https://github.com/llvm/llvm-project/commit/ff8a1cae181438b97937848060da1efb67117ea4.diff

LOG: [AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses.

During instruction selection, there is an inconsistency in choosing
the initial soffset value. With certain early passes, this value is
getting modified and that brought additional fixup during
eliminateFrameIndex to work for all cases. This whole transformation
looks trivial and can be handled better.

This patch clearly defines the initial value for soffset and keeps it
unchanged before eliminateFrameIndex. The initial value must be zero
for MUBUF with a frame index. The non-frame index MUBUF forms that
use a raw offset from SP will have the stack register for soffset.
During frame elimination, the soffset remains zero for entry functions
with zero dynamic allocas and no callsites, or else is updated to the
appropriate frame/stack register.

Also, did some code clean up and made all asserts around soffset
stricter to match.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D95071

Added: 


Modified: 
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
llvm/test/CodeGen/AMDGPU/amdpal-callable.ll
llvm/test/CodeGen/AMDGPU/fold-fi-mubuf.mir
llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll

Removed: 




diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
index 3c66745c0e70..340f4ac6f57a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
@@ -1523,7 +1523,9 @@ std::pair 
AMDGPUDAGToDAGISel::foldFrameIndex(SDValue N) const
   FI ? CurDAG->getTargetFrameIndex(FI->getIndex(), FI->getValueType(0)) : 
N;
 
   // We rebase the base address into an absolute stack address and hence
-  // use constant 0 for soffset.
+  // use constant 0 for soffset. This value must be retained until
+  // frame elimination and eliminateFrameIndex will choose the appropriate
+  // frame register if need be.
   return std::make_pair(TFI, CurDAG->getTargetConstant(0, DL, MVT::i32));
 }
 

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
index 7255a061b26b..bd577a6fb8c5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
@@ -3669,13 +3669,9 @@ 
AMDGPUInstructionSelector::selectMUBUFScratchOffen(MachineOperand &Root) const {
MIB.addReg(HighBits);
  },
  [=](MachineInstrBuilder &MIB) { // soffset
-   const MachineMemOperand *MMO = *MI->memoperands_begin();
-   const MachinePointerInfo &PtrInfo = MMO->getPointerInfo();
-
-   if (isStackPtrRelative(PtrInfo))
- MIB.addReg(Info->getStackPtrOffsetReg());
-   else
- MIB.addImm(0);
+   // Use constant zero for soffset and rely on eliminateFrameIndex
+   // to choose the appropriate frame register if need be.
+   MIB.addImm(0);
  },
  [=](MachineInstrBuilder &MIB) { // offset
MIB.addImm(Offset & 4095);
@@ -3722,15 +3718,9 @@ 
AMDGPUInstructionSelector::selectMUBUFScratchOffen(MachineOperand &Root) const {
MIB.addReg(VAddr);
},
[=](MachineInstrBuilder &MIB) { // soffset
- // If we don't know this private access is a local stack object, 
it
- // needs to be relative to the entry point's scratch wave offset.
- // TODO: Should split large offsets that don't fit like above.
- // TODO: Don't use scratch wave offset just because the offset
- // didn't fit.
- if (!Info->isEntryFunction() && FI.hasValue())
-   MIB.addReg(Info->getStackPtrOffsetReg());
- else
-   MIB.addImm(0);
+ // Use constant zero for soffset and rely on eliminateFrameIndex
+ // to choose the appropriate frame register if need be.
+ MIB.addImm(0);
},
[=](MachineInstrBuilder &MIB) { // offset
  MIB.addImm(Offset);

diff  --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp 
b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index d22bdb791535..d5fa9afded27 100644
--- a/llvm/lib/T

[llvm-branch-commits] [llvm] d68458b - [GlobalISel] Base implementation for sret demotion.

2021-01-05 Thread Christudasan Devadasan via llvm-branch-commits


Author: Christudasan Devadasan
Date: 2021-01-06T10:30:50+05:30
New Revision: d68458bd56d9d55b05fca5447891aa8752d70509

URL: 
https://github.com/llvm/llvm-project/commit/d68458bd56d9d55b05fca5447891aa8752d70509
DIFF: 
https://github.com/llvm/llvm-project/commit/d68458bd56d9d55b05fca5447891aa8752d70509.diff

LOG: [GlobalISel] Base implementation for sret demotion.

If the return values can't be lowered to registers
SelectionDAG performs the sret demotion. This patch
contains the basic implementation for the same in
the GlobalISel pipeline.

Furthermore, targets should bring relevant changes
during lowerFormalArguments, lowerReturn and
lowerCall to make use of this feature.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D92953

Added: 


Modified: 
llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
llvm/lib/Target/AArch64/GISel/AArch64CallLowering.h
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h
llvm/lib/Target/ARM/ARMCallLowering.cpp
llvm/lib/Target/ARM/ARMCallLowering.h
llvm/lib/Target/Mips/MipsCallLowering.cpp
llvm/lib/Target/Mips/MipsCallLowering.h
llvm/lib/Target/PowerPC/GISel/PPCCallLowering.cpp
llvm/lib/Target/PowerPC/GISel/PPCCallLowering.h
llvm/lib/Target/RISCV/RISCVCallLowering.cpp
llvm/lib/Target/RISCV/RISCVCallLowering.h
llvm/lib/Target/X86/X86CallLowering.cpp
llvm/lib/Target/X86/X86CallLowering.h
llvm/tools/llvm-exegesis/lib/Assembler.cpp

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h 
b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
index dbd7e00c429a..dff73d185114 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
@@ -19,6 +19,7 @@
 #include "llvm/CodeGen/CallingConvLower.h"
 #include "llvm/CodeGen/MachineOperand.h"
 #include "llvm/CodeGen/TargetCallingConv.h"
+#include "llvm/IR/Attributes.h"
 #include "llvm/IR/CallingConv.h"
 #include "llvm/IR/Type.h"
 #include "llvm/Support/ErrorHandling.h"
@@ -31,6 +32,7 @@ namespace llvm {
 class CallBase;
 class DataLayout;
 class Function;
+class FunctionLoweringInfo;
 class MachineIRBuilder;
 struct MachinePointerInfo;
 class MachineRegisterInfo;
@@ -42,21 +44,30 @@ class CallLowering {
 
   virtual void anchor();
 public:
-  struct ArgInfo {
+  struct BaseArgInfo {
+Type *Ty;
+SmallVector Flags;
+bool IsFixed;
+
+BaseArgInfo(Type *Ty,
+ArrayRef Flags = ArrayRef(),
+bool IsFixed = true)
+: Ty(Ty), Flags(Flags.begin(), Flags.end()), IsFixed(IsFixed) {}
+
+BaseArgInfo() : Ty(nullptr), IsFixed(false) {}
+  };
+
+  struct ArgInfo : public BaseArgInfo {
 SmallVector Regs;
 // If the argument had to be split into multiple parts according to the
 // target calling convention, then this contains the original vregs
 // if the argument was an incoming arg.
 SmallVector OrigRegs;
-Type *Ty;
-SmallVector Flags;
-bool IsFixed;
 
 ArgInfo(ArrayRef Regs, Type *Ty,
 ArrayRef Flags = ArrayRef(),
 bool IsFixed = true)
-: Regs(Regs.begin(), Regs.end()), Ty(Ty),
-  Flags(Flags.begin(), Flags.end()), IsFixed(IsFixed) {
+: BaseArgInfo(Ty, Flags, IsFixed), Regs(Regs.begin(), Regs.end()) {
   if (!Regs.empty() && Flags.empty())
 this->Flags.push_back(ISD::ArgFlagsTy());
   // FIXME: We should have just one way of saying "no register".
@@ -65,7 +76,7 @@ class CallLowering {
  "only void types should have no register");
 }
 
-ArgInfo() : Ty(nullptr), IsFixed(false) {}
+ArgInfo() : BaseArgInfo() {}
   };
 
   struct CallLoweringInfo {
@@ -101,6 +112,15 @@ class CallLowering {
 
 /// True if the call is to a vararg function.
 bool IsVarArg = false;
+
+/// True if the function's return value can be lowered to registers.
+bool CanLowerReturn = true;
+
+/// VReg to hold the hidden sret parameter.
+Register DemoteRegister;
+
+/// The stack index for sret demotion.
+int DemoteStackIndex;
   };
 
   /// Argument handling is mostly uniform between the four places that
@@ -292,20 +312,73 @@ class CallLowering {
 return false;
   }
 
+  /// Load the returned value from the stack into virtual registers in \p 
VRegs.
+  /// It uses the frame index \p FI and the start offset from \p DemoteReg.
+  /// The loaded data size will be determined from \p RetTy.
+  void insertSRetLoads(MachineIRBuilder &MIRBuilder, Type *RetTy,
+   ArrayRef VRegs, Register DemoteReg,
+   int FI) const;
+
+  /// Store the return value given by \p VRegs into stack starting at the 
offset
+

[llvm-branch-commits] [llvm] ae25a39 - AMDGPU/GlobalISel: Enable sret demotion

2021-01-07 Thread Christudasan Devadasan via llvm-branch-commits


Author: Christudasan Devadasan
Date: 2021-01-08T10:56:35+05:30
New Revision: ae25a397e9de833ffbd5d8e3b480086404625cb7

URL: 
https://github.com/llvm/llvm-project/commit/ae25a397e9de833ffbd5d8e3b480086404625cb7
DIFF: 
https://github.com/llvm/llvm-project/commit/ae25a397e9de833ffbd5d8e3b480086404625cb7.diff

LOG: AMDGPU/GlobalISel: Enable sret demotion

Added: 


Modified: 
llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h
llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-return-values.ll

Removed: 




diff  --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h 
b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
index dff73d185114..57ff3900ef25 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
@@ -358,8 +358,8 @@ class CallLowering {
   /// described by \p Outs can fit into the return registers. If false
   /// is returned, an sret-demotion is performed.
   virtual bool canLowerReturn(MachineFunction &MF, CallingConv::ID CallConv,
-  SmallVectorImpl &Outs, bool 
IsVarArg,
-  LLVMContext &Context) const {
+  SmallVectorImpl &Outs,
+  bool IsVarArg) const {
 return true;
   }
 

diff  --git a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp 
b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
index e1591a4bf19b..a6d4ea76 100644
--- a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
@@ -95,8 +95,7 @@ bool CallLowering::lowerCall(MachineIRBuilder &MIRBuilder, 
const CallBase &CB,
 
   SmallVector SplitArgs;
   getReturnInfo(CallConv, RetTy, CB.getAttributes(), SplitArgs, DL);
-  Info.CanLowerReturn =
-  canLowerReturn(MF, CallConv, SplitArgs, IsVarArg, RetTy->getContext());
+  Info.CanLowerReturn = canLowerReturn(MF, CallConv, SplitArgs, IsVarArg);
 
   if (!Info.CanLowerReturn) {
 // Callee requires sret demotion.
@@ -592,8 +591,7 @@ bool 
CallLowering::checkReturnTypeForCallConv(MachineFunction &MF) const {
   SmallVector SplitArgs;
   getReturnInfo(CallConv, ReturnType, F.getAttributes(), SplitArgs,
 MF.getDataLayout());
-  return canLowerReturn(MF, CallConv, SplitArgs, F.isVarArg(),
-ReturnType->getContext());
+  return canLowerReturn(MF, CallConv, SplitArgs, F.isVarArg());
 }
 
 bool CallLowering::analyzeArgInfo(CCState &CCState,

diff  --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
index 3b6e263ee6d8..b86052e3a14e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
@@ -20,6 +20,7 @@
 #include "SIMachineFunctionInfo.h"
 #include "SIRegisterInfo.h"
 #include "llvm/CodeGen/Analysis.h"
+#include "llvm/CodeGen/FunctionLoweringInfo.h"
 #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 
@@ -420,6 +421,22 @@ static void unpackRegsToOrigType(MachineIRBuilder &B,
   B.buildUnmerge(UnmergeResults, UnmergeSrc);
 }
 
+bool AMDGPUCallLowering::canLowerReturn(MachineFunction &MF,
+CallingConv::ID CallConv,
+SmallVectorImpl &Outs,
+bool IsVarArg) const {
+  // For shaders. Vector types should be explicitly handled by CC.
+  if (AMDGPU::isEntryFunctionCC(CallConv))
+return true;
+
+  SmallVector ArgLocs;
+  const SITargetLowering &TLI = *getTLI();
+  CCState CCInfo(CallConv, IsVarArg, MF, ArgLocs,
+ MF.getFunction().getContext());
+
+  return checkReturn(CCInfo, Outs, TLI.CCAssignFnForReturn(CallConv, 
IsVarArg));
+}
+
 /// Lower the return value for the already existing \p Ret. This assumes that
 /// \p B's insertion point is correct.
 bool AMDGPUCallLowering::lowerReturnVal(MachineIRBuilder &B,
@@ -533,7 +550,9 @@ bool AMDGPUCallLowering::lowerReturn(MachineIRBuilder &B, 
const Value *Val,
 Ret.addUse(ReturnAddrVReg);
   }
 
-  if (!lowerReturnVal(B, Val, VRegs, Ret))
+  if (!FLI.CanLowerReturn)
+insertSRetStores(B, Val->getType(), VRegs, FLI.DemoteRegister);
+  else if (!lowerReturnVal(B, Val, VRegs, Ret))
 return false;
 
   if (ReturnOpc == AMDGPU::S_SETPC_B64_return) {
@@ -872,6 +891,11 @@ bool AMDGPUCallLowering::lowerFormalArguments(
   unsigned Idx = 0;
   unsigned PSInputNum = 0;
 
+  // Insert the hidden sret parameter if the return value won't fit in the
+  // return registers.
+  if (!FLI.CanLowerReturn)
+insertSRetIncomingArgument(F, SplitArgs, FLI.DemoteRegister, MRI, DL);
+
   for (auto &Arg : F.args()) {

[llvm-branch-commits] [llvm] [MIR] Add missing noteNewVirtualRegister callbacks (PR #111634)

2024-10-08 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas commented:

The changes you made in llvm/lib/CodeGen/MIRParser/MIRParser.cpp are not 
related to this PR.
Create a separate NFC patch for it.

https://github.com/llvm/llvm-project/pull/111634
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILowerSGPRSpills] Updated the correct pass dependency (PR #109937)

2024-10-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/109937
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SILowerSGPRSpills] Updated the correct pass dependency. (PR #109937)

2024-10-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/109937
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108507

>From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 12 Sep 2024 23:38:09 +0530
Subject: [PATCH 1/3] [CodeGen][NewPM] Port machine trace metrics analysis to
 new pass manager.

---
 .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp|  8 +--
 llvm/lib/CodeGen/MachineCombiner.cpp  |  8 +--
 llvm/lib/CodeGen/MachineTraceMetrics.cpp  | 62 +++
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../AArch64/AArch64ConditionalCompares.cpp|  8 +--
 .../AArch64/AArch64StorePairSuppress.cpp  |  6 +-
 9 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h 
b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
index c7d97597d551cd..36718f80a1b6dd 100644
--- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
+++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
@@ -46,12 +46,13 @@
 #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H
 #define LLVM_CODEGEN_MACHINETRACEMETRICS_H
 
-#include "llvm/ADT/SparseSet.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/TargetSchedule.h"
 
 namespace llvm {
@@ -93,7 +94,7 @@ enum class MachineTraceStrategy {
   TS_NumStrategies
 };
 
-class MachineTraceMetrics : public MachineFunctionPass {
+class MachineTraceMetrics {
   const MachineFunction *MF = nullptr;
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }
 
-  MachineTraceMetrics();
+  explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI);
+  ~MachineTraceMetrics();
 
-  void getAnalysisUsage(AnalysisUsage&) const override;
-  bool runOnMachineFunction(MachineFunction&) override;
-  void releaseMemory() override;
-  void verifyAnalysis() const override;
+  void init(MachineFunction &Func, const MachineLoopInfo &LI);
+  void clear();
 
   /// Per-basic block information that doesn't depend on the trace through the
   /// block.
@@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass {
   /// Call Ensemble::getTrace() again to update any trace handles.
   void invalidate(const MachineBasicBlock *MBB);
 
+  /// Handle invalidation explicitly.
+  bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
+  MachineFunctionAnalysisManager::Invalidator &);
+
+  void verifyAnalysis() const;
+
 private:
   // One entry per basic block, indexed by block number.
   SmallVector BlockInfo;
@@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS,
   return OS;
 }
 
+class MachineTraceMetricsAnalysis
+: public AnalysisInfoMixin {
+  friend AnalysisInfoMixin;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineTraceMetrics;
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+/// Verifier pass for \c MachineTraceMetrics.
+struct MachineTraceMetricsVerifierPass
+: PassInfoMixin {
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
+};
+
+class MachineTraceMetricsWrapperPass : public MachineFunctionPass {
+public:
+  static char ID;
+  MachineTraceMetrics MTM;
+
+  MachineTraceMetricsWrapperPass();
+
+  void getAnalysisUsage(AnalysisUsage &) const override;
+  bool runOnMachineFunction(MachineFunction &) override;
+  void releaseMemory() override { MTM.clear(); }
+  void verifyAnalysis() const override { MTM.verifyAnalysis(); }
+  MachineTraceMetrics &getMTM() { return MTM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 6a75dc0285cc61..5ed0ad98a2a72d 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &);
 void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &);
 void initializeMachineSchedulerPass(PassRegistry &);
 void initializeMachineSinkingPass(PassRegistry &);
-void initializeMachineTraceMetricsPass(PassRegistry &);

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108507

>From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 12 Sep 2024 23:38:09 +0530
Subject: [PATCH 1/3] [CodeGen][NewPM] Port machine trace metrics analysis to
 new pass manager.

---
 .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp|  8 +--
 llvm/lib/CodeGen/MachineCombiner.cpp  |  8 +--
 llvm/lib/CodeGen/MachineTraceMetrics.cpp  | 62 +++
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../AArch64/AArch64ConditionalCompares.cpp|  8 +--
 .../AArch64/AArch64StorePairSuppress.cpp  |  6 +-
 9 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h 
b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
index c7d97597d551cd..36718f80a1b6dd 100644
--- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
+++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
@@ -46,12 +46,13 @@
 #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H
 #define LLVM_CODEGEN_MACHINETRACEMETRICS_H
 
-#include "llvm/ADT/SparseSet.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/TargetSchedule.h"
 
 namespace llvm {
@@ -93,7 +94,7 @@ enum class MachineTraceStrategy {
   TS_NumStrategies
 };
 
-class MachineTraceMetrics : public MachineFunctionPass {
+class MachineTraceMetrics {
   const MachineFunction *MF = nullptr;
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }
 
-  MachineTraceMetrics();
+  explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI);
+  ~MachineTraceMetrics();
 
-  void getAnalysisUsage(AnalysisUsage&) const override;
-  bool runOnMachineFunction(MachineFunction&) override;
-  void releaseMemory() override;
-  void verifyAnalysis() const override;
+  void init(MachineFunction &Func, const MachineLoopInfo &LI);
+  void clear();
 
   /// Per-basic block information that doesn't depend on the trace through the
   /// block.
@@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass {
   /// Call Ensemble::getTrace() again to update any trace handles.
   void invalidate(const MachineBasicBlock *MBB);
 
+  /// Handle invalidation explicitly.
+  bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
+  MachineFunctionAnalysisManager::Invalidator &);
+
+  void verifyAnalysis() const;
+
 private:
   // One entry per basic block, indexed by block number.
   SmallVector BlockInfo;
@@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS,
   return OS;
 }
 
+class MachineTraceMetricsAnalysis
+: public AnalysisInfoMixin {
+  friend AnalysisInfoMixin;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineTraceMetrics;
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+/// Verifier pass for \c MachineTraceMetrics.
+struct MachineTraceMetricsVerifierPass
+: PassInfoMixin {
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
+};
+
+class MachineTraceMetricsWrapperPass : public MachineFunctionPass {
+public:
+  static char ID;
+  MachineTraceMetrics MTM;
+
+  MachineTraceMetricsWrapperPass();
+
+  void getAnalysisUsage(AnalysisUsage &) const override;
+  bool runOnMachineFunction(MachineFunction &) override;
+  void releaseMemory() override { MTM.clear(); }
+  void verifyAnalysis() const override { MTM.verifyAnalysis(); }
+  MachineTraceMetrics &getMTM() { return MTM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 6a75dc0285cc61..5ed0ad98a2a72d 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &);
 void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &);
 void initializeMachineSchedulerPass(PassRegistry &);
 void initializeMachineSinkingPass(PassRegistry &);
-void initializeMachineTraceMetricsPass(PassRegistry &);

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108508

>From 9f27fd5fbaa0c9a9075d074d1915ea0cc65e3b07 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index 99421bdf769ffa..bbbf99626098a6 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 5ed0ad98a2a72d..1374880b6a716b 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 1d7084354455c7..15c6e2de6488c4 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION_P

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108508

>From 9f27fd5fbaa0c9a9075d074d1915ea0cc65e3b07 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index 99421bdf769ffa..bbbf99626098a6 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 5ed0ad98a2a72d..1374880b6a716b 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 1d7084354455c7..15c6e2de6488c4 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION_P

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108514

>From 10462c6c2e6b087575d1f8f0c94c38ddebb013a9 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 13:53:01 +0530
Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts.

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 8a3c1a92a63a2d..affb8a2654c1ab 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass 
&addPass) const {
   addPass(RequireAnalysisPass());
 }
 
+void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const {
+  if (EnableEarlyIfConversion)
+addPass(EarlyIfConverterPass());
+
+  Base::addILPOpts(addPass);
+}
+
 void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass,
  CreateMCStreamer) const {
   // TODO: Add AsmPrinter.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
index af8476bc21ec61..d8a5111e5898d7 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
@@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder
   void addIRPasses(AddIRPass &) const;
   void addCodeGenPrepare(AddIRPass &) const;
   void addPreISel(AddIRPass &addPass) const;
+  void addILPOpts(AddMachinePass &) const;
   void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const;
   Error addInstSelector(AddMachinePass &) const;
   void addMachineSSAOptimization(AddMachinePass &) const;

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108507

>From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 12 Sep 2024 23:38:09 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port machine trace metrics analysis to
 new pass manager.

---
 .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp|  8 +--
 llvm/lib/CodeGen/MachineCombiner.cpp  |  8 +--
 llvm/lib/CodeGen/MachineTraceMetrics.cpp  | 62 +++
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../AArch64/AArch64ConditionalCompares.cpp|  8 +--
 .../AArch64/AArch64StorePairSuppress.cpp  |  6 +-
 9 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h 
b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
index c7d97597d551cd..36718f80a1b6dd 100644
--- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
+++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
@@ -46,12 +46,13 @@
 #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H
 #define LLVM_CODEGEN_MACHINETRACEMETRICS_H
 
-#include "llvm/ADT/SparseSet.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/TargetSchedule.h"
 
 namespace llvm {
@@ -93,7 +94,7 @@ enum class MachineTraceStrategy {
   TS_NumStrategies
 };
 
-class MachineTraceMetrics : public MachineFunctionPass {
+class MachineTraceMetrics {
   const MachineFunction *MF = nullptr;
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }
 
-  MachineTraceMetrics();
+  explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI);
+  ~MachineTraceMetrics();
 
-  void getAnalysisUsage(AnalysisUsage&) const override;
-  bool runOnMachineFunction(MachineFunction&) override;
-  void releaseMemory() override;
-  void verifyAnalysis() const override;
+  void init(MachineFunction &Func, const MachineLoopInfo &LI);
+  void clear();
 
   /// Per-basic block information that doesn't depend on the trace through the
   /// block.
@@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass {
   /// Call Ensemble::getTrace() again to update any trace handles.
   void invalidate(const MachineBasicBlock *MBB);
 
+  /// Handle invalidation explicitly.
+  bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
+  MachineFunctionAnalysisManager::Invalidator &);
+
+  void verifyAnalysis() const;
+
 private:
   // One entry per basic block, indexed by block number.
   SmallVector BlockInfo;
@@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS,
   return OS;
 }
 
+class MachineTraceMetricsAnalysis
+: public AnalysisInfoMixin {
+  friend AnalysisInfoMixin;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineTraceMetrics;
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+/// Verifier pass for \c MachineTraceMetrics.
+struct MachineTraceMetricsVerifierPass
+: PassInfoMixin {
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
+};
+
+class MachineTraceMetricsWrapperPass : public MachineFunctionPass {
+public:
+  static char ID;
+  MachineTraceMetrics MTM;
+
+  MachineTraceMetricsWrapperPass();
+
+  void getAnalysisUsage(AnalysisUsage &) const override;
+  bool runOnMachineFunction(MachineFunction &) override;
+  void releaseMemory() override { MTM.clear(); }
+  void verifyAnalysis() const override { MTM.verifyAnalysis(); }
+  MachineTraceMetrics &getMTM() { return MTM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 6a75dc0285cc61..5ed0ad98a2a72d 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &);
 void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &);
 void initializeMachineSchedulerPass(PassRegistry &);
 void initializeMachineSinkingPass(PassRegistry &);
-void initializeMachineTraceMetricsPass(PassRegistry &);

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108508

>From dc7a71f17436ab0ce96546cef7639ef2cffd173f Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index 99421bdf769ffa..bbbf99626098a6 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 5ed0ad98a2a72d..1374880b6a716b 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 1d7084354455c7..15c6e2de6488c4 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION_P

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108514

>From 0be0bb79f9e6bae184905f753bc5bd5c73eb1e56 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 13:53:01 +0530
Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts.

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 8a3c1a92a63a2d..affb8a2654c1ab 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass 
&addPass) const {
   addPass(RequireAnalysisPass());
 }
 
+void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const {
+  if (EnableEarlyIfConversion)
+addPass(EarlyIfConverterPass());
+
+  Base::addILPOpts(addPass);
+}
+
 void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass,
  CreateMCStreamer) const {
   // TODO: Add AsmPrinter.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
index af8476bc21ec61..d8a5111e5898d7 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
@@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder
   void addIRPasses(AddIRPass &) const;
   void addCodeGenPrepare(AddIRPass &) const;
   void addPreISel(AddIRPass &addPass) const;
+  void addILPOpts(AddMachinePass &) const;
   void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const;
   Error addInstSelector(AddMachinePass &) const;
   void addMachineSSAOptimization(AddMachinePass &) const;

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108508

>From f692bb7b707610ebd1ad5f61943802aaef723f67 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 12:22:03 +0530
Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM.

---
 llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++
 llvm/include/llvm/CodeGen/Passes.h|  2 +-
 llvm/include/llvm/InitializePasses.h  |  2 +-
 llvm/include/llvm/Passes/CodeGenPassBuilder.h |  1 +
 .../llvm/Passes/MachinePassRegistry.def   |  2 +-
 llvm/lib/CodeGen/CodeGen.cpp  |  2 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++-
 llvm/lib/CodeGen/TargetPassConfig.cpp |  4 +-
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../Target/AArch64/AArch64TargetMachine.cpp   |  2 +-
 .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp |  2 +-
 llvm/lib/Target/PowerPC/PPCTargetMachine.cpp  |  2 +-
 .../Target/SystemZ/SystemZTargetMachine.cpp   |  2 +-
 llvm/lib/Target/X86/X86TargetMachine.cpp  |  2 +-
 .../early-ifcvt-likely-predictable.mir|  1 +
 .../AArch64/early-ifcvt-regclass-mismatch.mir |  1 +
 .../AArch64/early-ifcvt-same-value.mir|  1 +
 .../CodeGen/PowerPC/early-ifcvt-no-isel.mir   |  2 +
 18 files changed, 102 insertions(+), 30 deletions(-)
 create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h

diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h 
b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
new file mode 100644
index 00..78bf12ade02c3d
--- /dev/null
+++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h
@@ -0,0 +1,24 @@
+//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ 
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H
+#define LLVM_CODEGEN_EARLYIFCONVERSION_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+
+class EarlyIfConverterPass : public PassInfoMixin {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H
diff --git a/llvm/include/llvm/CodeGen/Passes.h 
b/llvm/include/llvm/CodeGen/Passes.h
index 99421bdf769ffa..bbbf99626098a6 100644
--- a/llvm/include/llvm/CodeGen/Passes.h
+++ b/llvm/include/llvm/CodeGen/Passes.h
@@ -273,7 +273,7 @@ namespace llvm {
 
   /// EarlyIfConverter - This pass performs if-conversion on SSA form by
   /// inserting cmov instructions.
-  extern char &EarlyIfConverterID;
+  extern char &EarlyIfConverterLegacyID;
 
   /// EarlyIfPredicator - This pass performs if-conversion on SSA form by
   /// predicating if/else block and insert select at the join point.
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 5ed0ad98a2a72d..1374880b6a716b 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &);
 void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &);
 void initializeEarlyCSELegacyPassPass(PassRegistry &);
 void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &);
-void initializeEarlyIfConverterPass(PassRegistry &);
+void initializeEarlyIfConverterLegacyPass(PassRegistry &);
 void initializeEarlyIfPredicatorPass(PassRegistry &);
 void initializeEarlyMachineLICMPass(PassRegistry &);
 void initializeEarlyTailDuplicatePass(PassRegistry &);
diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h 
b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644
--- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h
+++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h
@@ -27,6 +27,7 @@
 #include "llvm/CodeGen/CodeGenPrepare.h"
 #include "llvm/CodeGen/DeadMachineInstructionElim.h"
 #include "llvm/CodeGen/DwarfEHPrepare.h"
+#include "llvm/CodeGen/EarlyIfConversion.h"
 #include "llvm/CodeGen/ExpandLargeDivRem.h"
 #include "llvm/CodeGen/ExpandLargeFpConvert.h"
 #include "llvm/CodeGen/ExpandMemCmp.h"
diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def 
b/llvm/include/llvm/Passes/MachinePassRegistry.def
index 1d7084354455c7..15c6e2de6488c4 100644
--- a/llvm/include/llvm/Passes/MachinePassRegistry.def
+++ b/llvm/include/llvm/Passes/MachinePassRegistry.def
@@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", 
SlotIndexesAnalysis())
 #ifndef MACHINE_FUNCTION_PASS
 #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS)
 #endif
+MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass())
 MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass())
 MACHINE_FUNCTION_P

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108507

>From eabec10a9a7daf537006ff5d580b798a285e33e3 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Thu, 12 Sep 2024 23:38:09 +0530
Subject: [PATCH 1/4] [CodeGen][NewPM] Port machine trace metrics analysis to
 new pass manager.

---
 .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++---
 llvm/include/llvm/InitializePasses.h  |  2 +-
 .../llvm/Passes/MachinePassRegistry.def   |  4 +-
 llvm/lib/CodeGen/EarlyIfConversion.cpp|  8 +--
 llvm/lib/CodeGen/MachineCombiner.cpp  |  8 +--
 llvm/lib/CodeGen/MachineTraceMetrics.cpp  | 62 +++
 llvm/lib/Passes/PassBuilder.cpp   |  1 +
 .../AArch64/AArch64ConditionalCompares.cpp|  8 +--
 .../AArch64/AArch64StorePairSuppress.cpp  |  6 +-
 9 files changed, 119 insertions(+), 38 deletions(-)

diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h 
b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
index c7d97597d551cd..36718f80a1b6dd 100644
--- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
+++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h
@@ -46,12 +46,13 @@
 #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H
 #define LLVM_CODEGEN_MACHINETRACEMETRICS_H
 
-#include "llvm/ADT/SparseSet.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/ADT/SparseSet.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunctionPass.h"
+#include "llvm/CodeGen/MachinePassManager.h"
 #include "llvm/CodeGen/TargetSchedule.h"
 
 namespace llvm {
@@ -93,7 +94,7 @@ enum class MachineTraceStrategy {
   TS_NumStrategies
 };
 
-class MachineTraceMetrics : public MachineFunctionPass {
+class MachineTraceMetrics {
   const MachineFunction *MF = nullptr;
   const TargetInstrInfo *TII = nullptr;
   const TargetRegisterInfo *TRI = nullptr;
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }
 
-  MachineTraceMetrics();
+  explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI);
+  ~MachineTraceMetrics();
 
-  void getAnalysisUsage(AnalysisUsage&) const override;
-  bool runOnMachineFunction(MachineFunction&) override;
-  void releaseMemory() override;
-  void verifyAnalysis() const override;
+  void init(MachineFunction &Func, const MachineLoopInfo &LI);
+  void clear();
 
   /// Per-basic block information that doesn't depend on the trace through the
   /// block.
@@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass {
   /// Call Ensemble::getTrace() again to update any trace handles.
   void invalidate(const MachineBasicBlock *MBB);
 
+  /// Handle invalidation explicitly.
+  bool invalidate(MachineFunction &, const PreservedAnalyses &PA,
+  MachineFunctionAnalysisManager::Invalidator &);
+
+  void verifyAnalysis() const;
+
 private:
   // One entry per basic block, indexed by block number.
   SmallVector BlockInfo;
@@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS,
   return OS;
 }
 
+class MachineTraceMetricsAnalysis
+: public AnalysisInfoMixin {
+  friend AnalysisInfoMixin;
+  static AnalysisKey Key;
+
+public:
+  using Result = MachineTraceMetrics;
+  Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM);
+};
+
+/// Verifier pass for \c MachineTraceMetrics.
+struct MachineTraceMetricsVerifierPass
+: PassInfoMixin {
+  PreservedAnalyses run(MachineFunction &MF,
+MachineFunctionAnalysisManager &MFAM);
+  static bool isRequired() { return true; }
+};
+
+class MachineTraceMetricsWrapperPass : public MachineFunctionPass {
+public:
+  static char ID;
+  MachineTraceMetrics MTM;
+
+  MachineTraceMetricsWrapperPass();
+
+  void getAnalysisUsage(AnalysisUsage &) const override;
+  bool runOnMachineFunction(MachineFunction &) override;
+  void releaseMemory() override { MTM.clear(); }
+  void verifyAnalysis() const override { MTM.verifyAnalysis(); }
+  MachineTraceMetrics &getMTM() { return MTM; }
+};
+
 } // end namespace llvm
 
 #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H
diff --git a/llvm/include/llvm/InitializePasses.h 
b/llvm/include/llvm/InitializePasses.h
index 6a75dc0285cc61..5ed0ad98a2a72d 100644
--- a/llvm/include/llvm/InitializePasses.h
+++ b/llvm/include/llvm/InitializePasses.h
@@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &);
 void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &);
 void initializeMachineSchedulerPass(PassRegistry &);
 void initializeMachineSinkingPass(PassRegistry &);
-void initializeMachineTraceMetricsPass(PassRegistry &);

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-10-15 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas updated 
https://github.com/llvm/llvm-project/pull/108514

>From 037b15166e575b59e115795e5b4fd8f4065b4483 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Fri, 13 Sep 2024 13:53:01 +0530
Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts.

---
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++
 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h   | 1 +
 2 files changed, 8 insertions(+)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 8a3c1a92a63a2d..affb8a2654c1ab 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass 
&addPass) const {
   addPass(RequireAnalysisPass());
 }
 
+void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const {
+  if (EnableEarlyIfConversion)
+addPass(EarlyIfConverterPass());
+
+  Base::addILPOpts(addPass);
+}
+
 void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass,
  CreateMCStreamer) const {
   // TODO: Add AsmPrinter.
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h 
b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
index af8476bc21ec61..d8a5111e5898d7 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h
@@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder
   void addIRPasses(AddIRPass &) const;
   void addCodeGenPrepare(AddIRPass &) const;
   void addPreISel(AddIRPass &addPass) const;
+  void addILPOpts(AddMachinePass &) const;
   void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const;
   Error addInstSelector(AddMachinePass &) const;
   void addMachineSSAOptimization(AddMachinePass &) const;

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-10-08 Thread Christudasan Devadasan via llvm-branch-commits



@@ -684,8 +684,8 @@ class SIMachineFunctionInfo final : public 
AMDGPUMachineFunction,
 
   void setFlag(Register Reg, uint8_t Flag) {
 assert(Reg.isVirtual());
-if (VRegFlags.inBounds(Reg))
-  VRegFlags[Reg] |= Flag;
+VRegFlags.grow(Reg);

cdevadas wrote:

This change doesn't make sense to me. What will happen to the regular flow when 
it reaches from MRI createVirtualRegister? Isn't duplicating the size?
https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/MachineRegisterInfo.cpp#L166

https://github.com/llvm/llvm-project/pull/110229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-10-09 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/110229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-10-16 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108507).


https://github.com/llvm/llvm-project/pull/108507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)

2024-10-16 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108514).


https://github.com/llvm/llvm-project/pull/108514
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)

2024-10-16 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

### Merge activity

* **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull 
request via 
[Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108508).


https://github.com/llvm/llvm-project/pull/108508
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)

2024-09-23 Thread Christudasan Devadasan via llvm-branch-commits



@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass {
   TargetSchedModel SchedModel;
 
 public:
+  friend class MachineTraceMetricsWrapperPass;
   friend class Ensemble;
   friend class Trace;
 
   class Ensemble;
 
-  static char ID;
+  // For legacy pass.
+  MachineTraceMetrics() {
+std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr);
+  }

cdevadas wrote:

It isn't possible to move out the Ensembles pointer initialization from the 
constructors. Certain tests crashed while the destructor invokes clear() that 
tries to delete the Ensemble pointers (some garbage value). The default 
constructor for these tests doesn't appropriately clear the object's members. 
The tests that crashed don't contain any function definitions, but only some 
global declarations. So the run() instances won't be invoked for clearing these 
pointers. Also, they should point to the dynamically allocated addresses 
(otherwise the initial null addresses) before the destructor is invoked.


https://github.com/llvm/llvm-project/pull/108507
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-09-27 Thread Christudasan Devadasan via llvm-branch-commits



@@ -684,8 +684,8 @@ class SIMachineFunctionInfo final : public 
AMDGPUMachineFunction,
 
   void setFlag(Register Reg, uint8_t Flag) {
 assert(Reg.isVirtual());
-if (VRegFlags.inBounds(Reg))
-  VRegFlags[Reg] |= Flag;
+VRegFlags.grow(Reg);

cdevadas wrote:

I guess, this change is unnecessary. Keep the inbounds check back.
The following change in the other patch would make sure to grow VRegFlags for 
each virtual register it encountered.
https://github.com/llvm/llvm-project/pull/110228/files#diff-c72079b2a595aca3300d5e3c15d227f81937f2745f7c5494fcf1fe9ba37d8828R1789

https://github.com/llvm/llvm-project/pull/110229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)

2024-09-27 Thread Christudasan Devadasan via llvm-branch-commits



@@ -0,0 +1,16 @@
+# RUN: llc -mtriple=amdgcn -run-pass=none -o - %s | FileCheck %s

cdevadas wrote:

Mostly, the tests related to serialize options go in 
llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir.
Additionally, a negative test is required for this serialized option and that 
should be a separate test like 
llvm/test/CodeGen/MIR/AMDGPU/sgpr-for-exec-copy-invalid-reg.mir

https://github.com/llvm/llvm-project/pull/110229
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)

2024-11-07 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/114745
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)

2024-11-07 Thread Christudasan Devadasan via llvm-branch-commits



@@ -151,6 +152,7 @@ MACHINE_FUNCTION_PASS("print",
 MACHINE_FUNCTION_PASS("print",
   MachineDominatorTreePrinterPass(dbgs()))
 MACHINE_FUNCTION_PASS("print", MachineLoopPrinterPass(dbgs()))
+MACHINE_FUNCTION_PASS("print", 
MachineCycleInfoPrinterPass(dbgs()))

cdevadas wrote:

Alphabetical order.

https://github.com/llvm/llvm-project/pull/114745
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)

2024-11-07 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas commented:

I guess this PR is for porting MachineCycleAnalysis. The run interface for 
legacy MachineCycle analysis is missing. The porting for the Print interface 
seems complete.
Also, no test/runline to validate the MachineCycle analysis in the NPM path. I 
see only the runline for the print interface.

https://github.com/llvm/llvm-project/pull/114745
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2024-11-25 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas created 
https://github.com/llvm/llvm-project/pull/117544

The COPY inserted for liverange split during sgpr-regalloc
pipeline currently breaks the BB prolog during the subsequent
vgpr-regalloc phase while spilling and/or splitting the vector
liveranges. This patch fixes it by correctly including the
the LR split instructions during sgpr-regalloc and wwm-regalloc
pipelines into the BB prolog.

>From 2e2b216d58e525e7dd65245ae2ee8b86750564c8 Mon Sep 17 00:00:00 2001
From: Christudasan Devadasan 
Date: Tue, 19 Nov 2024 12:14:05 +0530
Subject: [PATCH] [AMDGPU] Add liverange split instructions into BB Prolog

The COPY inserted for liverange split during sgpr-regalloc
pipeline currently breaks the BB prolog during the subsequent
vgpr-regalloc phase while spilling and/or splitting the vector
liveranges. This patch fixes it by correctly including the
the LR split instructions during sgpr-regalloc and wwm-regalloc
pipelines into the BB prolog.
---
 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp|  34 -
 llvm/lib/Target/AMDGPU/SIInstrInfo.h  |   2 +
 .../identical-subrange-spill-infloop.ll   | 116 
 .../ran-out-of-sgprs-allocation-failure.mir   | 128 +-
 4 files changed, 148 insertions(+), 132 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 4a94d690297949..204a575e2f64c1 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -8956,6 +8956,30 @@ unsigned SIInstrInfo::getLiveRangeSplitOpcode(Register 
SrcReg,
   return AMDGPU::COPY;
 }
 
+bool SIInstrInfo::canAddToBBProlog(const MachineInstr &MI) const {
+  uint16_t Opcode = MI.getOpcode();
+  // Check if it is SGPR spill or wwm-register spill Opcode.
+  if (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode))
+return true;
+
+  const MachineFunction *MF = MI.getMF();
+  const MachineRegisterInfo &MRI = MF->getRegInfo();
+  const SIMachineFunctionInfo *MFI = MF->getInfo();
+
+  // See if this is Liverange split instruction inserted for SGPR or
+  // wwm-register. The implicit def inserted for wwm-registers should also be
+  // included as they can appear at the bb begin.
+  bool IsLRSplitInst = MI.getFlag(MachineInstr::LRSplit);
+  if (!IsLRSplitInst && Opcode != AMDGPU::IMPLICIT_DEF)
+return false;
+
+  Register Reg = MI.getOperand(0).getReg();
+  if (RI.isSGPRClass(RI.getRegClassForReg(MRI, Reg)))
+return IsLRSplitInst;
+
+  return MFI->isWWMReg(Reg);
+}
+
 bool SIInstrInfo::isBasicBlockPrologue(const MachineInstr &MI,
Register Reg) const {
   // We need to handle instructions which may be inserted during register
@@ -8964,20 +8988,16 @@ bool SIInstrInfo::isBasicBlockPrologue(const 
MachineInstr &MI,
   // needed by the prolog. However, the insertions for scalar registers can
   // always be placed at the BB top as they are independent of the exec mask
   // value.
-  const MachineFunction *MF = MI.getParent()->getParent();
   bool IsNullOrVectorRegister = true;
   if (Reg) {
+const MachineFunction *MF = MI.getMF();
 const MachineRegisterInfo &MRI = MF->getRegInfo();
 IsNullOrVectorRegister = !RI.isSGPRClass(RI.getRegClassForReg(MRI, Reg));
   }
 
-  uint16_t Opcode = MI.getOpcode();
-  const SIMachineFunctionInfo *MFI = MF->getInfo();
   return IsNullOrVectorRegister &&
- (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode) ||
-  (Opcode == AMDGPU::IMPLICIT_DEF &&
-   MFI->isWWMReg(MI.getOperand(0).getReg())) ||
-  (!MI.isTerminator() && Opcode != AMDGPU::COPY &&
+ (canAddToBBProlog(MI) ||
+  (!MI.isTerminator() && MI.getOpcode() != AMDGPU::COPY &&
MI.modifiesRegister(AMDGPU::EXEC, &RI)));
 }
 
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h 
b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index e55418326a4bd0..ea1d16784645e1 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -1348,6 +1348,8 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
   bool isBasicBlockPrologue(const MachineInstr &MI,
 Register Reg = Register()) const override;
 
+  bool canAddToBBProlog(const MachineInstr &MI) const;
+
   MachineInstr *createPHIDestinationCopy(MachineBasicBlock &MBB,
  MachineBasicBlock::iterator InsPt,
  const DebugLoc &DL, Register Src,
diff --git a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll 
b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
index 5dff660912e402..d7c38f26957677 100644
--- a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
+++ b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
@@ -176,39 +176,39 @@ define void @main(i1 %arg) #0 {
 ; CHECK-NEXT:v_readlane_b32 s17, v7, 37
 ; CHECK-NEXT:v_readlane_b32 s18, v7, 38
 ; CHECK-NEXT:v_readlane_b

[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2024-11-25 Thread Christudasan Devadasan via llvm-branch-commits


cdevadas wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack  href="https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-downstack-mergeability-warning";
>  >on Graphite.
> https://graphite.dev/docs/merge-pull-requests";>Learn more

* **#117544** https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/> 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-view-in-graphite";
 target="_blank">(View in Graphite)
* **#117543** https://app.graphite.dev/github/pr/llvm/llvm-project/117543?utm_source=stack-comment-icon";
 target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" 
width="10px" height="10px"/>
* `main`



This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn 
more about https://stacking.dev/?utm_source=stack-comment";>stacking.


https://github.com/llvm/llvm-project/pull/117544
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)

2024-11-25 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas ready_for_review 
https://github.com/llvm/llvm-project/pull/117544
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas edited 
https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-14 Thread Christudasan Devadasan via llvm-branch-commits


https://github.com/cdevadas commented:

I liked the general direction. Do you have at least one analysis pass ported 
using this Getter?

https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-14 Thread Christudasan Devadasan via llvm-branch-commits



@@ -236,6 +238,82 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager

cdevadas wrote:

Full stop at the end of the comment.

https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)

2024-11-14 Thread Christudasan Devadasan via llvm-branch-commits



@@ -236,6 +238,82 @@ using MachineFunctionPassManager = 
PassManager;
 /// preserve.
 PreservedAnalyses getMachineFunctionPassPreservedAnalyses();
 
+/// For migrating to new pass manager
+/// Provides a common interface to fetch analyses instead of doing it twice in
+/// the *LegacyPass::runOnMachineFunction and NPM Pass::run.
+///
+/// NPM analyses must have the LegacyWrapper type to indicate which legacy
+/// analysis to run. Legacy wrapper analyses must have `getResult()` method.
+/// This can be added on a needs-to basis.
+///
+/// Outer analyses passes(Module or Function) can also be requested through
+/// `getAnalysis` or `getCachedAnalysis`.
+class MFAnalysisGetter {
+private:
+  Pass *LegacyPass = nullptr;
+  MachineFunctionAnalysisManager *MFAM = nullptr;
+
+  template 
+  using type_of_run =
+  typename function_traits::template arg_t<0>;
+
+  template 
+  static constexpr bool IsFunctionAnalysis =
+  std::is_same_v>;
+
+  template 
+  static constexpr bool IsModuleAnalysis =
+  std::is_same_v>;
+
+public:
+  MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {}
+  MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {}
+
+  /// Outer analyses requested from NPM will be cached results and can be null
+  template 
+  typename AnalysisT::Result *getAnalysis(MachineFunction &MF) {
+if (MFAM) {
+  // need a proxy to get the result for outer analyses

cdevadas wrote:

Capitalize the first character.

https://github.com/llvm/llvm-project/pull/116166
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

1 2 3 >

1 - 100 of 222 matches

Mail list logo