[llvm-branch-commits] [clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)
@@ -2331,40 +2337,74 @@ static Value *upgradeARMIntrinsicCall(StringRef Name, CallBase *CI, Function *F, llvm_unreachable("Unknown function for ARM CallBase upgrade."); } +// These are expected to have have the arguments: cdevadas wrote: ```suggestion // These are expected to have the arguments: ``` https://github.com/llvm/llvm-project/pull/95396 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#96163** https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#96162** https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#96161** https://app.graphite.dev/github/pr/llvm/llvm-project/96161?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#96163** https://app.graphite.dev/github/pr/llvm/llvm-project/96163?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#96162** https://app.graphite.dev/github/pr/llvm/llvm-project/96162?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#96161** https://app.graphite.dev/github/pr/llvm/llvm-project/96161?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
cdevadas wrote: > This looks like it is affecting codegen even when xnack is disabled? That > should not happen. It shouldn't. I put the xnack replay subtarget check before using *_ec equivalents. See the code here: https://github.com/llvm/llvm-project/commit/65eb44327cf32a83dbbf13eb70f9d8c03f3efaef#diff-35f4d1b6c4c17815f6989f86abbac2e606ca760f9d93f501ff503449048bf760R1735 https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
@@ -867,13 +867,104 @@ def SMRDBufferImm : ComplexPattern; def SMRDBufferImm32 : ComplexPattern; def SMRDBufferSgprImm : ComplexPattern; +class SMRDAlignedLoadPat : PatFrag <(ops node:$ptr), (Op node:$ptr), [{ + // Returns true if it is a naturally aligned multi-dword load. + LoadSDNode *Ld = cast(N); + unsigned Size = Ld->getMemoryVT().getStoreSize(); + return (Size <= 4) || (Ld->getAlign().value() >= PowerOf2Ceil(Size)); cdevadas wrote: For the DWORDX3 case. Even though the size is 12 Bytes, its natural alignment is 16 Bytes. https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
@@ -886,26 +977,17 @@ multiclass SMRD_Pattern { def : GCNPat < (smrd_load (SMRDSgpr i64:$sbase, i32:$soffset)), (vt (!cast(Instr#"_SGPR") $sbase, $soffset, 0))> { -let OtherPredicates = [isNotGFX9Plus]; - } - def : GCNPat < -(smrd_load (SMRDSgpr i64:$sbase, i32:$soffset)), -(vt (!cast(Instr#"_SGPR_IMM") $sbase, $soffset, 0, 0))> { -let OtherPredicates = [isGFX9Plus]; +let OtherPredicates = [isGFX6GFX7]; } - // 4. SGPR+IMM offset + // 4. No offset def : GCNPat < -(smrd_load (SMRDSgprImm i64:$sbase, i32:$soffset, i32:$offset)), -(vt (!cast(Instr#"_SGPR_IMM") $sbase, $soffset, $offset, 0))> { -let OtherPredicates = [isGFX9Plus]; +(vt (smrd_load (i64 SReg_64:$sbase))), +(vt (!cast(Instr#"_IMM") i64:$sbase, 0, 0))> { +let OtherPredicates = [isGFX6GFX7]; } - // 5. No offset - def : GCNPat < -(vt (smrd_load (i64 SReg_64:$sbase))), -(vt (!cast(Instr#"_IMM") i64:$sbase, 0, 0)) - >; + defm : SMRD_Align_Pattern; cdevadas wrote: I was using the predicate for gfx8+ which has the xnack replay support enabled. I should instead check if the xnack is on. Will change it. https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
@@ -867,13 +867,104 @@ def SMRDBufferImm : ComplexPattern; def SMRDBufferImm32 : ComplexPattern; def SMRDBufferSgprImm : ComplexPattern; +class SMRDAlignedLoadPat : PatFrag <(ops node:$ptr), (Op node:$ptr), [{ + // Returns true if it is a naturally aligned multi-dword load. + LoadSDNode *Ld = cast(N); + unsigned Size = Ld->getMemoryVT().getStoreSize(); + return (Size <= 4) || (Ld->getAlign().value() >= PowerOf2Ceil(Size)); cdevadas wrote: Isn't it 12 >= 16 or 16 >=16? The PowerOf2Ceil intends to catch the first case where the specified Align is smaller than its natural alignment. https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const CombineInfo &CI, return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM; } case S_LOAD_IMM: -switch (Width) { -default: - return 0; -case 2: - return AMDGPU::S_LOAD_DWORDX2_IMM; -case 3: - return AMDGPU::S_LOAD_DWORDX3_IMM; -case 4: - return AMDGPU::S_LOAD_DWORDX4_IMM; -case 8: - return AMDGPU::S_LOAD_DWORDX8_IMM; +// For targets that support XNACK replay, use the constrained load opcode. +if (STI && STI->hasXnackReplay()) { + switch (Width) { cdevadas wrote: I guess currently the merged load is always under-aligned. While combining the MMOs, currently the alignment is picked from the first MMO and that'd definitely be smaller than the natural align requirement for the new load. That's one reason I conservatively want to emit _ec equivalent when XNACK is enabled. https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const CombineInfo &CI, return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM; } case S_LOAD_IMM: -switch (Width) { -default: - return 0; -case 2: - return AMDGPU::S_LOAD_DWORDX2_IMM; -case 3: - return AMDGPU::S_LOAD_DWORDX3_IMM; -case 4: - return AMDGPU::S_LOAD_DWORDX4_IMM; -case 8: - return AMDGPU::S_LOAD_DWORDX8_IMM; +// For targets that support XNACK replay, use the constrained load opcode. +if (STI && STI->hasXnackReplay()) { + switch (Width) { cdevadas wrote: > > currently the alignment is picked from the first MMO and that'd definitely > > be smaller than the natural align requirement for the new load > > You don't know that - the alignment in the first MMO will be whatever > alignment the compiler could deduce, which could be large, e.g. if the > pointer used for the first load was known to have a large alignment. Are you suggesting to check the alignment in the first MMO and see if it is still the preferred alignment for the merge-load? Use the _ec if the alignment is found to be smaller than the expected value. https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96934)
@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) { return Changed; } +static bool shouldOptimizeForType(Type *Ty) { + switch (Ty->getTypeID()) { + case Type::FloatTyID: + case Type::DoubleTyID: +return true; + case Type::IntegerTyID: { +if (Ty->getIntegerBitWidth() == 32 || Ty->getIntegerBitWidth() == 64) cdevadas wrote: Get Ty->getIntegerBitWidth() just once outside? https://github.com/llvm/llvm-project/pull/96934 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr addrspace(3) %ptr, <2 x half> define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) %ptr, <2 x i16> %data) { ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret: ; GFX940: ; %bb.0: -; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24 cdevadas wrote: Earlier I wrongly used the dword size (Width) in the the alignment check here as Jay pointed out. Now, I fixed it to use Byte size while comparing it with the existing alignment of the first load. https://github.com/llvm/llvm-project/pull/96162/commits/e7e6cbc4abd476a038fd7836e5078565e73d1fe9#diff-35f4d1b6c4c17815f6989f86abbac2e606ca760f9d93f501ff503449048bf760R1730 https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr addrspace(3) %ptr, <2 x half> define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) %ptr, <2 x i16> %data) { ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret: ; GFX940: ; %bb.0: -; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24 cdevadas wrote: Unfortunately, that's not happening. The IR load-store-vectorizer doesn't combine the two loads. I still see the two loads after the IR vectorizer and they become two loads in the selected code. Can this happen because the alignment for the two loads differ and the IR vectorizer safely ignores them? *** IR Dump before Selection *** define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) %ptr, <2 x i16> %data) #0 { %local_atomic_fadd_v2bf16_noret.kernarg.segment = call nonnull align 16 dereferenceable(44) ptr addrspace(4) @llvm.amdgcn.kernarg.segment.ptr() %ptr.kernarg.offset = getelementptr inbounds i8, ptr addrspace(4) %local_atomic_fadd_v2bf16_noret.kernarg.segment, i64 36, !amdgpu.uniform !0 **%ptr.load = load ptr addrspace(3), ptr addrspace(4) %ptr.kernarg.offset**, align 4, !invariant.load !0 %data.kernarg.offset = getelementptr inbounds i8, ptr addrspace(4) %local_atomic_fadd_v2bf16_noret.kernarg.segment, i64 40, !amdgpu.uniform !0 **%data.load = load <2 x i16>, ptr addrspace(4) %data.kernarg.offset**, align 8, !invariant.load !0 %ret = call <2 x i16> @llvm.amdgcn.ds.fadd.v2bf16(ptr addrspace(3) %ptr.load, <2 x i16> %data.load) ret void } # *** IR Dump After selection ***: # Machine code for function local_atomic_fadd_v2bf16_noret: IsSSA, TracksLiveness Function Live Ins: $sgpr0_sgpr1 in %1 bb.0 (%ir-block.0): liveins: $sgpr0_sgpr1 %1:sgpr_64(p4) = COPY $sgpr0_sgpr1 %3:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %1:sgpr_64(p4), 36, 0 :: (dereferenceable invariant load (s32) from %ir.ptr.kernarg.offset, addrspace 4) %4:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM %1:sgpr_64(p4), 40, 0 :: (dereferenceable invariant load (s32) from %ir.data.kernarg.offset, align 8, addrspace 4) %5:vgpr_32 = COPY %3:sreg_32_xm0_xexec %6:vgpr_32 = COPY %4:sreg_32_xm0_xexec DS_PK_ADD_BF16 killed %5:vgpr_32, killed %6:vgpr_32, 0, 0, implicit $m0, implicit $exec S_ENDPGM 0 https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr addrspace(3) %ptr, <2 x half> define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) %ptr, <2 x i16> %data) { ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret: ; GFX940: ; %bb.0: -; GFX940-NEXT:s_load_dwordx2 s[0:1], s[0:1], 0x24 +; GFX940-NEXT:s_load_dwordx2 s[2:3], s[0:1], 0x24 cdevadas wrote: Opened https://github.com/llvm/llvm-project/issues/97715. https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
cdevadas wrote: Ping https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
cdevadas wrote: Ping https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
@@ -658,17 +658,17 @@ define amdgpu_kernel void @image_bvh_intersect_ray_nsa_reassign(ptr %p_node_ptr, ; ; GFX1013-LABEL: image_bvh_intersect_ray_nsa_reassign: ; GFX1013: ; %bb.0: -; GFX1013-NEXT:s_load_dwordx8 s[0:7], s[0:1], 0x24 +; GFX1013-NEXT:s_load_dwordx8 s[4:11], s[0:1], 0x24 cdevadas wrote: > I guess this code changes because xnack is enabled by default for GFX10.1? Yes. > Is there anything we could do to add known alignment info here, to avoid the > code pessimization? I'm not sure what can be done for it. https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
cdevadas wrote: > I still think it is terribly surprising all of the test diff shows up in this > commit, and not the selection case Because the selection support is done in the next PR of the review stack, https://github.com/llvm/llvm-project/pull/96162. This patch takes care of choosing the right opcode while merging the loads. https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
cdevadas wrote: The latest patch optimizes the PatFrag and the patterns written further by using OtherPredicates. The lit test changes in the latest patch are a missed optimization I incorrectly introduced earlier in this PR for GFX7. It is now fixed and matches the default behavior with the current compiler. https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)
cdevadas wrote: ### Merge activity * **Jul 23, 4:02 AM EDT**: @cdevadas started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96162). https://github.com/llvm/llvm-project/pull/96162 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)
cdevadas wrote: ### Merge activity * **Jul 23, 4:02 AM EDT**: @cdevadas started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96163). https://github.com/llvm/llvm-project/pull/96163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/101619 Use the constrained buffer load opcodes while combining under-aligned load for XNACK enabled subtargets. >From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Tue, 30 Jul 2024 14:46:36 +0530 Subject: [PATCH] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants Use the constrained buffer load opcodes while combining under-aligned load for XNACK enabled subtargets. --- .../Target/AMDGPU/SILoadStoreOptimizer.cpp| 75 ++- .../AMDGPU/llvm.amdgcn.s.buffer.load.ll | 56 +- .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 -- 3 files changed, 613 insertions(+), 82 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp index ae537b194f50c..7553c370f694f 100644 --- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp @@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 1; case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX2_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX2: @@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 2; case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX3_IMM: case AMDGPU::S_LOAD_DWORDX3_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX3: @@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 3; case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX4_IMM: case AMDGPU::S_LOAD_DWORDX4_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX4: @@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 4; case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX8_IMM: case AMDGPU::S_LOAD_DWORDX8_IMM_ec: return 8; @@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return S_BUFFER_LOAD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return S_BUFFER_LOAD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/101619?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#101619** https://app.graphite.dev/github/pr/llvm/llvm-project/101619?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#101618** https://app.graphite.dev/github/pr/llvm/llvm-project/101618?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/101619 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/101619 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/101619 >From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Tue, 30 Jul 2024 14:46:36 +0530 Subject: [PATCH 1/3] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants Use the constrained buffer load opcodes while combining under-aligned load for XNACK enabled subtargets. --- .../Target/AMDGPU/SILoadStoreOptimizer.cpp| 75 ++- .../AMDGPU/llvm.amdgcn.s.buffer.load.ll | 56 +- .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 -- 3 files changed, 613 insertions(+), 82 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp index ae537b194f50c..7553c370f694f 100644 --- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp @@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 1; case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX2_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX2: @@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 2; case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX3_IMM: case AMDGPU::S_LOAD_DWORDX3_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX3: @@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 3; case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX4_IMM: case AMDGPU::S_LOAD_DWORDX4_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX4: @@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 4; case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX8_IMM: case AMDGPU::S_LOAD_DWORDX8_IMM_ec: return 8; @@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return S_BUFFER_LOAD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return S_BUFFER_LOAD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/101619 >From ad8a8dfea913c92fb94079aab0a4a5905b30384d Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Tue, 30 Jul 2024 14:46:36 +0530 Subject: [PATCH 1/4] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants Use the constrained buffer load opcodes while combining under-aligned load for XNACK enabled subtargets. --- .../Target/AMDGPU/SILoadStoreOptimizer.cpp| 75 ++- .../AMDGPU/llvm.amdgcn.s.buffer.load.ll | 56 +- .../CodeGen/AMDGPU/merge-sbuffer-load.mir | 564 -- 3 files changed, 613 insertions(+), 82 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp index ae537b194f50c..7553c370f694f 100644 --- a/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp +++ b/llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp @@ -352,6 +352,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 1; case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX2_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX2: @@ -363,6 +365,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 2; case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX3_IMM: case AMDGPU::S_LOAD_DWORDX3_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX3: @@ -374,6 +378,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 3; case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX4_IMM: case AMDGPU::S_LOAD_DWORDX4_IMM_ec: case AMDGPU::GLOBAL_LOAD_DWORDX4: @@ -385,6 +391,8 @@ static unsigned getOpcodeWidth(const MachineInstr &MI, const SIInstrInfo &TII) { return 4; case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: case AMDGPU::S_LOAD_DWORDX8_IMM: case AMDGPU::S_LOAD_DWORDX8_IMM_ec: return 8; @@ -499,12 +507,20 @@ static InstClassEnum getInstClass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return S_BUFFER_LOAD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return S_BUFFER_LOAD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -587,12 +603,20 @@ static unsigned getInstSubclass(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_IMM; case AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM_ec: return AMDGPU::S_BUFFER_LOAD_DWORD_SGPR_IMM; case AMDGPU::S_LOAD_DWORD_IMM: case AMDGPU::S_LOAD_DWORDX2_IMM: @@ -703,6 +727,10 @@ static AddressRegs getRegs(unsigned Opc, const SIInstrInfo &TII) { case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM: case AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM: + case AMDGPU::S_BUFFER_LOAD_DWORDX2_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX3_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_DWORDX4_SGPR_IMM_ec: + case AMDGPU::S_BUFFER_LOAD_
[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)
@@ -0,0 +1,930 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW32 %s + +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW64 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s + +--- +name: s_add_i32__inline_imm__fi_offset0 +tracksRegLiveness: true +stack: + - { id: 0, size: 32, alignment: 16 } +machineFunctionInfo: + scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3' + frameOffsetReg: '$sgpr33' + stackPtrOffsetReg: '$sgpr32' +body: | + bb.0: +; MUBUFW64-LABEL: name: s_add_i32__inline_imm__fi_offset0 +; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc +; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc +; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7 +; +; MUBUFW32-LABEL: name: s_add_i32__inline_imm__fi_offset0 +; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc +; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc +; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7 +; +; FLATSCRW64-LABEL: name: s_add_i32__inline_imm__fi_offset0 +; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc +; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7 +; +; FLATSCRW32-LABEL: name: s_add_i32__inline_imm__fi_offset0 +; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc +; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7 +renamable $sgpr7 = S_ADD_I32 12, %stack.0, implicit-def dead $scc +SI_RETURN implicit $sgpr7 + +... + +--- +name: s_add_i32__fi_offset0__inline_imm +tracksRegLiveness: true +stack: + - { id: 0, size: 32, alignment: 16 } +machineFunctionInfo: + scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3' + frameOffsetReg: '$sgpr33' + stackPtrOffsetReg: '$sgpr32' +body: | + bb.0: +; MUBUFW64-LABEL: name: s_add_i32__fi_offset0__inline_imm +; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc +; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc +; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7 +; +; MUBUFW32-LABEL: name: s_add_i32__fi_offset0__inline_imm +; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc +; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc +; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7 +; +; FLATSCRW64-LABEL: name: s_add_i32__fi_offset0__inline_imm +; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc +; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7 +; +; FLATSCRW32-LABEL: name: s_add_i32__fi_offset0__inline_imm +; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc +; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7 +renamable $sgpr7 = S_ADD_I32 %stack.0, 12, implicit-def dead $scc +SI_RETURN implicit $sgpr7 + +... + +--- +name: s_add_i32__inline_imm___fi_offset_inline_imm +tracksRegLiveness: true +stack: + - { id: 0, size: 16, alignment: 16 } + - { id: 1, size: 24, alignment: 4 } +machineFunctionInfo: + scratchRSrcReg: '$sgpr0_sgpr1_sgpr2_sgpr3' + frameOffsetReg: '$sgpr33' + stackPtrOffsetReg: '$sgpr32' +body: | + bb.0: +; MUBUFW64-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm +; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc +; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc +; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7 +; +; MUBUFW32-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm +; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc +; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc +; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7 +; +; FLATSCRW64-LABEL: name: s_add_i32__inline_imm___
[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)
https://github.com/cdevadas approved this pull request. https://github.com/llvm/llvm-project/pull/101694 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)
cdevadas wrote: ### Merge activity * **Aug 6, 1:43 AM EDT**: @cdevadas started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/101619). https://github.com/llvm/llvm-project/pull/101619 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/103721 None >From cc30faa32e7828d74826421cdae50464ede38e0b Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Wed, 14 Aug 2024 14:18:59 +0530 Subject: [PATCH] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). --- .../AMDGPU/AMDGPUCodeGenPassBuilder.cpp | 2 +- llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 - .../Target/AMDGPU/R600CodeGenPassBuilder.cpp | 149 - .../Target/AMDGPU/R600CodeGenPassBuilder.h| 38 - llvm/lib/Target/AMDGPU/R600ISelLowering.cpp | 3 +- llvm/lib/Target/AMDGPU/R600TargetMachine.cpp | 154 -- llvm/lib/Target/AMDGPU/R600TargetMachine.h| 60 --- 7 files changed, 187 insertions(+), 220 deletions(-) delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.h diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp index 0d7233432fc2b5..8fbd672c2ad62d 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPassBuilder.cpp @@ -32,8 +32,8 @@ #include "GCNSchedStrategy.h" #include "GCNVOPDUtils.h" #include "R600.h" +#include "R600CodeGenPassBuilder.h" #include "R600MachineFunctionInfo.h" -#include "R600TargetMachine.h" #include "SIFixSGPRCopies.h" #include "SIMachineFunctionInfo.h" #include "SIMachineScheduler.h" diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index e21b2cf62f0c8f..9cb84614fefd36 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen R600Packetizer.cpp R600RegisterInfo.cpp R600Subtarget.cpp - R600TargetMachine.cpp R600TargetTransformInfo.cpp SIAnnotateControlFlow.cpp SIFixSGPRCopies.cpp diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp index a57b3aa0adb158..1b182e17add9c0 100644 --- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp +++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp @@ -5,12 +5,159 @@ // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // //===--===// +// +/// \file +/// This file contains both AMDGPU-R600 target machine and the CodeGen pass +/// builder. The target machine contains all of the hardware specific +/// information needed to emit code for R600 GPUs and the CodeGen pass builder +/// handles the same for new pass manager infrastructure. +// +//===--===// #include "R600CodeGenPassBuilder.h" -#include "R600TargetMachine.h" +#include "R600.h" +#include "R600MachineScheduler.h" +#include "R600TargetTransformInfo.h" +#include "llvm/Transforms/Scalar.h" +#include using namespace llvm; +static cl::opt +EnableR600StructurizeCFG("r600-ir-structurize", + cl::desc("Use StructurizeCFG IR pass"), + cl::init(true)); + +static cl::opt EnableR600IfConvert("r600-if-convert", + cl::desc("Use if conversion pass"), + cl::ReallyHidden, cl::init(true)); + +static cl::opt EnableAMDGPUFunctionCallsOpt( +"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"), +cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true), +cl::Hidden); + +static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) { + return new ScheduleDAGMILive(C, std::make_unique()); +} + +static MachineSchedRegistry R600SchedRegistry("r600", + "Run R600's custom scheduler", + createR600MachineScheduler); + +//===--===// +// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface. +//===--===// + +R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT, + StringRef CPU, StringRef FS, + const TargetOptions &Options, + std::optional RM, + std::optional CM, + CodeGenOptLevel OL, bool JIT) +: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) { + setRequiresStructuredCFG(true); + + // Override the default since calls aren't supported for r600. + if (EnableFunctionCalls && + EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0) +EnableFunctionCalls = false; +} + +const TargetSubtargetInfo * +R600TargetMachine::getSubtarget
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#103721** https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#103720** https://app.graphite.dev/github/pr/llvm/llvm-project/103720?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/103721 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/103721 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). (PR #103721)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/103721 >From a10910597e6ee30e87dd09a4f77fcfa1729873f0 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Wed, 14 Aug 2024 14:18:59 +0530 Subject: [PATCH 1/2] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). --- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 - .../Target/AMDGPU/R600CodeGenPassBuilder.cpp | 149 - .../Target/AMDGPU/R600CodeGenPassBuilder.h| 38 - llvm/lib/Target/AMDGPU/R600ISelLowering.cpp | 3 +- llvm/lib/Target/AMDGPU/R600TargetMachine.cpp | 154 -- 6 files changed, 187 insertions(+), 160 deletions(-) delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index bcedc3623d3ed7..13024f45151899 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -32,8 +32,8 @@ #include "GCNSchedStrategy.h" #include "GCNVOPDUtils.h" #include "R600.h" +#include "R600CodeGenPassBuilder.h" #include "R600MachineFunctionInfo.h" -#include "R600TargetMachine.h" #include "SIFixSGPRCopies.h" #include "SIMachineFunctionInfo.h" #include "SIMachineScheduler.h" diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index f493076f5bb8a3..16186f1f1bbed0 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen R600Packetizer.cpp R600RegisterInfo.cpp R600Subtarget.cpp - R600TargetMachine.cpp R600TargetTransformInfo.cpp SIAnnotateControlFlow.cpp SIFixSGPRCopies.cpp diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp index a57b3aa0adb158..1b182e17add9c0 100644 --- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp +++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp @@ -5,12 +5,159 @@ // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // //===--===// +// +/// \file +/// This file contains both AMDGPU-R600 target machine and the CodeGen pass +/// builder. The target machine contains all of the hardware specific +/// information needed to emit code for R600 GPUs and the CodeGen pass builder +/// handles the same for new pass manager infrastructure. +// +//===--===// #include "R600CodeGenPassBuilder.h" -#include "R600TargetMachine.h" +#include "R600.h" +#include "R600MachineScheduler.h" +#include "R600TargetTransformInfo.h" +#include "llvm/Transforms/Scalar.h" +#include using namespace llvm; +static cl::opt +EnableR600StructurizeCFG("r600-ir-structurize", + cl::desc("Use StructurizeCFG IR pass"), + cl::init(true)); + +static cl::opt EnableR600IfConvert("r600-if-convert", + cl::desc("Use if conversion pass"), + cl::ReallyHidden, cl::init(true)); + +static cl::opt EnableAMDGPUFunctionCallsOpt( +"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"), +cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true), +cl::Hidden); + +static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) { + return new ScheduleDAGMILive(C, std::make_unique()); +} + +static MachineSchedRegistry R600SchedRegistry("r600", + "Run R600's custom scheduler", + createR600MachineScheduler); + +//===--===// +// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface. +//===--===// + +R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT, + StringRef CPU, StringRef FS, + const TargetOptions &Options, + std::optional RM, + std::optional CM, + CodeGenOptLevel OL, bool JIT) +: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) { + setRequiresStructuredCFG(true); + + // Override the default since calls aren't supported for r600. + if (EnableFunctionCalls && + EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0) +EnableFunctionCalls = false; +} + +const TargetSubtargetInfo * +R600TargetMachine::getSubtargetImpl(const Function &F) const { + StringRef GPU = getGPUName(F); + StringRef FS = getFeatureString(F); + + SmallString<128> SubtargetKey(GPU);
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/103721 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/104038 This definition shouldn't be in AMDGPU TM. >From 6ab7119cd8b4e63dad2625b0aa9416e2898a3bbc Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Wed, 14 Aug 2024 20:50:30 +0530 Subject: [PATCH] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. This definition shouldn't be in AMDGPU TM. --- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 8 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index bcedc3623d3ed..f5da459a43c59 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -32,7 +32,6 @@ #include "GCNSchedStrategy.h" #include "GCNVOPDUtils.h" #include "R600.h" -#include "R600MachineFunctionInfo.h" #include "R600TargetMachine.h" #include "SIFixSGPRCopies.h" #include "SIMachineFunctionInfo.h" @@ -1193,13 +1192,6 @@ AMDGPUPassConfig::createMachineScheduler(MachineSchedContext *C) const { return DAG; } -MachineFunctionInfo *R600TargetMachine::createMachineFunctionInfo( -BumpPtrAllocator &Allocator, const Function &F, -const TargetSubtargetInfo *STI) const { - return R600MachineFunctionInfo::create( - Allocator, F, static_cast(STI)); -} - //===--===// // GCN Legacy Pass Setup //===--===// diff --git a/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp b/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp index 06ae5f7289360..a1a60b8bdfa9e 100644 --- a/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/R600TargetMachine.cpp @@ -16,6 +16,7 @@ #include "R600TargetMachine.h" #include "R600.h" +#include "R600MachineFunctionInfo.h" #include "R600MachineScheduler.h" #include "R600TargetTransformInfo.h" #include "llvm/Transforms/Scalar.h" @@ -154,6 +155,13 @@ Error R600TargetMachine::buildCodeGenPipeline( return CGPB.buildPipeline(MPM, Out, DwoOut, FileType); } +MachineFunctionInfo *R600TargetMachine::createMachineFunctionInfo( +BumpPtrAllocator &Allocator, const Function &F, +const TargetSubtargetInfo *STI) const { + return R600MachineFunctionInfo::create( + Allocator, F, static_cast(STI)); +} + //===--===// // R600 CodeGen Pass Builder interface. //===--===// ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/104038?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#104038** https://app.graphite.dev/github/pr/llvm/llvm-project/104038?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#103721** https://app.graphite.dev/github/pr/llvm/llvm-project/103721?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#103720** https://app.graphite.dev/github/pr/llvm/llvm-project/103720?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/104038 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move createMachineFunctionInfo into R600 TM. (PR #104038)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/104038 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/103721 >From f2095f23eaa5c3876bf7f8d5706881e404c5aa1b Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Wed, 14 Aug 2024 14:18:59 +0530 Subject: [PATCH 1/3] [AMDGPU][R600] Move R600TargetMachine into R600CodeGenPassBuilder(NFC). --- llvm/lib/Target/AMDGPU/CMakeLists.txt | 1 - .../Target/AMDGPU/R600CodeGenPassBuilder.cpp | 149 - .../Target/AMDGPU/R600CodeGenPassBuilder.h| 38 - llvm/lib/Target/AMDGPU/R600ISelLowering.cpp | 3 +- llvm/lib/Target/AMDGPU/R600TargetMachine.cpp | 154 -- 5 files changed, 186 insertions(+), 159 deletions(-) delete mode 100644 llvm/lib/Target/AMDGPU/R600TargetMachine.cpp diff --git a/llvm/lib/Target/AMDGPU/CMakeLists.txt b/llvm/lib/Target/AMDGPU/CMakeLists.txt index f493076f5bb8a3..16186f1f1bbed0 100644 --- a/llvm/lib/Target/AMDGPU/CMakeLists.txt +++ b/llvm/lib/Target/AMDGPU/CMakeLists.txt @@ -137,7 +137,6 @@ add_llvm_target(AMDGPUCodeGen R600Packetizer.cpp R600RegisterInfo.cpp R600Subtarget.cpp - R600TargetMachine.cpp R600TargetTransformInfo.cpp SIAnnotateControlFlow.cpp SIFixSGPRCopies.cpp diff --git a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp index a57b3aa0adb158..1b182e17add9c0 100644 --- a/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp +++ b/llvm/lib/Target/AMDGPU/R600CodeGenPassBuilder.cpp @@ -5,12 +5,159 @@ // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception // //===--===// +// +/// \file +/// This file contains both AMDGPU-R600 target machine and the CodeGen pass +/// builder. The target machine contains all of the hardware specific +/// information needed to emit code for R600 GPUs and the CodeGen pass builder +/// handles the same for new pass manager infrastructure. +// +//===--===// #include "R600CodeGenPassBuilder.h" -#include "R600TargetMachine.h" +#include "R600.h" +#include "R600MachineScheduler.h" +#include "R600TargetTransformInfo.h" +#include "llvm/Transforms/Scalar.h" +#include using namespace llvm; +static cl::opt +EnableR600StructurizeCFG("r600-ir-structurize", + cl::desc("Use StructurizeCFG IR pass"), + cl::init(true)); + +static cl::opt EnableR600IfConvert("r600-if-convert", + cl::desc("Use if conversion pass"), + cl::ReallyHidden, cl::init(true)); + +static cl::opt EnableAMDGPUFunctionCallsOpt( +"amdgpu-function-calls", cl::desc("Enable AMDGPU function call support"), +cl::location(AMDGPUTargetMachine::EnableFunctionCalls), cl::init(true), +cl::Hidden); + +static ScheduleDAGInstrs *createR600MachineScheduler(MachineSchedContext *C) { + return new ScheduleDAGMILive(C, std::make_unique()); +} + +static MachineSchedRegistry R600SchedRegistry("r600", + "Run R600's custom scheduler", + createR600MachineScheduler); + +//===--===// +// R600 Target Machine (R600 -> Cayman) - Legacy Pass Manager interface. +//===--===// + +R600TargetMachine::R600TargetMachine(const Target &T, const Triple &TT, + StringRef CPU, StringRef FS, + const TargetOptions &Options, + std::optional RM, + std::optional CM, + CodeGenOptLevel OL, bool JIT) +: AMDGPUTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL) { + setRequiresStructuredCFG(true); + + // Override the default since calls aren't supported for r600. + if (EnableFunctionCalls && + EnableAMDGPUFunctionCallsOpt.getNumOccurrences() == 0) +EnableFunctionCalls = false; +} + +const TargetSubtargetInfo * +R600TargetMachine::getSubtargetImpl(const Function &F) const { + StringRef GPU = getGPUName(F); + StringRef FS = getFeatureString(F); + + SmallString<128> SubtargetKey(GPU); + SubtargetKey.append(FS); + + auto &I = SubtargetMap[SubtargetKey]; + if (!I) { +// This needs to be done before we create a new subtarget since any +// creation will depend on the TM and the code generation flags on the +// function that reside in TargetOptions. +resetTargetOptions(F); +I = std::make_unique(TargetTriple, GPU, FS, *this); + } + + return I.get(); +} + +TargetTransformInfo +R600TargetMachine::getTargetTransformInfo(const Function &F) const { + return TargetTransformInfo(R600TTIImpl(this, F)); +} + +namespace { +class R600PassConfig final : public AMDGPUPassConfig { +pu
[llvm-branch-commits] [llvm] [AMDGPU][R600] Move R600CodeGenPassBuilder into R600TargetMachine(NFC). (PR #103721)
cdevadas wrote: ### Merge activity * **Aug 19, 10:55 AM EDT**: @cdevadas started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/103721). https://github.com/llvm/llvm-project/pull/103721 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/106605 None >From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 29 Aug 2024 22:21:22 +0530 Subject: [PATCH] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. --- llvm/include/llvm/CodeGen/MachineCSE.h| 29 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/MachineCSE.cpp | 283 ++ llvm/lib/CodeGen/TargetPassConfig.cpp | 6 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp | 2 +- .../GlobalISel/machine-cse-mid-pipeline.mir | 1 + .../AArch64/sve-pfalse-machine-cse.mir| 1 + .../no-cse-nonlocal-convergent-instrs.mir | 1 + .../copyprop_regsequence_with_undef.mir | 1 + llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir | 8 +- .../CodeGen/PowerPC/machine-cse-rm-pre.mir| 1 + .../CodeGen/Thumb/machine-cse-deadreg.mir | 1 + .../CodeGen/Thumb/machine-cse-physreg.mir | 1 + llvm/test/CodeGen/X86/cse-two-preds.mir | 1 + llvm/test/DebugInfo/MIR/X86/machine-cse.mir | 1 + 21 files changed, 215 insertions(+), 134 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h b/llvm/include/llvm/CodeGen/MachineCSE.h new file mode 100644 index 00..7440068e3e6f46 --- /dev/null +++ b/llvm/include/llvm/CodeGen/MachineCSE.h @@ -0,0 +1,29 @@ +//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_MACHINECSE_H +#define LLVM_CODEGEN_MACHINECSE_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class MachineCSEPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + + MachineFunctionProperties getRequiredProperties() { +return MachineFunctionProperties().set( +MachineFunctionProperties::Property::IsSSA); + } +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_MACHINECSE_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index dbdd110b0600e5..ddb2012cd2bffc 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -330,7 +330,7 @@ namespace llvm { extern char &GCMachineCodeAnalysisID; /// MachineCSE - This pass performs global CSE on machine instructions. - extern char &MachineCSEID; + extern char &MachineCSELegacyID; /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs /// according to the semantics of the instruction as well as hoists diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 47a1ca15fc0d1f..6605c6fde92510 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &); void initializeMachineBlockPlacementStatsPass(PassRegistry &); void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &); void initializeMachineCFGPrinterPass(PassRegistry &); -void initializeMachineCSEPass(PassRegistry &); +void initializeMachineCSELegacyPass(PassRegistry &); void initializeMachineCombinerPass(PassRegistry &); void initializeMachineCopyPropagationPass(PassRegistry &); void initializeMachineCycleInfoPrinterPassPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index eb15beb835b535..6c34747b9da406 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -42,6 +42,7 @@ #include "llvm/CodeGen/LocalStackSlotAllocation.h" #include "llvm/CodeGen/LowerEmuTLS.h" #include "llvm/CodeGen/MIRPrinter.h" +#include "llvm/CodeGen/MachineCSE.h" #include "llvm/CodeGen/MachineFunctionAnalysis.h" #include "llvm/CodeGen/MachineModuleInfo.h" #include "llvm/CodeGen/MachinePassManager.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index b710b1c46f643f..cb781532e266e6 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes"
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/106605?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#106605** https://app.graphite.dev/github/pr/llvm/llvm-project/106605?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#106604** https://app.graphite.dev/github/pr/llvm/llvm-project/106604?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/106605 >From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 29 Aug 2024 22:21:22 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. --- llvm/include/llvm/CodeGen/MachineCSE.h| 29 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/MachineCSE.cpp | 283 ++ llvm/lib/CodeGen/TargetPassConfig.cpp | 6 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp | 2 +- .../GlobalISel/machine-cse-mid-pipeline.mir | 1 + .../AArch64/sve-pfalse-machine-cse.mir| 1 + .../no-cse-nonlocal-convergent-instrs.mir | 1 + .../copyprop_regsequence_with_undef.mir | 1 + llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir | 8 +- .../CodeGen/PowerPC/machine-cse-rm-pre.mir| 1 + .../CodeGen/Thumb/machine-cse-deadreg.mir | 1 + .../CodeGen/Thumb/machine-cse-physreg.mir | 1 + llvm/test/CodeGen/X86/cse-two-preds.mir | 1 + llvm/test/DebugInfo/MIR/X86/machine-cse.mir | 1 + 21 files changed, 215 insertions(+), 134 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h b/llvm/include/llvm/CodeGen/MachineCSE.h new file mode 100644 index 00..7440068e3e6f46 --- /dev/null +++ b/llvm/include/llvm/CodeGen/MachineCSE.h @@ -0,0 +1,29 @@ +//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_MACHINECSE_H +#define LLVM_CODEGEN_MACHINECSE_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class MachineCSEPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + + MachineFunctionProperties getRequiredProperties() { +return MachineFunctionProperties().set( +MachineFunctionProperties::Property::IsSSA); + } +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_MACHINECSE_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index dbdd110b0600e5..ddb2012cd2bffc 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -330,7 +330,7 @@ namespace llvm { extern char &GCMachineCodeAnalysisID; /// MachineCSE - This pass performs global CSE on machine instructions. - extern char &MachineCSEID; + extern char &MachineCSELegacyID; /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs /// according to the semantics of the instruction as well as hoists diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 47a1ca15fc0d1f..6605c6fde92510 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &); void initializeMachineBlockPlacementStatsPass(PassRegistry &); void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &); void initializeMachineCFGPrinterPass(PassRegistry &); -void initializeMachineCSEPass(PassRegistry &); +void initializeMachineCSELegacyPass(PassRegistry &); void initializeMachineCombinerPass(PassRegistry &); void initializeMachineCopyPropagationPass(PassRegistry &); void initializeMachineCycleInfoPrinterPassPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index eb15beb835b535..6c34747b9da406 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -42,6 +42,7 @@ #include "llvm/CodeGen/LocalStackSlotAllocation.h" #include "llvm/CodeGen/LowerEmuTLS.h" #include "llvm/CodeGen/MIRPrinter.h" +#include "llvm/CodeGen/MachineCSE.h" #include "llvm/CodeGen/MachineFunctionAnalysis.h" #include "llvm/CodeGen/MachineModuleInfo.h" #include "llvm/CodeGen/MachinePassManager.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index b710b1c46f643f..cb781532e266e6 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes",
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
@@ -932,18 +935,55 @@ bool MachineCSE::isProfitableToHoistInto(MachineBasicBlock *CandidateBB, MBFI->getBlockFreq(MBB) + MBFI->getBlockFreq(MBB1); } -bool MachineCSE::runOnMachineFunction(MachineFunction &MF) { - if (skipFunction(MF.getFunction())) -return false; +void MachineCSEImpl::releaseMemory() { + ScopeMap.clear(); + PREMap.clear(); + Exps.clear(); +} +bool MachineCSEImpl::run(MachineFunction &MF) { TII = MF.getSubtarget().getInstrInfo(); TRI = MF.getSubtarget().getRegisterInfo(); MRI = &MF.getRegInfo(); - DT = &getAnalysis().getDomTree(); - MBFI = &getAnalysis().getMBFI(); LookAheadLimit = TII->getMachineCSELookAheadLimit(); bool ChangedPRE, ChangedCSE; ChangedPRE = PerformSimplePRE(DT); ChangedCSE = PerformCSE(DT->getRootNode()); + releaseMemory(); return ChangedPRE || ChangedCSE; } + +PreservedAnalyses MachineCSEPass::run(MachineFunction &MF, + MachineFunctionAnalysisManager &MFAM) { + MFPropsModifier _(*this, MF); + + if (MF.getFunction().hasOptNone()) +return PreservedAnalyses::all(); + + MachineDominatorTree &MDT = MFAM.getResult(MF); + MachineBlockFrequencyInfo &MBFI = + MFAM.getResult(MF); + MachineCSEImpl Impl(&MDT, &MBFI); + bool Changed = Impl.run(MF); + if (!Changed) +return PreservedAnalyses::all(); + + auto PA = getMachineFunctionPassPreservedAnalyses(); + PA.preserve(); + PA.preserve(); cdevadas wrote: Not sure about that. I have seen similar instances (DT being preserved along with the preserveSet) in other backend passes ported to NPM. https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/106605 >From 607099de09be2fed6d9277c8439ade69e0820d92 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 29 Aug 2024 22:21:22 +0530 Subject: [PATCH 1/3] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. --- llvm/include/llvm/CodeGen/MachineCSE.h| 29 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/MachineCSE.cpp | 283 ++ llvm/lib/CodeGen/TargetPassConfig.cpp | 6 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp | 2 +- .../GlobalISel/machine-cse-mid-pipeline.mir | 1 + .../AArch64/sve-pfalse-machine-cse.mir| 1 + .../no-cse-nonlocal-convergent-instrs.mir | 1 + .../copyprop_regsequence_with_undef.mir | 1 + llvm/test/CodeGen/AMDGPU/machine-cse-ssa.mir | 8 +- .../CodeGen/PowerPC/machine-cse-rm-pre.mir| 1 + .../CodeGen/Thumb/machine-cse-deadreg.mir | 1 + .../CodeGen/Thumb/machine-cse-physreg.mir | 1 + llvm/test/CodeGen/X86/cse-two-preds.mir | 1 + llvm/test/DebugInfo/MIR/X86/machine-cse.mir | 1 + 21 files changed, 215 insertions(+), 134 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/MachineCSE.h diff --git a/llvm/include/llvm/CodeGen/MachineCSE.h b/llvm/include/llvm/CodeGen/MachineCSE.h new file mode 100644 index 00..7440068e3e6f46 --- /dev/null +++ b/llvm/include/llvm/CodeGen/MachineCSE.h @@ -0,0 +1,29 @@ +//===- llvm/CodeGen/MachineCSE.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_MACHINECSE_H +#define LLVM_CODEGEN_MACHINECSE_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class MachineCSEPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + + MachineFunctionProperties getRequiredProperties() { +return MachineFunctionProperties().set( +MachineFunctionProperties::Property::IsSSA); + } +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_MACHINECSE_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index dbdd110b0600e5..ddb2012cd2bffc 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -330,7 +330,7 @@ namespace llvm { extern char &GCMachineCodeAnalysisID; /// MachineCSE - This pass performs global CSE on machine instructions. - extern char &MachineCSEID; + extern char &MachineCSELegacyID; /// MIRCanonicalizer - This pass canonicalizes MIR by renaming vregs /// according to the semantics of the instruction as well as hoists diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 47a1ca15fc0d1f..6605c6fde92510 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -188,7 +188,7 @@ void initializeMachineBlockPlacementPass(PassRegistry &); void initializeMachineBlockPlacementStatsPass(PassRegistry &); void initializeMachineBranchProbabilityInfoWrapperPassPass(PassRegistry &); void initializeMachineCFGPrinterPass(PassRegistry &); -void initializeMachineCSEPass(PassRegistry &); +void initializeMachineCSELegacyPass(PassRegistry &); void initializeMachineCombinerPass(PassRegistry &); void initializeMachineCopyPropagationPass(PassRegistry &); void initializeMachineCycleInfoPrinterPassPass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index eb15beb835b535..6c34747b9da406 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -42,6 +42,7 @@ #include "llvm/CodeGen/LocalStackSlotAllocation.h" #include "llvm/CodeGen/LowerEmuTLS.h" #include "llvm/CodeGen/MIRPrinter.h" +#include "llvm/CodeGen/MachineCSE.h" #include "llvm/CodeGen/MachineFunctionAnalysis.h" #include "llvm/CodeGen/MachineModuleInfo.h" #include "llvm/CodeGen/MachinePassManager.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index b710b1c46f643f..cb781532e266e6 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -131,6 +131,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes",
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
@@ -932,18 +935,55 @@ bool MachineCSE::isProfitableToHoistInto(MachineBasicBlock *CandidateBB, MBFI->getBlockFreq(MBB) + MBFI->getBlockFreq(MBB1); } -bool MachineCSE::runOnMachineFunction(MachineFunction &MF) { - if (skipFunction(MF.getFunction())) -return false; +void MachineCSEImpl::releaseMemory() { + ScopeMap.clear(); + PREMap.clear(); + Exps.clear(); +} +bool MachineCSEImpl::run(MachineFunction &MF) { TII = MF.getSubtarget().getInstrInfo(); TRI = MF.getSubtarget().getRegisterInfo(); MRI = &MF.getRegInfo(); - DT = &getAnalysis().getDomTree(); - MBFI = &getAnalysis().getMBFI(); LookAheadLimit = TII->getMachineCSELookAheadLimit(); bool ChangedPRE, ChangedCSE; ChangedPRE = PerformSimplePRE(DT); ChangedCSE = PerformCSE(DT->getRootNode()); + releaseMemory(); return ChangedPRE || ChangedCSE; } + +PreservedAnalyses MachineCSEPass::run(MachineFunction &MF, + MachineFunctionAnalysisManager &MFAM) { + MFPropsModifier _(*this, MF); + + if (MF.getFunction().hasOptNone()) +return PreservedAnalyses::all(); + + MachineDominatorTree &MDT = MFAM.getResult(MF); + MachineBlockFrequencyInfo &MBFI = + MFAM.getResult(MF); + MachineCSEImpl Impl(&MDT, &MBFI); + bool Changed = Impl.run(MF); + if (!Changed) +return PreservedAnalyses::all(); + + auto PA = getMachineFunctionPassPreservedAnalyses(); + PA.preserve(); + PA.preserve(); cdevadas wrote: I guess, we won't be able to figure it out until we have the whole pipeline running in NPM. https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
cdevadas wrote: Ping https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCSE pass to new pass manager. (PR #106605)
cdevadas wrote: ### Merge activity * **Sep 4, 8:51 AM EDT**: @cdevadas started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/106605). https://github.com/llvm/llvm-project/pull/106605 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/108507 None >From b56e90fa59c0f5db905b94aa74c771e3f72cd81d Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 12 Sep 2024 23:38:09 +0530 Subject: [PATCH] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. --- .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++--- llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 4 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 8 +-- llvm/lib/CodeGen/MachineCombiner.cpp | 8 +-- llvm/lib/CodeGen/MachineTraceMetrics.cpp | 62 +++ llvm/lib/Passes/PassBuilder.cpp | 1 + .../AArch64/AArch64ConditionalCompares.cpp| 8 +-- .../AArch64/AArch64StorePairSuppress.cpp | 6 +- 9 files changed, 119 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h index a5e78d47724d82..ca5a1197911d0a 100644 --- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h +++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h @@ -46,12 +46,13 @@ #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H #define LLVM_CODEGEN_MACHINETRACEMETRICS_H -#include "llvm/ADT/SparseSet.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/TargetSchedule.h" namespace llvm { @@ -93,7 +94,7 @@ enum class MachineTraceStrategy { TS_NumStrategies }; -class MachineTraceMetrics : public MachineFunctionPass { +class MachineTraceMetrics { const MachineFunction *MF = nullptr; const TargetInstrInfo *TII = nullptr; const TargetRegisterInfo *TRI = nullptr; @@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } - MachineTraceMetrics(); + explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI); + ~MachineTraceMetrics(); - void getAnalysisUsage(AnalysisUsage&) const override; - bool runOnMachineFunction(MachineFunction&) override; - void releaseMemory() override; - void verifyAnalysis() const override; + void init(MachineFunction &Func, const MachineLoopInfo &LI); + void clear(); /// Per-basic block information that doesn't depend on the trace through the /// block. @@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass { /// Call Ensemble::getTrace() again to update any trace handles. void invalidate(const MachineBasicBlock *MBB); + /// Handle invalidation explicitly. + bool invalidate(MachineFunction &, const PreservedAnalyses &PA, + MachineFunctionAnalysisManager::Invalidator &); + + void verifyAnalysis() const; + private: // One entry per basic block, indexed by block number. SmallVector BlockInfo; @@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS, return OS; } +class MachineTraceMetricsAnalysis +: public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineTraceMetrics; + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +/// Verifier pass for \c MachineTraceMetrics. +struct MachineTraceMetricsVerifierPass +: PassInfoMixin { + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } +}; + +class MachineTraceMetricsWrapperPass : public MachineFunctionPass { +public: + static char ID; + MachineTraceMetrics MTM; + + MachineTraceMetricsWrapperPass(); + + void getAnalysisUsage(AnalysisUsage &) const override; + bool runOnMachineFunction(MachineFunction &) override; + void releaseMemory() override { MTM.clear(); } + void verifyAnalysis() const override { MTM.verifyAnalysis(); } + MachineTraceMetrics &getMTM() { return MTM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 4352099d6dbb99..3fa6fabaeccd64 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &); void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &); void initializeMachineSchedulerPass(PassRegistry &); void initializeMachineSinkingPass(PassRegistry &); -void initializeMachineTraceMetricsPass(PassRegistry &
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/108508 None >From 8c819329488c087fce339d4fd65761bc986ed80e Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index ddb2012cd2bffc..5d042d8daa5630 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 3fa6fabaeccd64..9d98e80a4fd365 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index a99fed86d168d1..5f005707fe3cc0 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index e92d6dd97c655a..949936e55a0f06 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/108508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/108507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/108507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/108508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
@@ -39,41 +39,68 @@ using namespace llvm; #define DEBUG_TYPE "machine-trace-metrics" -char MachineTraceMetrics::ID = 0; +AnalysisKey MachineTraceMetricsAnalysis::Key; -char &llvm::MachineTraceMetricsID = MachineTraceMetrics::ID; +MachineTraceMetricsAnalysis::Result +MachineTraceMetricsAnalysis::run(MachineFunction &MF, + MachineFunctionAnalysisManager &MFAM) { + return Result(MF, MFAM.getResult(MF)); +} + +PreservedAnalyses +MachineTraceMetricsVerifierPass::run(MachineFunction &MF, + MachineFunctionAnalysisManager &MFAM) { + MFAM.getResult(MF).verifyAnalysis(); cdevadas wrote: In the NPM infrastructure verifyAnalysis is done separately. There are many such instances in the opt pipeline. This is the first attempt to port CodeGen analysis with verify. https://github.com/llvm/llvm-project/pull/108507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108508 >From 8c819329488c087fce339d4fd65761bc986ed80e Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index ddb2012cd2bffc..5d042d8daa5630 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 3fa6fabaeccd64..9d98e80a4fd365 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index a99fed86d168d1..5f005707fe3cc0 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index e92d6dd97c655a..949936e55a0f06 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION_P
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/108514 None >From 2941048b558ae43ff0c96a1cc301976435c95a7f Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 13:53:01 +0530 Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts. --- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h | 1 + 2 files changed, 8 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 8c45e6b5e589c2..6cc9ff6a981fee 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1898,6 +1898,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass &addPass) const { addPass(RequireAnalysisPass()); } +void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const { + if (EnableEarlyIfConversion) +addPass(EarlyIfConverterPass()); + + Base::addILPOpts(addPass); +} + void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass, CreateMCStreamer) const { // TODO: Add AsmPrinter. diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h index 5b7257ddb36f1e..96b414f294ee70 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h @@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder void addIRPasses(AddIRPass &) const; void addCodeGenPrepare(AddIRPass &) const; void addPreISel(AddIRPass &addPass) const; + void addILPOpts(AddMachinePass &) const; void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const; Error addInstSelector(AddMachinePass &) const; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/108514?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#108514** https://app.graphite.dev/github/pr/llvm/llvm-project/108514?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 * **#108508** https://app.graphite.dev/github/pr/llvm/llvm-project/108508?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#108507** https://app.graphite.dev/github/pr/llvm/llvm-project/108507?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * **#108506** https://app.graphite.dev/github/pr/llvm/llvm-project/108506?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by Graphite. https://stacking.dev/?utm_source=stack-comment";>Learn more about stacking. Join @cdevadas and the rest of your teammates on https://graphite.dev?utm-source=stack-comment";>https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="11px" height="11px"/> Graphite https://github.com/llvm/llvm-project/pull/108514 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/108514 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] c971bcd - [AMDGPU] Test clean up (NFC)
Author: Christudasan Devadasan Date: 2021-01-22T13:38:52+05:30 New Revision: c971bcd2102b905e6469463fb8309ab3f7b2b8f2 URL: https://github.com/llvm/llvm-project/commit/c971bcd2102b905e6469463fb8309ab3f7b2b8f2 DIFF: https://github.com/llvm/llvm-project/commit/c971bcd2102b905e6469463fb8309ab3f7b2b8f2.diff LOG: [AMDGPU] Test clean up (NFC) Added: Modified: llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll Removed: diff --git a/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll b/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll index 4fb2625a5221..5e4b5f70de0b 100644 --- a/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll +++ b/llvm/test/CodeGen/AMDGPU/local-stack-alloc-block-sp-reference.ll @@ -16,7 +16,7 @@ ; so eliminateFrameIndex would not adjust the access to use the ; correct FP offset. -define amdgpu_kernel void @local_stack_offset_uses_sp(i64 addrspace(1)* %out, i8 addrspace(1)* %in) { +define amdgpu_kernel void @local_stack_offset_uses_sp(i64 addrspace(1)* %out) { ; MUBUF-LABEL: local_stack_offset_uses_sp: ; MUBUF: ; %bb.0: ; %entry ; MUBUF-NEXT:s_load_dwordx2 s[4:5], s[4:5], 0x0 @@ -106,7 +106,7 @@ entry: ret void } -define void @func_local_stack_offset_uses_sp(i64 addrspace(1)* %out, i8 addrspace(1)* %in) { +define void @func_local_stack_offset_uses_sp(i64 addrspace(1)* %out) { ; MUBUF-LABEL: func_local_stack_offset_uses_sp: ; MUBUF: ; %bb.0: ; %entry ; MUBUF-NEXT:s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] ff8a1ca - [AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses.
Author: Christudasan Devadasan Date: 2021-01-22T14:20:59+05:30 New Revision: ff8a1cae181438b97937848060da1efb67117ea4 URL: https://github.com/llvm/llvm-project/commit/ff8a1cae181438b97937848060da1efb67117ea4 DIFF: https://github.com/llvm/llvm-project/commit/ff8a1cae181438b97937848060da1efb67117ea4.diff LOG: [AMDGPU] Fix the inconsistency in soffset for MUBUF stack accesses. During instruction selection, there is an inconsistency in choosing the initial soffset value. With certain early passes, this value is getting modified and that brought additional fixup during eliminateFrameIndex to work for all cases. This whole transformation looks trivial and can be handled better. This patch clearly defines the initial value for soffset and keeps it unchanged before eliminateFrameIndex. The initial value must be zero for MUBUF with a frame index. The non-frame index MUBUF forms that use a raw offset from SP will have the stack register for soffset. During frame elimination, the soffset remains zero for entry functions with zero dynamic allocas and no callsites, or else is updated to the appropriate frame/stack register. Also, did some code clean up and made all asserts around soffset stricter to match. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D95071 Added: Modified: llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp llvm/lib/Target/AMDGPU/SIFoldOperands.cpp llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir llvm/test/CodeGen/AMDGPU/amdpal-callable.ll llvm/test/CodeGen/AMDGPU/fold-fi-mubuf.mir llvm/test/CodeGen/AMDGPU/stack-pointer-offset-relative-frameindex.ll Removed: diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp index 3c66745c0e70..340f4ac6f57a 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp @@ -1523,7 +1523,9 @@ std::pair AMDGPUDAGToDAGISel::foldFrameIndex(SDValue N) const FI ? CurDAG->getTargetFrameIndex(FI->getIndex(), FI->getValueType(0)) : N; // We rebase the base address into an absolute stack address and hence - // use constant 0 for soffset. + // use constant 0 for soffset. This value must be retained until + // frame elimination and eliminateFrameIndex will choose the appropriate + // frame register if need be. return std::make_pair(TFI, CurDAG->getTargetConstant(0, DL, MVT::i32)); } diff --git a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp index 7255a061b26b..bd577a6fb8c5 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp @@ -3669,13 +3669,9 @@ AMDGPUInstructionSelector::selectMUBUFScratchOffen(MachineOperand &Root) const { MIB.addReg(HighBits); }, [=](MachineInstrBuilder &MIB) { // soffset - const MachineMemOperand *MMO = *MI->memoperands_begin(); - const MachinePointerInfo &PtrInfo = MMO->getPointerInfo(); - - if (isStackPtrRelative(PtrInfo)) - MIB.addReg(Info->getStackPtrOffsetReg()); - else - MIB.addImm(0); + // Use constant zero for soffset and rely on eliminateFrameIndex + // to choose the appropriate frame register if need be. + MIB.addImm(0); }, [=](MachineInstrBuilder &MIB) { // offset MIB.addImm(Offset & 4095); @@ -3722,15 +3718,9 @@ AMDGPUInstructionSelector::selectMUBUFScratchOffen(MachineOperand &Root) const { MIB.addReg(VAddr); }, [=](MachineInstrBuilder &MIB) { // soffset - // If we don't know this private access is a local stack object, it - // needs to be relative to the entry point's scratch wave offset. - // TODO: Should split large offsets that don't fit like above. - // TODO: Don't use scratch wave offset just because the offset - // didn't fit. - if (!Info->isEntryFunction() && FI.hasValue()) - MIB.addReg(Info->getStackPtrOffsetReg()); - else - MIB.addImm(0); + // Use constant zero for soffset and rely on eliminateFrameIndex + // to choose the appropriate frame register if need be. + MIB.addImm(0); }, [=](MachineInstrBuilder &MIB) { // offset MIB.addImm(Offset); diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp index d22bdb791535..d5fa9afded27 100644 --- a/llvm/lib/T
[llvm-branch-commits] [llvm] d68458b - [GlobalISel] Base implementation for sret demotion.
Author: Christudasan Devadasan Date: 2021-01-06T10:30:50+05:30 New Revision: d68458bd56d9d55b05fca5447891aa8752d70509 URL: https://github.com/llvm/llvm-project/commit/d68458bd56d9d55b05fca5447891aa8752d70509 DIFF: https://github.com/llvm/llvm-project/commit/d68458bd56d9d55b05fca5447891aa8752d70509.diff LOG: [GlobalISel] Base implementation for sret demotion. If the return values can't be lowered to registers SelectionDAG performs the sret demotion. This patch contains the basic implementation for the same in the GlobalISel pipeline. Furthermore, targets should bring relevant changes during lowerFormalArguments, lowerReturn and lowerCall to make use of this feature. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D92953 Added: Modified: llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h llvm/lib/CodeGen/GlobalISel/CallLowering.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp llvm/lib/Target/AArch64/GISel/AArch64CallLowering.h llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h llvm/lib/Target/ARM/ARMCallLowering.cpp llvm/lib/Target/ARM/ARMCallLowering.h llvm/lib/Target/Mips/MipsCallLowering.cpp llvm/lib/Target/Mips/MipsCallLowering.h llvm/lib/Target/PowerPC/GISel/PPCCallLowering.cpp llvm/lib/Target/PowerPC/GISel/PPCCallLowering.h llvm/lib/Target/RISCV/RISCVCallLowering.cpp llvm/lib/Target/RISCV/RISCVCallLowering.h llvm/lib/Target/X86/X86CallLowering.cpp llvm/lib/Target/X86/X86CallLowering.h llvm/tools/llvm-exegesis/lib/Assembler.cpp Removed: diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h index dbd7e00c429a..dff73d185114 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h @@ -19,6 +19,7 @@ #include "llvm/CodeGen/CallingConvLower.h" #include "llvm/CodeGen/MachineOperand.h" #include "llvm/CodeGen/TargetCallingConv.h" +#include "llvm/IR/Attributes.h" #include "llvm/IR/CallingConv.h" #include "llvm/IR/Type.h" #include "llvm/Support/ErrorHandling.h" @@ -31,6 +32,7 @@ namespace llvm { class CallBase; class DataLayout; class Function; +class FunctionLoweringInfo; class MachineIRBuilder; struct MachinePointerInfo; class MachineRegisterInfo; @@ -42,21 +44,30 @@ class CallLowering { virtual void anchor(); public: - struct ArgInfo { + struct BaseArgInfo { +Type *Ty; +SmallVector Flags; +bool IsFixed; + +BaseArgInfo(Type *Ty, +ArrayRef Flags = ArrayRef(), +bool IsFixed = true) +: Ty(Ty), Flags(Flags.begin(), Flags.end()), IsFixed(IsFixed) {} + +BaseArgInfo() : Ty(nullptr), IsFixed(false) {} + }; + + struct ArgInfo : public BaseArgInfo { SmallVector Regs; // If the argument had to be split into multiple parts according to the // target calling convention, then this contains the original vregs // if the argument was an incoming arg. SmallVector OrigRegs; -Type *Ty; -SmallVector Flags; -bool IsFixed; ArgInfo(ArrayRef Regs, Type *Ty, ArrayRef Flags = ArrayRef(), bool IsFixed = true) -: Regs(Regs.begin(), Regs.end()), Ty(Ty), - Flags(Flags.begin(), Flags.end()), IsFixed(IsFixed) { +: BaseArgInfo(Ty, Flags, IsFixed), Regs(Regs.begin(), Regs.end()) { if (!Regs.empty() && Flags.empty()) this->Flags.push_back(ISD::ArgFlagsTy()); // FIXME: We should have just one way of saying "no register". @@ -65,7 +76,7 @@ class CallLowering { "only void types should have no register"); } -ArgInfo() : Ty(nullptr), IsFixed(false) {} +ArgInfo() : BaseArgInfo() {} }; struct CallLoweringInfo { @@ -101,6 +112,15 @@ class CallLowering { /// True if the call is to a vararg function. bool IsVarArg = false; + +/// True if the function's return value can be lowered to registers. +bool CanLowerReturn = true; + +/// VReg to hold the hidden sret parameter. +Register DemoteRegister; + +/// The stack index for sret demotion. +int DemoteStackIndex; }; /// Argument handling is mostly uniform between the four places that @@ -292,20 +312,73 @@ class CallLowering { return false; } + /// Load the returned value from the stack into virtual registers in \p VRegs. + /// It uses the frame index \p FI and the start offset from \p DemoteReg. + /// The loaded data size will be determined from \p RetTy. + void insertSRetLoads(MachineIRBuilder &MIRBuilder, Type *RetTy, + ArrayRef VRegs, Register DemoteReg, + int FI) const; + + /// Store the return value given by \p VRegs into stack starting at the offset +
[llvm-branch-commits] [llvm] ae25a39 - AMDGPU/GlobalISel: Enable sret demotion
Author: Christudasan Devadasan Date: 2021-01-08T10:56:35+05:30 New Revision: ae25a397e9de833ffbd5d8e3b480086404625cb7 URL: https://github.com/llvm/llvm-project/commit/ae25a397e9de833ffbd5d8e3b480086404625cb7 DIFF: https://github.com/llvm/llvm-project/commit/ae25a397e9de833ffbd5d8e3b480086404625cb7.diff LOG: AMDGPU/GlobalISel: Enable sret demotion Added: Modified: llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h llvm/lib/CodeGen/GlobalISel/CallLowering.cpp llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp llvm/lib/Target/AMDGPU/AMDGPUCallLowering.h llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call-return-values.ll Removed: diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h index dff73d185114..57ff3900ef25 100644 --- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h +++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h @@ -358,8 +358,8 @@ class CallLowering { /// described by \p Outs can fit into the return registers. If false /// is returned, an sret-demotion is performed. virtual bool canLowerReturn(MachineFunction &MF, CallingConv::ID CallConv, - SmallVectorImpl &Outs, bool IsVarArg, - LLVMContext &Context) const { + SmallVectorImpl &Outs, + bool IsVarArg) const { return true; } diff --git a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp index e1591a4bf19b..a6d4ea76 100644 --- a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp +++ b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp @@ -95,8 +95,7 @@ bool CallLowering::lowerCall(MachineIRBuilder &MIRBuilder, const CallBase &CB, SmallVector SplitArgs; getReturnInfo(CallConv, RetTy, CB.getAttributes(), SplitArgs, DL); - Info.CanLowerReturn = - canLowerReturn(MF, CallConv, SplitArgs, IsVarArg, RetTy->getContext()); + Info.CanLowerReturn = canLowerReturn(MF, CallConv, SplitArgs, IsVarArg); if (!Info.CanLowerReturn) { // Callee requires sret demotion. @@ -592,8 +591,7 @@ bool CallLowering::checkReturnTypeForCallConv(MachineFunction &MF) const { SmallVector SplitArgs; getReturnInfo(CallConv, ReturnType, F.getAttributes(), SplitArgs, MF.getDataLayout()); - return canLowerReturn(MF, CallConv, SplitArgs, F.isVarArg(), -ReturnType->getContext()); + return canLowerReturn(MF, CallConv, SplitArgs, F.isVarArg()); } bool CallLowering::analyzeArgInfo(CCState &CCState, diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp index 3b6e263ee6d8..b86052e3a14e 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp @@ -20,6 +20,7 @@ #include "SIMachineFunctionInfo.h" #include "SIRegisterInfo.h" #include "llvm/CodeGen/Analysis.h" +#include "llvm/CodeGen/FunctionLoweringInfo.h" #include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h" #include "llvm/IR/IntrinsicsAMDGPU.h" @@ -420,6 +421,22 @@ static void unpackRegsToOrigType(MachineIRBuilder &B, B.buildUnmerge(UnmergeResults, UnmergeSrc); } +bool AMDGPUCallLowering::canLowerReturn(MachineFunction &MF, +CallingConv::ID CallConv, +SmallVectorImpl &Outs, +bool IsVarArg) const { + // For shaders. Vector types should be explicitly handled by CC. + if (AMDGPU::isEntryFunctionCC(CallConv)) +return true; + + SmallVector ArgLocs; + const SITargetLowering &TLI = *getTLI(); + CCState CCInfo(CallConv, IsVarArg, MF, ArgLocs, + MF.getFunction().getContext()); + + return checkReturn(CCInfo, Outs, TLI.CCAssignFnForReturn(CallConv, IsVarArg)); +} + /// Lower the return value for the already existing \p Ret. This assumes that /// \p B's insertion point is correct. bool AMDGPUCallLowering::lowerReturnVal(MachineIRBuilder &B, @@ -533,7 +550,9 @@ bool AMDGPUCallLowering::lowerReturn(MachineIRBuilder &B, const Value *Val, Ret.addUse(ReturnAddrVReg); } - if (!lowerReturnVal(B, Val, VRegs, Ret)) + if (!FLI.CanLowerReturn) +insertSRetStores(B, Val->getType(), VRegs, FLI.DemoteRegister); + else if (!lowerReturnVal(B, Val, VRegs, Ret)) return false; if (ReturnOpc == AMDGPU::S_SETPC_B64_return) { @@ -872,6 +891,11 @@ bool AMDGPUCallLowering::lowerFormalArguments( unsigned Idx = 0; unsigned PSInputNum = 0; + // Insert the hidden sret parameter if the return value won't fit in the + // return registers. + if (!FLI.CanLowerReturn) +insertSRetIncomingArgument(F, SplitArgs, FLI.DemoteRegister, MRI, DL); + for (auto &Arg : F.args()) {
[llvm-branch-commits] [llvm] [MIR] Add missing noteNewVirtualRegister callbacks (PR #111634)
https://github.com/cdevadas commented: The changes you made in llvm/lib/CodeGen/MIRParser/MIRParser.cpp are not related to this PR. Create a separate NFC patch for it. https://github.com/llvm/llvm-project/pull/111634 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILowerSGPRSpills] Updated the correct pass dependency (PR #109937)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/109937 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][SILowerSGPRSpills] Updated the correct pass dependency. (PR #109937)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/109937 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108507 >From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 12 Sep 2024 23:38:09 +0530 Subject: [PATCH 1/3] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. --- .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++--- llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 4 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 8 +-- llvm/lib/CodeGen/MachineCombiner.cpp | 8 +-- llvm/lib/CodeGen/MachineTraceMetrics.cpp | 62 +++ llvm/lib/Passes/PassBuilder.cpp | 1 + .../AArch64/AArch64ConditionalCompares.cpp| 8 +-- .../AArch64/AArch64StorePairSuppress.cpp | 6 +- 9 files changed, 119 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h index c7d97597d551cd..36718f80a1b6dd 100644 --- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h +++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h @@ -46,12 +46,13 @@ #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H #define LLVM_CODEGEN_MACHINETRACEMETRICS_H -#include "llvm/ADT/SparseSet.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/TargetSchedule.h" namespace llvm { @@ -93,7 +94,7 @@ enum class MachineTraceStrategy { TS_NumStrategies }; -class MachineTraceMetrics : public MachineFunctionPass { +class MachineTraceMetrics { const MachineFunction *MF = nullptr; const TargetInstrInfo *TII = nullptr; const TargetRegisterInfo *TRI = nullptr; @@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } - MachineTraceMetrics(); + explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI); + ~MachineTraceMetrics(); - void getAnalysisUsage(AnalysisUsage&) const override; - bool runOnMachineFunction(MachineFunction&) override; - void releaseMemory() override; - void verifyAnalysis() const override; + void init(MachineFunction &Func, const MachineLoopInfo &LI); + void clear(); /// Per-basic block information that doesn't depend on the trace through the /// block. @@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass { /// Call Ensemble::getTrace() again to update any trace handles. void invalidate(const MachineBasicBlock *MBB); + /// Handle invalidation explicitly. + bool invalidate(MachineFunction &, const PreservedAnalyses &PA, + MachineFunctionAnalysisManager::Invalidator &); + + void verifyAnalysis() const; + private: // One entry per basic block, indexed by block number. SmallVector BlockInfo; @@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS, return OS; } +class MachineTraceMetricsAnalysis +: public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineTraceMetrics; + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +/// Verifier pass for \c MachineTraceMetrics. +struct MachineTraceMetricsVerifierPass +: PassInfoMixin { + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } +}; + +class MachineTraceMetricsWrapperPass : public MachineFunctionPass { +public: + static char ID; + MachineTraceMetrics MTM; + + MachineTraceMetricsWrapperPass(); + + void getAnalysisUsage(AnalysisUsage &) const override; + bool runOnMachineFunction(MachineFunction &) override; + void releaseMemory() override { MTM.clear(); } + void verifyAnalysis() const override { MTM.verifyAnalysis(); } + MachineTraceMetrics &getMTM() { return MTM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 6a75dc0285cc61..5ed0ad98a2a72d 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &); void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &); void initializeMachineSchedulerPass(PassRegistry &); void initializeMachineSinkingPass(PassRegistry &); -void initializeMachineTraceMetricsPass(PassRegistry &);
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108507 >From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 12 Sep 2024 23:38:09 +0530 Subject: [PATCH 1/3] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. --- .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++--- llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 4 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 8 +-- llvm/lib/CodeGen/MachineCombiner.cpp | 8 +-- llvm/lib/CodeGen/MachineTraceMetrics.cpp | 62 +++ llvm/lib/Passes/PassBuilder.cpp | 1 + .../AArch64/AArch64ConditionalCompares.cpp| 8 +-- .../AArch64/AArch64StorePairSuppress.cpp | 6 +- 9 files changed, 119 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h index c7d97597d551cd..36718f80a1b6dd 100644 --- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h +++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h @@ -46,12 +46,13 @@ #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H #define LLVM_CODEGEN_MACHINETRACEMETRICS_H -#include "llvm/ADT/SparseSet.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/TargetSchedule.h" namespace llvm { @@ -93,7 +94,7 @@ enum class MachineTraceStrategy { TS_NumStrategies }; -class MachineTraceMetrics : public MachineFunctionPass { +class MachineTraceMetrics { const MachineFunction *MF = nullptr; const TargetInstrInfo *TII = nullptr; const TargetRegisterInfo *TRI = nullptr; @@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } - MachineTraceMetrics(); + explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI); + ~MachineTraceMetrics(); - void getAnalysisUsage(AnalysisUsage&) const override; - bool runOnMachineFunction(MachineFunction&) override; - void releaseMemory() override; - void verifyAnalysis() const override; + void init(MachineFunction &Func, const MachineLoopInfo &LI); + void clear(); /// Per-basic block information that doesn't depend on the trace through the /// block. @@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass { /// Call Ensemble::getTrace() again to update any trace handles. void invalidate(const MachineBasicBlock *MBB); + /// Handle invalidation explicitly. + bool invalidate(MachineFunction &, const PreservedAnalyses &PA, + MachineFunctionAnalysisManager::Invalidator &); + + void verifyAnalysis() const; + private: // One entry per basic block, indexed by block number. SmallVector BlockInfo; @@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS, return OS; } +class MachineTraceMetricsAnalysis +: public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineTraceMetrics; + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +/// Verifier pass for \c MachineTraceMetrics. +struct MachineTraceMetricsVerifierPass +: PassInfoMixin { + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } +}; + +class MachineTraceMetricsWrapperPass : public MachineFunctionPass { +public: + static char ID; + MachineTraceMetrics MTM; + + MachineTraceMetricsWrapperPass(); + + void getAnalysisUsage(AnalysisUsage &) const override; + bool runOnMachineFunction(MachineFunction &) override; + void releaseMemory() override { MTM.clear(); } + void verifyAnalysis() const override { MTM.verifyAnalysis(); } + MachineTraceMetrics &getMTM() { return MTM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 6a75dc0285cc61..5ed0ad98a2a72d 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &); void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &); void initializeMachineSchedulerPass(PassRegistry &); void initializeMachineSinkingPass(PassRegistry &); -void initializeMachineTraceMetricsPass(PassRegistry &);
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108508 >From 9f27fd5fbaa0c9a9075d074d1915ea0cc65e3b07 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index 99421bdf769ffa..bbbf99626098a6 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 5ed0ad98a2a72d..1374880b6a716b 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 1d7084354455c7..15c6e2de6488c4 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION_P
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108508 >From 9f27fd5fbaa0c9a9075d074d1915ea0cc65e3b07 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index 99421bdf769ffa..bbbf99626098a6 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 5ed0ad98a2a72d..1374880b6a716b 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 1d7084354455c7..15c6e2de6488c4 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION_P
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108514 >From 10462c6c2e6b087575d1f8f0c94c38ddebb013a9 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 13:53:01 +0530 Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts. --- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h | 1 + 2 files changed, 8 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 8a3c1a92a63a2d..affb8a2654c1ab 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass &addPass) const { addPass(RequireAnalysisPass()); } +void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const { + if (EnableEarlyIfConversion) +addPass(EarlyIfConverterPass()); + + Base::addILPOpts(addPass); +} + void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass, CreateMCStreamer) const { // TODO: Add AsmPrinter. diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h index af8476bc21ec61..d8a5111e5898d7 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h @@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder void addIRPasses(AddIRPass &) const; void addCodeGenPrepare(AddIRPass &) const; void addPreISel(AddIRPass &addPass) const; + void addILPOpts(AddMachinePass &) const; void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const; Error addInstSelector(AddMachinePass &) const; void addMachineSSAOptimization(AddMachinePass &) const; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108507 >From e0e4e978c06a2c78b31382274527201e03082e00 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 12 Sep 2024 23:38:09 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. --- .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++--- llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 4 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 8 +-- llvm/lib/CodeGen/MachineCombiner.cpp | 8 +-- llvm/lib/CodeGen/MachineTraceMetrics.cpp | 62 +++ llvm/lib/Passes/PassBuilder.cpp | 1 + .../AArch64/AArch64ConditionalCompares.cpp| 8 +-- .../AArch64/AArch64StorePairSuppress.cpp | 6 +- 9 files changed, 119 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h index c7d97597d551cd..36718f80a1b6dd 100644 --- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h +++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h @@ -46,12 +46,13 @@ #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H #define LLVM_CODEGEN_MACHINETRACEMETRICS_H -#include "llvm/ADT/SparseSet.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/TargetSchedule.h" namespace llvm { @@ -93,7 +94,7 @@ enum class MachineTraceStrategy { TS_NumStrategies }; -class MachineTraceMetrics : public MachineFunctionPass { +class MachineTraceMetrics { const MachineFunction *MF = nullptr; const TargetInstrInfo *TII = nullptr; const TargetRegisterInfo *TRI = nullptr; @@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } - MachineTraceMetrics(); + explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI); + ~MachineTraceMetrics(); - void getAnalysisUsage(AnalysisUsage&) const override; - bool runOnMachineFunction(MachineFunction&) override; - void releaseMemory() override; - void verifyAnalysis() const override; + void init(MachineFunction &Func, const MachineLoopInfo &LI); + void clear(); /// Per-basic block information that doesn't depend on the trace through the /// block. @@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass { /// Call Ensemble::getTrace() again to update any trace handles. void invalidate(const MachineBasicBlock *MBB); + /// Handle invalidation explicitly. + bool invalidate(MachineFunction &, const PreservedAnalyses &PA, + MachineFunctionAnalysisManager::Invalidator &); + + void verifyAnalysis() const; + private: // One entry per basic block, indexed by block number. SmallVector BlockInfo; @@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS, return OS; } +class MachineTraceMetricsAnalysis +: public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineTraceMetrics; + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +/// Verifier pass for \c MachineTraceMetrics. +struct MachineTraceMetricsVerifierPass +: PassInfoMixin { + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } +}; + +class MachineTraceMetricsWrapperPass : public MachineFunctionPass { +public: + static char ID; + MachineTraceMetrics MTM; + + MachineTraceMetricsWrapperPass(); + + void getAnalysisUsage(AnalysisUsage &) const override; + bool runOnMachineFunction(MachineFunction &) override; + void releaseMemory() override { MTM.clear(); } + void verifyAnalysis() const override { MTM.verifyAnalysis(); } + MachineTraceMetrics &getMTM() { return MTM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 6a75dc0285cc61..5ed0ad98a2a72d 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &); void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &); void initializeMachineSchedulerPass(PassRegistry &); void initializeMachineSinkingPass(PassRegistry &); -void initializeMachineTraceMetricsPass(PassRegistry &);
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108508 >From dc7a71f17436ab0ce96546cef7639ef2cffd173f Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index 99421bdf769ffa..bbbf99626098a6 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 5ed0ad98a2a72d..1374880b6a716b 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 1d7084354455c7..15c6e2de6488c4 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION_P
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108514 >From 0be0bb79f9e6bae184905f753bc5bd5c73eb1e56 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 13:53:01 +0530 Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts. --- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h | 1 + 2 files changed, 8 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 8a3c1a92a63a2d..affb8a2654c1ab 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass &addPass) const { addPass(RequireAnalysisPass()); } +void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const { + if (EnableEarlyIfConversion) +addPass(EarlyIfConverterPass()); + + Base::addILPOpts(addPass); +} + void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass, CreateMCStreamer) const { // TODO: Add AsmPrinter. diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h index af8476bc21ec61..d8a5111e5898d7 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h @@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder void addIRPasses(AddIRPass &) const; void addCodeGenPrepare(AddIRPass &) const; void addPreISel(AddIRPass &addPass) const; + void addILPOpts(AddMachinePass &) const; void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const; Error addInstSelector(AddMachinePass &) const; void addMachineSSAOptimization(AddMachinePass &) const; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108508 >From f692bb7b707610ebd1ad5f61943802aaef723f67 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 12:22:03 +0530 Subject: [PATCH 1/2] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. --- llvm/include/llvm/CodeGen/EarlyIfConversion.h | 24 ++ llvm/include/llvm/CodeGen/Passes.h| 2 +- llvm/include/llvm/InitializePasses.h | 2 +- llvm/include/llvm/Passes/CodeGenPassBuilder.h | 1 + .../llvm/Passes/MachinePassRegistry.def | 2 +- llvm/lib/CodeGen/CodeGen.cpp | 2 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 79 ++- llvm/lib/CodeGen/TargetPassConfig.cpp | 4 +- llvm/lib/Passes/PassBuilder.cpp | 1 + .../Target/AArch64/AArch64TargetMachine.cpp | 2 +- .../lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 2 +- llvm/lib/Target/PowerPC/PPCTargetMachine.cpp | 2 +- .../Target/SystemZ/SystemZTargetMachine.cpp | 2 +- llvm/lib/Target/X86/X86TargetMachine.cpp | 2 +- .../early-ifcvt-likely-predictable.mir| 1 + .../AArch64/early-ifcvt-regclass-mismatch.mir | 1 + .../AArch64/early-ifcvt-same-value.mir| 1 + .../CodeGen/PowerPC/early-ifcvt-no-isel.mir | 2 + 18 files changed, 102 insertions(+), 30 deletions(-) create mode 100644 llvm/include/llvm/CodeGen/EarlyIfConversion.h diff --git a/llvm/include/llvm/CodeGen/EarlyIfConversion.h b/llvm/include/llvm/CodeGen/EarlyIfConversion.h new file mode 100644 index 00..78bf12ade02c3d --- /dev/null +++ b/llvm/include/llvm/CodeGen/EarlyIfConversion.h @@ -0,0 +1,24 @@ +//===- llvm/CodeGen/EarlyIfConversion.h -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===--===// + +#ifndef LLVM_CODEGEN_EARLYIFCONVERSION_H +#define LLVM_CODEGEN_EARLYIFCONVERSION_H + +#include "llvm/CodeGen/MachinePassManager.h" + +namespace llvm { + +class EarlyIfConverterPass : public PassInfoMixin { +public: + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); +}; + +} // namespace llvm + +#endif // LLVM_CODEGEN_EARLYIFCONVERSION_H diff --git a/llvm/include/llvm/CodeGen/Passes.h b/llvm/include/llvm/CodeGen/Passes.h index 99421bdf769ffa..bbbf99626098a6 100644 --- a/llvm/include/llvm/CodeGen/Passes.h +++ b/llvm/include/llvm/CodeGen/Passes.h @@ -273,7 +273,7 @@ namespace llvm { /// EarlyIfConverter - This pass performs if-conversion on SSA form by /// inserting cmov instructions. - extern char &EarlyIfConverterID; + extern char &EarlyIfConverterLegacyID; /// EarlyIfPredicator - This pass performs if-conversion on SSA form by /// predicating if/else block and insert select at the join point. diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 5ed0ad98a2a72d..1374880b6a716b 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -98,7 +98,7 @@ void initializeDominatorTreeWrapperPassPass(PassRegistry &); void initializeDwarfEHPrepareLegacyPassPass(PassRegistry &); void initializeEarlyCSELegacyPassPass(PassRegistry &); void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry &); -void initializeEarlyIfConverterPass(PassRegistry &); +void initializeEarlyIfConverterLegacyPass(PassRegistry &); void initializeEarlyIfPredicatorPass(PassRegistry &); void initializeEarlyMachineLICMPass(PassRegistry &); void initializeEarlyTailDuplicatePass(PassRegistry &); diff --git a/llvm/include/llvm/Passes/CodeGenPassBuilder.h b/llvm/include/llvm/Passes/CodeGenPassBuilder.h index 0d45df08cb0ca7..9ef6e39dbb1cdd 100644 --- a/llvm/include/llvm/Passes/CodeGenPassBuilder.h +++ b/llvm/include/llvm/Passes/CodeGenPassBuilder.h @@ -27,6 +27,7 @@ #include "llvm/CodeGen/CodeGenPrepare.h" #include "llvm/CodeGen/DeadMachineInstructionElim.h" #include "llvm/CodeGen/DwarfEHPrepare.h" +#include "llvm/CodeGen/EarlyIfConversion.h" #include "llvm/CodeGen/ExpandLargeDivRem.h" #include "llvm/CodeGen/ExpandLargeFpConvert.h" #include "llvm/CodeGen/ExpandMemCmp.h" diff --git a/llvm/include/llvm/Passes/MachinePassRegistry.def b/llvm/include/llvm/Passes/MachinePassRegistry.def index 1d7084354455c7..15c6e2de6488c4 100644 --- a/llvm/include/llvm/Passes/MachinePassRegistry.def +++ b/llvm/include/llvm/Passes/MachinePassRegistry.def @@ -128,6 +128,7 @@ MACHINE_FUNCTION_ANALYSIS("slot-indexes", SlotIndexesAnalysis()) #ifndef MACHINE_FUNCTION_PASS #define MACHINE_FUNCTION_PASS(NAME, CREATE_PASS) #endif +MACHINE_FUNCTION_PASS("early-ifcvt", EarlyIfConverterPass()) MACHINE_FUNCTION_PASS("dead-mi-elimination", DeadMachineInstructionElimPass()) MACHINE_FUNCTION_P
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108507 >From eabec10a9a7daf537006ff5d580b798a285e33e3 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Thu, 12 Sep 2024 23:38:09 +0530 Subject: [PATCH 1/4] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. --- .../llvm/CodeGen/MachineTraceMetrics.h| 58 ++--- llvm/include/llvm/InitializePasses.h | 2 +- .../llvm/Passes/MachinePassRegistry.def | 4 +- llvm/lib/CodeGen/EarlyIfConversion.cpp| 8 +-- llvm/lib/CodeGen/MachineCombiner.cpp | 8 +-- llvm/lib/CodeGen/MachineTraceMetrics.cpp | 62 +++ llvm/lib/Passes/PassBuilder.cpp | 1 + .../AArch64/AArch64ConditionalCompares.cpp| 8 +-- .../AArch64/AArch64StorePairSuppress.cpp | 6 +- 9 files changed, 119 insertions(+), 38 deletions(-) diff --git a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h index c7d97597d551cd..36718f80a1b6dd 100644 --- a/llvm/include/llvm/CodeGen/MachineTraceMetrics.h +++ b/llvm/include/llvm/CodeGen/MachineTraceMetrics.h @@ -46,12 +46,13 @@ #ifndef LLVM_CODEGEN_MACHINETRACEMETRICS_H #define LLVM_CODEGEN_MACHINETRACEMETRICS_H -#include "llvm/ADT/SparseSet.h" #include "llvm/ADT/ArrayRef.h" #include "llvm/ADT/DenseMap.h" #include "llvm/ADT/SmallVector.h" +#include "llvm/ADT/SparseSet.h" #include "llvm/CodeGen/MachineBasicBlock.h" #include "llvm/CodeGen/MachineFunctionPass.h" +#include "llvm/CodeGen/MachinePassManager.h" #include "llvm/CodeGen/TargetSchedule.h" namespace llvm { @@ -93,7 +94,7 @@ enum class MachineTraceStrategy { TS_NumStrategies }; -class MachineTraceMetrics : public MachineFunctionPass { +class MachineTraceMetrics { const MachineFunction *MF = nullptr; const TargetInstrInfo *TII = nullptr; const TargetRegisterInfo *TRI = nullptr; @@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } - MachineTraceMetrics(); + explicit MachineTraceMetrics(MachineFunction &MF, const MachineLoopInfo &LI); + ~MachineTraceMetrics(); - void getAnalysisUsage(AnalysisUsage&) const override; - bool runOnMachineFunction(MachineFunction&) override; - void releaseMemory() override; - void verifyAnalysis() const override; + void init(MachineFunction &Func, const MachineLoopInfo &LI); + void clear(); /// Per-basic block information that doesn't depend on the trace through the /// block. @@ -400,6 +404,12 @@ class MachineTraceMetrics : public MachineFunctionPass { /// Call Ensemble::getTrace() again to update any trace handles. void invalidate(const MachineBasicBlock *MBB); + /// Handle invalidation explicitly. + bool invalidate(MachineFunction &, const PreservedAnalyses &PA, + MachineFunctionAnalysisManager::Invalidator &); + + void verifyAnalysis() const; + private: // One entry per basic block, indexed by block number. SmallVector BlockInfo; @@ -435,6 +445,38 @@ inline raw_ostream &operator<<(raw_ostream &OS, return OS; } +class MachineTraceMetricsAnalysis +: public AnalysisInfoMixin { + friend AnalysisInfoMixin; + static AnalysisKey Key; + +public: + using Result = MachineTraceMetrics; + Result run(MachineFunction &MF, MachineFunctionAnalysisManager &MFAM); +}; + +/// Verifier pass for \c MachineTraceMetrics. +struct MachineTraceMetricsVerifierPass +: PassInfoMixin { + PreservedAnalyses run(MachineFunction &MF, +MachineFunctionAnalysisManager &MFAM); + static bool isRequired() { return true; } +}; + +class MachineTraceMetricsWrapperPass : public MachineFunctionPass { +public: + static char ID; + MachineTraceMetrics MTM; + + MachineTraceMetricsWrapperPass(); + + void getAnalysisUsage(AnalysisUsage &) const override; + bool runOnMachineFunction(MachineFunction &) override; + void releaseMemory() override { MTM.clear(); } + void verifyAnalysis() const override { MTM.verifyAnalysis(); } + MachineTraceMetrics &getMTM() { return MTM; } +}; + } // end namespace llvm #endif // LLVM_CODEGEN_MACHINETRACEMETRICS_H diff --git a/llvm/include/llvm/InitializePasses.h b/llvm/include/llvm/InitializePasses.h index 6a75dc0285cc61..5ed0ad98a2a72d 100644 --- a/llvm/include/llvm/InitializePasses.h +++ b/llvm/include/llvm/InitializePasses.h @@ -209,7 +209,7 @@ void initializeMachineRegionInfoPassPass(PassRegistry &); void initializeMachineSanitizerBinaryMetadataPass(PassRegistry &); void initializeMachineSchedulerPass(PassRegistry &); void initializeMachineSinkingPass(PassRegistry &); -void initializeMachineTraceMetricsPass(PassRegistry &);
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
https://github.com/cdevadas updated https://github.com/llvm/llvm-project/pull/108514 >From 037b15166e575b59e115795e5b4fd8f4065b4483 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Fri, 13 Sep 2024 13:53:01 +0530 Subject: [PATCH] [AMDGPU][NewPM] Fill out addILPOpts. --- llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp | 7 +++ llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h | 1 + 2 files changed, 8 insertions(+) diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp index 8a3c1a92a63a2d..affb8a2654c1ab 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp @@ -1994,6 +1994,13 @@ void AMDGPUCodeGenPassBuilder::addPreISel(AddIRPass &addPass) const { addPass(RequireAnalysisPass()); } +void AMDGPUCodeGenPassBuilder::addILPOpts(AddMachinePass &addPass) const { + if (EnableEarlyIfConversion) +addPass(EarlyIfConverterPass()); + + Base::addILPOpts(addPass); +} + void AMDGPUCodeGenPassBuilder::addAsmPrinter(AddMachinePass &addPass, CreateMCStreamer) const { // TODO: Add AsmPrinter. diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h index af8476bc21ec61..d8a5111e5898d7 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h +++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.h @@ -172,6 +172,7 @@ class AMDGPUCodeGenPassBuilder void addIRPasses(AddIRPass &) const; void addCodeGenPrepare(AddIRPass &) const; void addPreISel(AddIRPass &addPass) const; + void addILPOpts(AddMachinePass &) const; void addAsmPrinter(AddMachinePass &, CreateMCStreamer) const; Error addInstSelector(AddMachinePass &) const; void addMachineSSAOptimization(AddMachinePass &) const; ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)
@@ -684,8 +684,8 @@ class SIMachineFunctionInfo final : public AMDGPUMachineFunction, void setFlag(Register Reg, uint8_t Flag) { assert(Reg.isVirtual()); -if (VRegFlags.inBounds(Reg)) - VRegFlags[Reg] |= Flag; +VRegFlags.grow(Reg); cdevadas wrote: This change doesn't make sense to me. What will happen to the regular flow when it reaches from MRI createVirtualRegister? Isn't duplicating the size? https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/MachineRegisterInfo.cpp#L166 https://github.com/llvm/llvm-project/pull/110229 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)
https://github.com/cdevadas approved this pull request. https://github.com/llvm/llvm-project/pull/110229 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
cdevadas wrote: ### Merge activity * **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108507). https://github.com/llvm/llvm-project/pull/108507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Fill out addILPOpts. (PR #108514)
cdevadas wrote: ### Merge activity * **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108514). https://github.com/llvm/llvm-project/pull/108514 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port EarlyIfConversion pass to NPM. (PR #108508)
cdevadas wrote: ### Merge activity * **Oct 16, 3:38 AM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/108508). https://github.com/llvm/llvm-project/pull/108508 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port machine trace metrics analysis to new pass manager. (PR #108507)
@@ -102,19 +103,22 @@ class MachineTraceMetrics : public MachineFunctionPass { TargetSchedModel SchedModel; public: + friend class MachineTraceMetricsWrapperPass; friend class Ensemble; friend class Trace; class Ensemble; - static char ID; + // For legacy pass. + MachineTraceMetrics() { +std::fill(std::begin(Ensembles), std::end(Ensembles), nullptr); + } cdevadas wrote: It isn't possible to move out the Ensembles pointer initialization from the constructors. Certain tests crashed while the destructor invokes clear() that tries to delete the Ensemble pointers (some garbage value). The default constructor for these tests doesn't appropriately clear the object's members. The tests that crashed don't contain any function definitions, but only some global declarations. So the run() instances won't be invoked for clearing these pointers. Also, they should point to the dynamically allocated addresses (otherwise the initial null addresses) before the destructor is invoked. https://github.com/llvm/llvm-project/pull/108507 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)
@@ -684,8 +684,8 @@ class SIMachineFunctionInfo final : public AMDGPUMachineFunction, void setFlag(Register Reg, uint8_t Flag) { assert(Reg.isVirtual()); -if (VRegFlags.inBounds(Reg)) - VRegFlags[Reg] |= Flag; +VRegFlags.grow(Reg); cdevadas wrote: I guess, this change is unnecessary. Keep the inbounds check back. The following change in the other patch would make sure to grow VRegFlags for each virtual register it encountered. https://github.com/llvm/llvm-project/pull/110228/files#diff-c72079b2a595aca3300d5e3c15d227f81937f2745f7c5494fcf1fe9ba37d8828R1789 https://github.com/llvm/llvm-project/pull/110229 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Serialize WWM_REG vreg flag (PR #110229)
@@ -0,0 +1,16 @@ +# RUN: llc -mtriple=amdgcn -run-pass=none -o - %s | FileCheck %s cdevadas wrote: Mostly, the tests related to serialize options go in llvm/test/CodeGen/MIR/AMDGPU/machine-function-info-no-ir.mir. Additionally, a negative test is required for this serialized option and that should be a separate test like llvm/test/CodeGen/MIR/AMDGPU/sgpr-for-exec-copy-invalid-reg.mir https://github.com/llvm/llvm-project/pull/110229 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/114745 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)
@@ -151,6 +152,7 @@ MACHINE_FUNCTION_PASS("print", MACHINE_FUNCTION_PASS("print", MachineDominatorTreePrinterPass(dbgs())) MACHINE_FUNCTION_PASS("print", MachineLoopPrinterPass(dbgs())) +MACHINE_FUNCTION_PASS("print", MachineCycleInfoPrinterPass(dbgs())) cdevadas wrote: Alphabetical order. https://github.com/llvm/llvm-project/pull/114745 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [CodeGen][NewPM] Port MachineCycleInfo to NPM (PR #114745)
https://github.com/cdevadas commented: I guess this PR is for porting MachineCycleAnalysis. The run interface for legacy MachineCycle analysis is missing. The porting for the Print interface seems complete. Also, no test/runline to validate the MachineCycle analysis in the NPM path. I see only the runline for the print interface. https://github.com/llvm/llvm-project/pull/114745 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)
https://github.com/cdevadas created https://github.com/llvm/llvm-project/pull/117544 The COPY inserted for liverange split during sgpr-regalloc pipeline currently breaks the BB prolog during the subsequent vgpr-regalloc phase while spilling and/or splitting the vector liveranges. This patch fixes it by correctly including the the LR split instructions during sgpr-regalloc and wwm-regalloc pipelines into the BB prolog. >From 2e2b216d58e525e7dd65245ae2ee8b86750564c8 Mon Sep 17 00:00:00 2001 From: Christudasan Devadasan Date: Tue, 19 Nov 2024 12:14:05 +0530 Subject: [PATCH] [AMDGPU] Add liverange split instructions into BB Prolog The COPY inserted for liverange split during sgpr-regalloc pipeline currently breaks the BB prolog during the subsequent vgpr-regalloc phase while spilling and/or splitting the vector liveranges. This patch fixes it by correctly including the the LR split instructions during sgpr-regalloc and wwm-regalloc pipelines into the BB prolog. --- llvm/lib/Target/AMDGPU/SIInstrInfo.cpp| 34 - llvm/lib/Target/AMDGPU/SIInstrInfo.h | 2 + .../identical-subrange-spill-infloop.ll | 116 .../ran-out-of-sgprs-allocation-failure.mir | 128 +- 4 files changed, 148 insertions(+), 132 deletions(-) diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp index 4a94d690297949..204a575e2f64c1 100644 --- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp +++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp @@ -8956,6 +8956,30 @@ unsigned SIInstrInfo::getLiveRangeSplitOpcode(Register SrcReg, return AMDGPU::COPY; } +bool SIInstrInfo::canAddToBBProlog(const MachineInstr &MI) const { + uint16_t Opcode = MI.getOpcode(); + // Check if it is SGPR spill or wwm-register spill Opcode. + if (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode)) +return true; + + const MachineFunction *MF = MI.getMF(); + const MachineRegisterInfo &MRI = MF->getRegInfo(); + const SIMachineFunctionInfo *MFI = MF->getInfo(); + + // See if this is Liverange split instruction inserted for SGPR or + // wwm-register. The implicit def inserted for wwm-registers should also be + // included as they can appear at the bb begin. + bool IsLRSplitInst = MI.getFlag(MachineInstr::LRSplit); + if (!IsLRSplitInst && Opcode != AMDGPU::IMPLICIT_DEF) +return false; + + Register Reg = MI.getOperand(0).getReg(); + if (RI.isSGPRClass(RI.getRegClassForReg(MRI, Reg))) +return IsLRSplitInst; + + return MFI->isWWMReg(Reg); +} + bool SIInstrInfo::isBasicBlockPrologue(const MachineInstr &MI, Register Reg) const { // We need to handle instructions which may be inserted during register @@ -8964,20 +8988,16 @@ bool SIInstrInfo::isBasicBlockPrologue(const MachineInstr &MI, // needed by the prolog. However, the insertions for scalar registers can // always be placed at the BB top as they are independent of the exec mask // value. - const MachineFunction *MF = MI.getParent()->getParent(); bool IsNullOrVectorRegister = true; if (Reg) { +const MachineFunction *MF = MI.getMF(); const MachineRegisterInfo &MRI = MF->getRegInfo(); IsNullOrVectorRegister = !RI.isSGPRClass(RI.getRegClassForReg(MRI, Reg)); } - uint16_t Opcode = MI.getOpcode(); - const SIMachineFunctionInfo *MFI = MF->getInfo(); return IsNullOrVectorRegister && - (isSGPRSpill(Opcode) || isWWMRegSpillOpcode(Opcode) || - (Opcode == AMDGPU::IMPLICIT_DEF && - MFI->isWWMReg(MI.getOperand(0).getReg())) || - (!MI.isTerminator() && Opcode != AMDGPU::COPY && + (canAddToBBProlog(MI) || + (!MI.isTerminator() && MI.getOpcode() != AMDGPU::COPY && MI.modifiesRegister(AMDGPU::EXEC, &RI))); } diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h index e55418326a4bd0..ea1d16784645e1 100644 --- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h +++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h @@ -1348,6 +1348,8 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo { bool isBasicBlockPrologue(const MachineInstr &MI, Register Reg = Register()) const override; + bool canAddToBBProlog(const MachineInstr &MI) const; + MachineInstr *createPHIDestinationCopy(MachineBasicBlock &MBB, MachineBasicBlock::iterator InsPt, const DebugLoc &DL, Register Src, diff --git a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll index 5dff660912e402..d7c38f26957677 100644 --- a/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll +++ b/llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll @@ -176,39 +176,39 @@ define void @main(i1 %arg) #0 { ; CHECK-NEXT:v_readlane_b32 s17, v7, 37 ; CHECK-NEXT:v_readlane_b32 s18, v7, 38 ; CHECK-NEXT:v_readlane_b
[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)
cdevadas wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-downstack-mergeability-warning"; > >on Graphite. > https://graphite.dev/docs/merge-pull-requests";>Learn more * **#117544** https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/117544?utm_source=stack-comment-view-in-graphite"; target="_blank">(View in Graphite) * **#117543** https://app.graphite.dev/github/pr/llvm/llvm-project/117543?utm_source=stack-comment-icon"; target="_blank">https://static.graphite.dev/graphite-32x32-black.png"; alt="Graphite" width="10px" height="10px"/> * `main` This stack of pull requests is managed by https://graphite.dev?utm-source=stack-comment";>Graphite. Learn more about https://stacking.dev/?utm_source=stack-comment";>stacking. https://github.com/llvm/llvm-project/pull/117544 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [AMDGPU] Add liverange split instructions into BB Prolog (PR #117544)
https://github.com/cdevadas ready_for_review https://github.com/llvm/llvm-project/pull/117544 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/cdevadas edited https://github.com/llvm/llvm-project/pull/116166 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
https://github.com/cdevadas commented: I liked the general direction. Do you have at least one analysis pass ported using this Getter? https://github.com/llvm/llvm-project/pull/116166 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = PassManager; /// preserve. PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); +/// For migrating to new pass manager cdevadas wrote: Full stop at the end of the comment. https://github.com/llvm/llvm-project/pull/116166 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NewPM] Introduce MFAnalysisGetter for a common analysis getter (PR #116166)
@@ -236,6 +238,82 @@ using MachineFunctionPassManager = PassManager; /// preserve. PreservedAnalyses getMachineFunctionPassPreservedAnalyses(); +/// For migrating to new pass manager +/// Provides a common interface to fetch analyses instead of doing it twice in +/// the *LegacyPass::runOnMachineFunction and NPM Pass::run. +/// +/// NPM analyses must have the LegacyWrapper type to indicate which legacy +/// analysis to run. Legacy wrapper analyses must have `getResult()` method. +/// This can be added on a needs-to basis. +/// +/// Outer analyses passes(Module or Function) can also be requested through +/// `getAnalysis` or `getCachedAnalysis`. +class MFAnalysisGetter { +private: + Pass *LegacyPass = nullptr; + MachineFunctionAnalysisManager *MFAM = nullptr; + + template + using type_of_run = + typename function_traits::template arg_t<0>; + + template + static constexpr bool IsFunctionAnalysis = + std::is_same_v>; + + template + static constexpr bool IsModuleAnalysis = + std::is_same_v>; + +public: + MFAnalysisGetter(Pass *LegacyPass) : LegacyPass(LegacyPass) {} + MFAnalysisGetter(MachineFunctionAnalysisManager *MFAM) : MFAM(MFAM) {} + + /// Outer analyses requested from NPM will be cached results and can be null + template + typename AnalysisT::Result *getAnalysis(MachineFunction &MF) { +if (MFAM) { + // need a proxy to get the result for outer analyses cdevadas wrote: Capitalize the first character. https://github.com/llvm/llvm-project/pull/116166 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits