[llvm-branch-commits] [llvm] 639a50e - [AMDGPU] Precommit test case for D94010

2021-01-05 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2021-01-05T11:55:14Z New Revision: 639a50e2f138ed3e647b00809a2871a1b9ae9012 URL: https://github.com/llvm/llvm-project/commit/639a50e2f138ed3e647b00809a2871a1b9ae9012 DIFF: https://github.com/llvm/llvm-project/commit/639a50e2f138ed3e647b00809a2871a1b9ae9012.diff LOG: [AMD

[llvm-branch-commits] [llvm] 3914beb - [AMDGPU] Handle v_fmac_legacy_f32 in SIFoldOperands

2021-01-05 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2021-01-05T11:55:33Z New Revision: 3914bebe91f6b557e61d6d74117762f9043593e0 URL: https://github.com/llvm/llvm-project/commit/3914bebe91f6b557e61d6d74117762f9043593e0 DIFF: https://github.com/llvm/llvm-project/commit/3914bebe91f6b557e61d6d74117762f9043593e0.diff LOG: [AMD

[llvm-branch-commits] [llvm] 6dcf920 - [AMDGPU] Fix a urem combine test to test what it was supposed to

2021-01-11 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2021-01-11T13:32:34Z New Revision: 6dcf9207df11f5cdb0126e5c5632e93532642ed9 URL: https://github.com/llvm/llvm-project/commit/6dcf9207df11f5cdb0126e5c5632e93532642ed9 DIFF: https://github.com/llvm/llvm-project/commit/6dcf9207df11f5cdb0126e5c5632e93532642ed9.diff LOG: [AMD

[llvm-branch-commits] [llvm] f264f9a - [SlotIndexes] Fix and simplify basic block splitting

2021-01-12 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2021-01-12T10:50:14Z New Revision: f264f9ad7df538357dfc8c5f318c5c8b0df3d99f URL: https://github.com/llvm/llvm-project/commit/f264f9ad7df538357dfc8c5f318c5c8b0df3d99f DIFF: https://github.com/llvm/llvm-project/commit/f264f9ad7df538357dfc8c5f318c5c8b0df3d99f.diff LOG: [Slo

[llvm-branch-commits] [llvm] 000400c - Fix speling in comments. NFC.

2020-11-23 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-11-23T14:43:24Z New Revision: 000400ca0aeb32e347eefd110a4ed58ebc23d333 URL: https://github.com/llvm/llvm-project/commit/000400ca0aeb32e347eefd110a4ed58ebc23d333 DIFF: https://github.com/llvm/llvm-project/commit/000400ca0aeb32e347eefd110a4ed58ebc23d333.diff LOG: Fix

[llvm-branch-commits] [llvm] 4f87d30 - [AMDGPU] Introduce and use isGFX10Plus. NFC.

2020-11-26 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-11-26T09:02:36Z New Revision: 4f87d30a06dd08cec45cb595e9dbed6345c9a7c5 URL: https://github.com/llvm/llvm-project/commit/4f87d30a06dd08cec45cb595e9dbed6345c9a7c5 DIFF: https://github.com/llvm/llvm-project/commit/4f87d30a06dd08cec45cb595e9dbed6345c9a7c5.diff LOG: [AMD

[llvm-branch-commits] [llvm] 0d9166f - [LegacyPM] Remove unused undocumented parameter. NFC.

2020-11-27 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-11-27T10:41:38Z New Revision: 0d9166ff79578c7e98cef8c554e1342ece8efee6 URL: https://github.com/llvm/llvm-project/commit/0d9166ff79578c7e98cef8c554e1342ece8efee6 DIFF: https://github.com/llvm/llvm-project/commit/0d9166ff79578c7e98cef8c554e1342ece8efee6.diff LOG: [Leg

[llvm-branch-commits] [llvm] 68ed644 - [LegacyPM] Avoid a redundant map lookup in setLastUser. NFC.

2020-11-27 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-11-27T10:42:01Z New Revision: 68ed6447855632b954b55f63807481eaa44705df URL: https://github.com/llvm/llvm-project/commit/68ed6447855632b954b55f63807481eaa44705df DIFF: https://github.com/llvm/llvm-project/commit/68ed6447855632b954b55f63807481eaa44705df.diff LOG: [Leg

[llvm-branch-commits] [llvm] e20efa3 - [LegacyPM] Simplify PMTopLevelManager::collectLastUses. NFC.

2020-11-30 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-11-30T10:36:19Z New Revision: e20efa3dd5c75a79a47d40335aee0f63261f9c5b URL: https://github.com/llvm/llvm-project/commit/e20efa3dd5c75a79a47d40335aee0f63261f9c5b DIFF: https://github.com/llvm/llvm-project/commit/e20efa3dd5c75a79a47d40335aee0f63261f9c5b.diff LOG: [Leg

[llvm-branch-commits] [llvm] 839c963 - [AMDGPU] Simplify some generation checks. NFC.

2020-12-01 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-12-01T10:15:32Z New Revision: 839c9635edce4f6ed348b154a4e755ff8263d366 URL: https://github.com/llvm/llvm-project/commit/839c9635edce4f6ed348b154a4e755ff8263d366 DIFF: https://github.com/llvm/llvm-project/commit/839c9635edce4f6ed348b154a4e755ff8263d366.diff LOG: [AMD

[llvm-branch-commits] [llvm] 0f32e81 - [TableGen] Remove unused class RecordValResolver. NFC.

2020-12-03 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-12-03T13:36:58Z New Revision: 0f32e81407d33ab8886081db5d8ed2c7407a15e8 URL: https://github.com/llvm/llvm-project/commit/0f32e81407d33ab8886081db5d8ed2c7407a15e8 DIFF: https://github.com/llvm/llvm-project/commit/0f32e81407d33ab8886081db5d8ed2c7407a15e8.diff LOG: [Tab

[llvm-branch-commits] [llvm] 03663e4 - [AMDGPU] Add occupancy level tests for GFX10.3. NFC.

2020-12-08 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-12-08T14:15:01Z New Revision: 03663e4130d700c6c8ea28b357fcac4d31b617f7 URL: https://github.com/llvm/llvm-project/commit/03663e4130d700c6c8ea28b357fcac4d31b617f7 DIFF: https://github.com/llvm/llvm-project/commit/03663e4130d700c6c8ea28b357fcac4d31b617f7.diff LOG: [AMD

[llvm-branch-commits] [llvm] 4f25e53 - [AMDGPU] Make use of emitRemovedIntrinsicError. NFC.

2020-12-11 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-12-11T14:02:14Z New Revision: 4f25e5398211c603e765ab6c30ab35ad286d505f URL: https://github.com/llvm/llvm-project/commit/4f25e5398211c603e765ab6c30ab35ad286d505f DIFF: https://github.com/llvm/llvm-project/commit/4f25e5398211c603e765ab6c30ab35ad286d505f.diff LOG: [AMD

[llvm-branch-commits] [llvm] 07e92e6 - [AMDGPU] Make use of HasSMemRealTime predicate. NFC.

2020-12-14 Thread Jay Foad via llvm-branch-commits
Author: Jay Foad Date: 2020-12-14T16:34:57Z New Revision: 07e92e6b6002d95d438d24eaabf4452ad6e4ef8f URL: https://github.com/llvm/llvm-project/commit/07e92e6b6002d95d438d24eaabf4452ad6e4ef8f DIFF: https://github.com/llvm/llvm-project/commit/07e92e6b6002d95d438d24eaabf4452ad6e4ef8f.diff LOG: [AMD

[llvm-branch-commits] [llvm] release/19.x: AMDGPU: Fix inst-selection of large scratch offsets with sgpr base (#110256) (PR #110470)

2024-09-30 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/110470 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] MachineUniformityAnalysis: Improve isConstantOrUndefValuePhi (PR #112866)

2024-10-18 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. > Change existing code to match what LLVM-IR version is doing Yeah, looks reasonable to me. https://github.com/llvm/llvm-project/pull/112866 ___ llvm-branch-commits mailing list llvm-branch-commit

[llvm-branch-commits] [llvm] AMDGPU: Fix inst-selection of large scratch offsets with sgpr base (PR #110256)

2024-09-27 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. LGTM, thanks! Please also backport to release/19.x. https://github.com/llvm/llvm-project/pull/110256 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/c

[llvm-branch-commits] [llvm] AMDGPU: Fix inst-selection of large scratch offsets with sgpr base (PR #110256)

2024-09-27 Thread Jay Foad via llvm-branch-commits
@@ -1911,7 +1911,7 @@ bool AMDGPUDAGToDAGISel::SelectScratchSAddr(SDNode *Parent, SDValue Addr, 0); } - Offset = CurDAG->getTargetConstant(COffsetVal, DL, MVT::i16); + Offset = CurDAG->getTargetConstant(COffsetVal, DL, MVT::i32); jayfo

[llvm-branch-commits] [llvm] [AMDGPU] Introduce a "new" target feature `xf32-insts` (PR #115214)

2024-11-07 Thread Jay Foad via llvm-branch-commits
@@ -1110,6 +1110,13 @@ def FeatureRequiresCOV6 : SubtargetFeature<"requires-cov6", "Target Requires Code Object V6" >; +def FeatureXF32Insts : SubtargetFeature<"xf32-insts", + "HasXF32Insts", + "true", + "Has instructions that support xf32 format, such as " + "v_mfm

[llvm-branch-commits] [llvm] AMDGPU: Simplify demanded bits on readlane/writeline index arguments (PR #117963)

2024-11-28 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > If we have out of bounds indexing, these will now clamp down to > a low bit which may CSE with the operations on the low half of the > wave. Should mention that in the comment in the code. It was not clear to me why you would want to clamp constants. https://github.com/llvm/ll

[llvm-branch-commits] [llvm] [Linker] Remove a use of StructType::setBody. NFC. (PR #116653)

2024-11-18 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/116653 This falls out naturally after inlining finishType into its only remaining use. From 4140bc772f5930807cb2ea5b4b2aa945c57b699c Mon Sep 17 00:00:00 2001 From: Jay Foad Date: Mon, 18 Nov 2024 16:36:33 + Subje

[llvm-branch-commits] [llvm] AMDGPU: Increase the LDS size to support to 160 KB for gfx950 (PR #116309)

2024-11-15 Thread Jay Foad via llvm-branch-commits
@@ -1494,7 +1494,8 @@ def FeatureISAVersion9_5_Common : FeatureSet< [FeatureFP8Insts, FeatureFP8ConversionInsts, FeatureCvtFP8VOP1Bug, - FeatureGFX950Insts + FeatureGFX950Insts, + FeatureAddressableLocalMemorySize163840 jayfoad wrote: This means

[llvm-branch-commits] [llvm] AMDGPU: Mark sendmsg intrinsics as norecurse (PR #125016)

2025-01-30 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > We cannot mark these as nocallback or nosync. I think we can and should mark these as `nocallback`. I don't know how well it is documented, but I think in practice `nocallback` means that the intrinsic does not call back into user code **synchronously, in the current thread,

[llvm-branch-commits] [llvm] PeepholeOpt: Remove check for subreg index on a def operand (PR #123943)

2025-01-22 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/123943 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] PeepholeOpt: Avoid double map lookup (PR #124531)

2025-01-27 Thread Jay Foad via llvm-branch-commits
@@ -1035,8 +1035,10 @@ bool PeepholeOptimizer::findNextSource(RegSubRegPair RegSubReg, return false; // Insert the Def -> Use entry for the recently found source. - ValueTrackerResult CurSrcRes = RewriteMap.lookup(CurSrcPair); - if (CurSrcRes.isValid()
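
The PR title refers to a common idiom: replacing a lookup followed by a separate insertion with one operation that does both. The sketch below illustrates that idiom with `std::unordered_map` rather than LLVM's own map types. It is not the code from the PR, and `RewriteMapSketch` and both helper names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a rewrite map; not the type used in the pass.
using RewriteMapSketch = std::unordered_map<int, std::string>;

// Two hash lookups: one in find(), a second one in operator[].
bool insertIfAbsentTwoLookups(RewriteMapSketch &M, int Key,
                              const std::string &Val) {
  if (M.find(Key) != M.end())
    return false; // Key already present, nothing inserted.
  M[Key] = Val;
  return true;
}

// One hash lookup: try_emplace inserts only when the key is absent and
// reports whether it did so in the second member of the returned pair.
bool insertIfAbsentOneLookup(RewriteMapSketch &M, int Key,
                             const std::string &Val) {
  return M.try_emplace(Key, Val).second;
}

// Small self-check showing that both helpers behave the same way.
void checkInsertHelpers() {
  RewriteMapSketch A, B;
  assert(insertIfAbsentTwoLookups(A, 7, "first"));
  assert(insertIfAbsentOneLookup(B, 7, "first"));
  assert(!insertIfAbsentTwoLookups(A, 7, "second"));
  assert(!insertIfAbsentOneLookup(B, 7, "second"));
  assert(A == B);
}
```

Both helpers leave the map in the same state; the second simply avoids hashing and probing for the key twice.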

[llvm-branch-commits] [llvm] PeepholeOpt: Avoid double map lookup (PR #124531)

2025-01-27 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/124531 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] TableGen: Add intrinsic property for norecurse (PR #125015)

2025-01-30 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: This needs motivation. What does `norecurse` mean for an intrinsic and how does it differ from `nocallback`? What sort of intrinsic would be `norecurse` but not `nocallback`? https://github.com/llvm/llvm-project/pull/125015 ___ llvm-br

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/122460 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/122460 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits
@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT, return Index == 0; } +bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const { + // TODO: This should be more aggressive, particular for 16-bit element + // vect

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Jay Foad via llvm-branch-commits
@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT, return Index == 0; } +bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const { + // TODO: This should be more aggressive, particular for 16-bit element + // vect

[llvm-branch-commits] [llvm] AMDGPU: Reduce 64-bit add width if low bits are known 0 (PR #122049)

2025-01-08 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: Why doesn't this fall out naturally from splitting the 64-bit add into 32-bit parts and then simplifying each part? Do we leave it as a 64-bit add all the way until final instruction selection? https://github.com/llvm/llvm-project/pull/122049
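
The arithmetic behind the question can be sketched in plain C++. This is an illustration only, with hypothetical function names, and it says nothing about where in the AMDGPU pipeline the split actually happens. If the low 32 bits of one operand are known to be zero, the low half of the sum is just the other operand's low half and no carry reaches the high half.

```cpp
#include <cassert>
#include <cstdint>

// Add two 64-bit values by halves, the way a 32-bit datapath would.
uint64_t addBySplitting(uint64_t A, uint64_t B) {
  uint32_t ALo = static_cast<uint32_t>(A);
  uint32_t AHi = static_cast<uint32_t>(A >> 32);
  uint32_t BLo = static_cast<uint32_t>(B);
  uint32_t BHi = static_cast<uint32_t>(B >> 32);
  uint32_t Lo = ALo + BLo;             // wraps modulo 2^32
  uint32_t Carry = Lo < ALo ? 1u : 0u; // carry out of the low half
  uint32_t Hi = AHi + BHi + Carry;
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}

// When the low half of B is zero, Lo == ALo and Carry == 0, so a single
// 32-bit add on the high halves is enough.
uint64_t addWithLowHalfZero(uint64_t A, uint64_t B) {
  assert(static_cast<uint32_t>(B) == 0 && "low half of B must be zero");
  uint32_t Hi = static_cast<uint32_t>(A >> 32) + static_cast<uint32_t>(B >> 32);
  return (static_cast<uint64_t>(Hi) << 32) | static_cast<uint32_t>(A);
}
```

For any `B` whose low 32 bits are zero, both functions agree with the ordinary 64-bit `A + B`, including when the high half wraps.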

[llvm-branch-commits] [clang] [llvm] [IR] Add FPOperation intrinsic property (PR #122313)

2025-01-09 Thread Jay Foad via llvm-branch-commits
@@ -308,6 +308,9 @@ def StackProtectStrong : EnumAttr<"sspstrong", IntersectPreserve, [FnAttr]>; /// Function was called in a scope requiring strict floating point semantics. def StrictFP : EnumAttr<"strictfp", IntersectPreserve, [FnAttr]>; +/// Function is a floating point o

[llvm-branch-commits] [llvm] [AMDGPU][NewPM] Port SILowerControlFlow pass into NPM. (PR #123045)

2025-01-15 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > I observed something while porting this pass. The analysis LiveIntervals > (LIS) uses the SlotIndexes (SI). There is no explicit use of SI in this pass. > If we have to preserve LIS, it required us to preserve SI as well. When I > initially failed to preserve SI, the following

[llvm-branch-commits] [llvm] [AMDGPU][GlobalISel] Combine (sext (trunc (sext_in_reg x))) (PR #131312)

2025-03-14 Thread Jay Foad via llvm-branch-commits
@@ -258,6 +258,14 @@ def sext_trunc_sextload : GICombineRule< [{ return Helper.matchSextTruncSextLoad(*${d}); }]), (apply [{ Helper.applySextTruncSextLoad(*${d}); }])>; +def sext_trunc_sextinreg : GICombineRule< + (defs root:$dst), + (match (G_SEXT_INREG $sir, $sr

[llvm-branch-commits] [llvm] [AMDGPU] Support image_bvh8_intersect_ray instruction and intrinsic. (PR #130041)

2025-03-18 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/130041 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] LICM: Avoid looking at use list of constant data (PR #134690)

2025-04-07 Thread Jay Foad via llvm-branch-commits
@@ -2294,10 +2294,14 @@ collectPromotionCandidates(MemorySSA *MSSA, AliasAnalysis *AA, Loop *L) { AliasSetTracker AST(BatchAA); auto IsPotentiallyPromotable = [L](const Instruction *I) { -if (const auto *SI = dyn_cast<StoreInst>(I)) - return L->isLoopInvariant(SI->getPointe

[llvm-branch-commits] [llvm] AMDGPU: Replace unused permlane inputs with poison instead of undef (PR #131288)

2025-03-14 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: Same kind of objection as #131287: as a general strategy, "replace unused inputs with poison" seems incompatible with "propagate poison from arguments to result". @nunoplopes any thoughts on this? https://github.com/llvm/llvm-project/pull/131288 _

[llvm-branch-commits] [llvm] AMDGPU: Replace unused export inputs with poison instead of undef (PR #131286)

2025-03-14 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/131286 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Replace unused update.dpp inputs with poison instead of undef (PR #131287)

2025-03-14 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: I have a conceptual objection: I don't think we can do both of these things: 1. Replace unused inputs of all intrinsics with poison 2. Propagate poison from any argument, for all intrinsics So how should we handle this in general? Is it better to replace unused inputs with "free

[llvm-branch-commits] [llvm] AMDGPU: Replace unused update.dpp inputs with poison instead of undef (PR #131287)

2025-03-14 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > We do it when the semantics allow it. My concern is that it is not obvious when the semantics allow it, when you have a plethora of undocumented target intrinsics. I guess the grown-up solution is to document them properly. https://github.com/llvm/llvm-project/pull/131287 ___

[llvm-branch-commits] [llvm] AMDGPU: Replace unused permlane inputs with poison instead of undef (PR #131288)

2025-03-14 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > > Same kind of objection as #131287: as a general strategy, "replace unused > > inputs with poison" > > Repeating from the other review, but this is not the case. Poison does not > unconditionally fold through intrinsics. This is specific to an operand for > an intrinsic. It

[llvm-branch-commits] [llvm] [AMDGPU][GlobalISel] Combine (sext (trunc (sext_in_reg x))) (PR #131312)

2025-03-14 Thread Jay Foad via llvm-branch-commits
@@ -258,6 +258,14 @@ def sext_trunc_sextload : GICombineRule< [{ return Helper.matchSextTruncSextLoad(*${d}); }]), (apply [{ Helper.applySextTruncSextLoad(*${d}); }])>; +def sext_trunc_sextinreg : GICombineRule< + (defs root:$dst), + (match (G_SEXT_INREG $sir, $sr

[llvm-branch-commits] [llvm] [AMDGPU][GlobalISel] Combine (sext (trunc (sext_in_reg x))) (PR #131312)

2025-03-14 Thread Jay Foad via llvm-branch-commits
@@ -258,6 +258,14 @@ def sext_trunc_sextload : GICombineRule< [{ return Helper.matchSextTruncSextLoad(*${d}); }]), (apply [{ Helper.applySextTruncSextLoad(*${d}); }])>; +def sext_trunc_sextinreg : GICombineRule< + (defs root:$dst), + (match (G_SEXT_INREG $sir, $sr

[llvm-branch-commits] [llvm] AMDGPU: Replace unused permlane inputs with poison instead of undef (PR #131288)

2025-03-14 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > > Then I think the poison-propagating rules for this intrinsic should be > > documented. They're not obvious and "it just does what the underlying > > instruction does" is no longer sufficient. > > We're currently not propagating poison for these intrinsics, and this patch >

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. Patch looks OK to me, unless you are still worried about the global_inv loadcnt decrement ordering thing. Removing unnecessary waits at a function call boundary can be done as a separate optimization. https://github.com/llvm/llvm-project/

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/135340 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits
@@ -19,7 +19,7 @@ body: | ; GFX12-NEXT: {{ $}} ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable $sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1) ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec -; GFX12-NEXT: S_WAIT_L

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits
@@ -2130,13 +2140,14 @@ void SIInsertWaitcnts::updateEventWaitcntAfter(MachineInstr &Inst, ScoreBrackets->updateByEvent(TII, TRI, MRI, LDS_ACCESS, Inst); } } else if (TII->isFLAT(Inst)) { -// TODO: Track this properly. -if (isCacheInvOrWBInst(Inst)) +if

[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

2025-04-22 Thread Jay Foad via llvm-branch-commits
@@ -19,7 +19,7 @@ body: | ; GFX12-NEXT: {{ $}} ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable $sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1) ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec -; GFX12-NEXT: S_WAIT_L

[llvm-branch-commits] [llvm] [GlobalISel] Add computeNumSignBits for ASHR (PR #139503)

2025-05-12 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad requested changes to this pull request. https://github.com/llvm/llvm-project/pull/139503 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-comm

[llvm-branch-commits] [llvm] [GlobalISel] Add computeNumSignBits for ASHR (PR #139503)

2025-05-12 Thread Jay Foad via llvm-branch-commits
@@ -864,6 +864,16 @@ unsigned GISelValueTracking::computeNumSignBits(Register R, return TyBits - 1; // Every always-zero bit is a sign bit. break; } + case TargetOpcode::G_ASHR: { +Register Src1 = MI.getOperand(1).getReg(); +Register Src2 = MI.getOperand(2)
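
As background for this hunk, the usual rule for an arithmetic shift right by a known amount is that the result has the source's sign-bit count plus the shift amount, clamped to the bit width. The sketch below checks that rule on concrete 32-bit values. It is a standalone illustration with hypothetical helper names, not the code under review, and it does not model unknown or out-of-range shift amounts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Count how many leading bits of V equal its sign bit (always at least 1).
unsigned numSignBits(int32_t V) {
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Sign = U >> 31;
  unsigned N = 1;
  while (N < 32 && ((U >> (31 - N)) & 1u) == Sign)
    ++N;
  return N;
}

// Arithmetic shift right written with unsigned operations, so the result
// does not depend on how the compiler shifts negative signed values.
int32_t ashr(int32_t V, unsigned Amt) {
  assert(Amt < 32 && "shift amount must be in range");
  uint32_t U = static_cast<uint32_t>(V);
  uint32_t Shifted = U >> Amt;
  if ((U >> 31) != 0 && Amt != 0)
    Shifted |= ~0u << (32 - Amt); // replicate the sign bit into the top
  return static_cast<int32_t>(Shifted);
}

// Sign-bit count after an arithmetic shift right by a known amount.
unsigned signBitsAfterAshr(unsigned SrcSignBits, unsigned Amt) {
  return std::min(SrcSignBits + Amt, 32u);
}
```

For a concrete value the rule is exact: `numSignBits(ashr(V, Amt))` equals `signBitsAfterAshr(numSignBits(V), Amt)`. In a value-tracking analysis the source count is only a lower bound, so the result is a lower bound as well.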

[llvm-branch-commits] [llvm] [GlobalISel] Add computeNumSignBits for G_SHUFFLE_VECTOR (PR #139505)

2025-05-12 Thread Jay Foad via llvm-branch-commits
@@ -874,6 +874,30 @@ unsigned GISelValueTracking::computeNumSignBits(Register R, SrcTy.getScalarSizeInBits()); break; } + case TargetOpcode::G_SHUFFLE_VECTOR: { +// Collect the minimum number of sign bits that are shared by ever

[llvm-branch-commits] [llvm] [GlobalISel] Add computeNumSignBits for G_SHUFFLE_VECTOR (PR #139505)

2025-05-12 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/139505 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [GlobalISel] Add computeNumSignBits for G_SHUFFLE_VECTOR (PR #139505)

2025-05-12 Thread Jay Foad via llvm-branch-commits
@@ -874,6 +874,30 @@ unsigned GISelValueTracking::computeNumSignBits(Register R, SrcTy.getScalarSizeInBits()); break; } + case TargetOpcode::G_SHUFFLE_VECTOR: { +// Collect the minimum number of sign bits that are shared by ever

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
@@ -25,52 +25,151 @@ using namespace llvm; namespace { -struct FoldCandidate { - MachineInstr *UseMI; +/// Track a value we may want to fold into downstream users, applying +/// subregister extracts along the way. +struct FoldableDef { union { -MachineOperand *OpToFol

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad commented: The idea seems good. I haven't reviewed it all in detail. https://github.com/llvm/llvm-project/pull/140608 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mail

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
@@ -25,52 +25,151 @@ using namespace llvm; namespace { -struct FoldCandidate { - MachineInstr *UseMI; +/// Track a value we may want to fold into downstream users, applying +/// subregister extracts along the way. +struct FoldableDef { union { -MachineOperand *OpToFol

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/140608 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
@@ -25,52 +25,151 @@ using namespace llvm; namespace { -struct FoldCandidate { - MachineInstr *UseMI; +/// Track a value we may want to fold into downstream users, applying +/// subregister extracts along the way. +struct FoldableDef { union { -MachineOperand *OpToFol

[llvm-branch-commits] [llvm] AMDGPU: Fix tracking subreg defs when folding through reg_sequence (PR #140608)

2025-05-28 Thread Jay Foad via llvm-branch-commits
@@ -380,7 +477,8 @@ bool SIFoldOperandsImpl::canUseImmWithOpSel(FoldCandidate &Fold) const { return true; } -bool SIFoldOperandsImpl::tryFoldImmWithOpSel(FoldCandidate &Fold) const { +bool SIFoldOperandsImpl::tryFoldImmWithOpSel(FoldCandidate &Fold, +

[llvm-branch-commits] [llvm] [AMDGPU] Use reverse iteration in CodeGenPrepare (PR #145484)

2025-06-24 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: I don't understand the high level motivation here. "Normal" combining/simplification order is to visit the operands of an instruction before you visit the instruction itself. That way the "visit" function can assume that the operands have already been simplified. GlobalISel comb

[llvm-branch-commits] [llvm] [AMDGPU] Use reverse iteration in CodeGenPrepare (PR #145484)

2025-06-25 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: > This is not a simplifying pass, it is making the IR more complicated. We have > to do hacks like this to prevent later more profitable combines from needing > to parse out expanded IR: Fair enough, makes sense. I just want to make sure the justification is properly understood

[llvm-branch-commits] [llvm] [GISel] Combine compare of bitfield extracts or'd together. (PR #146055)

2025-06-30 Thread Jay Foad via llvm-branch-commits
@@ -140,3 +140,92 @@ bool CombinerHelper::matchCanonicalizeFCmp(const MachineInstr &MI, return false; } + +bool CombinerHelper::combineMergedBFXCompare(MachineInstr &MI) const { + const GICmp *Cmp = cast<GICmp>(&MI); + + ICmpInst::Predicate CC = Cmp->getCond(); + if (CC != CmpI

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-30 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-30 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,99 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-30 Thread Jay Foad via llvm-branch-commits
jayfoad wrote: Does this also handle the case where _all_ of the values ORed together are shifted, like `(setcc ((x >> c0 | x >> c1 | ...) & mask))` ? https://github.com/llvm/llvm-project/pull/146054 ___ llvm-branch-commits mailing list llvm-branch-co
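
One algebraic identity that a fold of this shape could rely on is sketched below, covering both the mixed case and the all-shifted case the question asks about. This is an assumption about the intent based on the PR title alone: it is not the PR's code, the function names are hypothetical, and it only covers logical right shifts by amounts below 32 with a comparison against zero.

```cpp
#include <cassert>
#include <cstdint>

// Original shape: OR x with a shifted copy, apply the mask, compare to zero.
bool testUnmerged(uint32_t X, unsigned C0, uint32_t Mask) {
  return ((X | (X >> C0)) & Mask) != 0;
}

// Merged shape: move the shift onto the mask and test x once.
bool testMerged(uint32_t X, unsigned C0, uint32_t Mask) {
  return (X & (Mask | (Mask << C0))) != 0;
}

// Variant in which every ORed value is shifted.
bool testUnmergedAllShifted(uint32_t X, unsigned C0, unsigned C1,
                            uint32_t Mask) {
  return (((X >> C0) | (X >> C1)) & Mask) != 0;
}

bool testMergedAllShifted(uint32_t X, unsigned C0, unsigned C1,
                          uint32_t Mask) {
  return (X & ((Mask << C0) | (Mask << C1))) != 0;
}

// Spot check of the identity on one value.
void checkMergedIdentity() {
  assert(testUnmerged(0x100u, 8, 0x1u) == testMerged(0x100u, 8, 0x1u));
  assert(testUnmergedAllShifted(0x100u, 8, 4, 0x1u) ==
         testMergedAllShifted(0x100u, 8, 4, 0x1u));
}
```

The identity holds because bit `i` of `X >> C` is bit `i + C` of `X`, so testing `(X >> C) & Mask` against zero is the same as testing `X & (Mask << C)` against zero. Mask bits shifted out of the top correspond to bits of `X >> C` that are always zero.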

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
@@ -392,6 +394,55 @@ void AMDGPURegBankCombinerImpl::applyCanonicalizeZextShiftAmt( MI.eraseFromParent(); } +bool AMDGPURegBankCombinerImpl::lowerUniformBFX(MachineInstr &MI) const { + assert(MI.getOpcode() == TargetOpcode::G_UBFX || + MI.getOpcode() == TargetOpcod

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
@@ -392,6 +394,55 @@ void AMDGPURegBankCombinerImpl::applyCanonicalizeZextShiftAmt( MI.eraseFromParent(); } +bool AMDGPURegBankCombinerImpl::lowerUniformBFX(MachineInstr &MI) const { + assert(MI.getOpcode() == TargetOpcode::G_UBFX || + MI.getOpcode() == TargetOpcod

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/141589 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad commented: No test changes? Is it possible to test any of this? We have `regbankcombiner-*` tests for some things. https://github.com/llvm/llvm-project/pull/141589 ___ llvm-branch-commits mailing list llvm-branch-commits@lis

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
@@ -392,6 +394,55 @@ void AMDGPURegBankCombinerImpl::applyCanonicalizeZextShiftAmt( MI.eraseFromParent(); } +bool AMDGPURegBankCombinerImpl::lowerUniformBFX(MachineInstr &MI) const { + assert(MI.getOpcode() == TargetOpcode::G_UBFX || + MI.getOpcode() == TargetOpcod

[llvm-branch-commits] [llvm] [AMDGPU] Move S_BFE lowering into RegBankCombiner (PR #141589)

2025-07-01 Thread Jay Foad via llvm-branch-commits
@@ -392,6 +394,55 @@ void AMDGPURegBankCombinerImpl::applyCanonicalizeZextShiftAmt( MI.eraseFromParent(); } +bool AMDGPURegBankCombinerImpl::lowerUniformBFX(MachineInstr &MI) const { + assert(MI.getOpcode() == TargetOpcode::G_UBFX || + MI.getOpcode() == TargetOpcod

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-07-01 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. https://github.com/llvm/llvm-project/pull/146054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (PR #146054)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -28909,13 +28909,97 @@ SDValue DAGCombiner::SimplifySelectCC(const SDLoc &DL, SDValue N0, SDValue N1, return SDValue(); } +static SDValue matchMergedBFX(SDValue Root, SelectionDAG &DAG, + const TargetLowering &TLI) { + // Match a pattern suc

[llvm-branch-commits] [llvm] [AMDGPU] Add tests for workgroup/workitem intrinsic optimizations (PR #146053)

2025-06-27 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad approved this pull request. LGTM with nits https://github.com/llvm/llvm-project/pull/146053 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-c

[llvm-branch-commits] [llvm] [AMDGPU] Add tests for workgroup/workitem intrinsic optimizations (PR #146053)

2025-06-27 Thread Jay Foad via llvm-branch-commits
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/146053 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU] Add tests for workgroup/workitem intrinsic optimizations (PR #146053)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -0,0 +1,553 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 jayfoad wrote: Nit: I think "workitem-intrinsic-opts" sounds better https://github.com/llvm/llvm-project/pull/146053 _

[llvm-branch-commits] [llvm] [AMDGPU] Add tests for workgroup/workitem intrinsic optimizations (PR #146053)

2025-06-27 Thread Jay Foad via llvm-branch-commits
@@ -0,0 +1,553 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5 +; RUN: llc -O3 -mtriple=amdgcn -mcpu=fiji %s -o - | FileCheck %s --check-prefixes=GFX8,DAGISEL-GFX9 jayfoad wrote: ```suggestion ; RUN: llc -
