from:"Stanislav Mekhanoshin via llvm-branch-commits"

[llvm-branch-commits] [llvm] AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (PR #95394)

2024-06-14 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -1669,13 +1670,16 @@ defm : FlatSignedAtomicPatWithAddrSpace <"FLAT_ATOMIC_ADD_F32", "int_amdgcn_flat } let OtherPredicates = [HasAtomicFlatPkAdd16Insts] in { +// FIXME: These do not have signed offsets rampitec wrote: Can you just use FlatAtomicPat? htt

[llvm-branch-commits] [llvm] AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (PR #95394)

2024-06-14 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -15931,6 +15931,26 @@ static OptimizationRemark emitAtomicRMWLegalRemark(const AtomicRMWInst *RMW) { << " operation at memory scope " << MemScope; } +static bool isHalf2OrBFloat2(Type *Ty) { rampitec wrote: Does the underlying type really matter?

[llvm-branch-commits] [clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)

2024-06-14 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM contingent on the plan to produce atomicrmw. https://github.com/llvm/llvm-project/pull/95396 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bi

[llvm-branch-commits] [llvm] AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (PR #95394)

2024-06-14 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/95394 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Handle legal v2bf16 atomicrmw fadd for gfx12 (PR #95930)

2024-06-18 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -1735,8 +1737,11 @@ defm : SIBufferAtomicPat<"SIbuffer_atomic_dec", i64, "BUFFER_ATOMIC_DEC_X2">; let OtherPredicates = [HasAtomicCSubNoRtnInsts] in defm : SIBufferAtomicPat<"SIbuffer_atomic_csub", i32, "BUFFER_ATOMIC_CSUB", ["noret"]>; -let SubtargetPredicate = isGFX12Pl

[llvm-branch-commits] [llvm] AMDGPU: Handle legal v2bf16 atomicrmw fadd for gfx12 (PR #95930)

2024-06-18 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -743,6 +743,12 @@ def FeatureAtomicGlobalPkAddBF16Inst : SubtargetFeature<"atomic-global-pk-add-bf [FeatureFlatGlobalInsts] >; +def FeatureAtomicBufferPkAddBF16Inst : SubtargetFeature<"atomic-buffer-pk-add-bf16-inst", rampitec wrote: I believe it is abo

[llvm-branch-commits] [llvm] [AMDGPU] Codegen support for constrained multi-dword sloads (PR #96163)

2024-06-20 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -886,26 +977,17 @@ multiclass SMRD_Pattern { def : GCNPat < (smrd_load (SMRDSgpr i64:$sbase, i32:$soffset)), (vt (!cast(Instr#"_SGPR") $sbase, $soffset, 0))> { -let OtherPredicates = [isNotGFX9Plus]; - } - def : GCNPat < -(smrd_load (SMRDSgpr i64:$sbase,

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-20 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const CombineInfo &CI, return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM; } case S_LOAD_IMM: -switch (Width) { -default: - return 0; -case 2: - return AMDGPU::S_LOAD_DWORDX2_IMM;

[llvm-branch-commits] [llvm] AMDGPU: Add a subtarget feature for fine-grained remote memory support (PR #96442)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > We do statically know for some of the targets (mostly gfx12 and gfx940) that > it's supposed to work. This is the "scope downgrade" vs. "nop" cases in the > atomic support table Actually not, we do not know the bus. Moreover, we know this is opposite. https://github.com/llvm

[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -1701,17 +1732,33 @@ unsigned SILoadStoreOptimizer::getNewOpcode(const CombineInfo &CI, return AMDGPU::S_BUFFER_LOAD_DWORDX8_SGPR_IMM; } case S_LOAD_IMM: -switch (Width) { -default: - return 0; -case 2: - return AMDGPU::S_LOAD_DWORDX2_IMM;

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for memory atomic fadd f64 (PR #96444)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: Use it in a predicate when defining pseudos? https://github.com/llvm/llvm-project/pull/96444

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: It is worse than that. It behaves differently depending on where atomic is executed. There is no single answer if this instruction supports denorms or not. https://github.com/llvm/llvm-project/pull/96443

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for global atomic fadd denormal support (PR #96443)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/96443

[llvm-branch-commits] [llvm] AMDGPU: Add subtarget feature for memory atomic fadd f64 (PR #96444)

2024-06-24 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/96444

[llvm-branch-commits] [llvm] AMDGPU: Remove ds_fmin/ds_fmax intrinsics (PR #96739)

2024-06-26 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/96739

[llvm-branch-commits] [llvm] AMDGPU: Enable vectorization of v2f16 copysign (PR #100799)

2024-07-29 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/100799

[llvm-branch-commits] [llvm] AMDGPU: Correct costs of saturating add/sub intrinsics (PR #100808)

2024-07-29 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/100808

[llvm-branch-commits] [llvm] AMDGPU: Support VALU add instructions in localstackalloc (PR #101692)

2024-08-02 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -809,7 +826,59 @@ int64_t SIRegisterInfo::getFrameIndexInstrOffset(const MachineInstr *MI, return getScratchInstrOffset(MI); } +static bool isFIPlusImmOrVGPR(const SIRegisterInfo &TRI, + const MachineInstr &MI) { + const MachineOperand &Src0

[llvm-branch-commits] [llvm] AMDGPU: Support VALU add instructions in localstackalloc (PR #101692)

2024-08-02 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -797,6 +797,23 @@ int64_t SIRegisterInfo::getScratchInstrOffset(const MachineInstr *MI) const { int64_t SIRegisterInfo::getFrameIndexInstrOffset(const MachineInstr *MI, int Idx) const { + switch (MI->getOpcode()) {

[llvm-branch-commits] [llvm] AMDGPU: Support VALU add instructions in localstackalloc (PR #101692)

2024-08-02 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -877,6 +948,86 @@ Register SIRegisterInfo::materializeFrameBaseRegister(MachineBasicBlock *MBB, void SIRegisterInfo::resolveFrameIndex(MachineInstr &MI, Register BaseReg, int64_t Offset) const { const SIInstrInfo *TII = ST.getInstrIn

[llvm-branch-commits] [llvm] AMDGPU: Support VALU add instructions in localstackalloc (PR #101692)

2024-08-02 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -797,6 +797,23 @@ int64_t SIRegisterInfo::getScratchInstrOffset(const MachineInstr *MI) const { int64_t SIRegisterInfo::getFrameIndexInstrOffset(const MachineInstr *MI, int Idx) const { + switch (MI->getOpcode()) {

[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec commented: Add some tests where argument is not a pointer? https://github.com/llvm/llvm-project/pull/102010

[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/102007

[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM modulo braces comment. https://github.com/llvm/llvm-project/pull/102010

[llvm-branch-commits] [llvm] AMDGPU: Fold frame indexes into s_or_b32 and s_and_b32 (PR #102345)

2024-08-07 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -190,31 +186,31 @@ body: | ; MUBUFW64-LABEL: name: s_and_b32__sgpr__fi_literal_offset ; MUBUFW64: liveins: $sgpr8 ; MUBUFW64-NEXT: {{ $}} -; MUBUFW64-NEXT: $sgpr4 = S_LSHR_B32 $sgpr32, 6, implicit-def $scc -; MUBUFW64-NEXT: $sgpr4 = S_ADD_I32

[llvm-branch-commits] [llvm] AMDGPU: Preserve atomicrmw name when specializing address space (PR #102470)

2024-08-08 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102470

[llvm-branch-commits] [llvm] AMDGPU: Add noalias.addrspace metadata when autoupgrading atomic intrinsics (PR #102599)

2024-08-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102599

[llvm-branch-commits] [llvm] NewPM/AMDGPU: Port AMDGPUPerfHintAnalysis to new pass manager (PR #102645)

2024-08-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -22,6 +22,7 @@ MODULE_PASS("amdgpu-lower-buffer-fat-pointers", AMDGPULowerBufferFatPointersPass(*this)) MODULE_PASS("amdgpu-lower-ctor-dtor", AMDGPUCtorDtorLoweringPass()) MODULE_PASS("amdgpu-lower-module-lds", AMDGPULowerModuleLDSPass(*this)) +MODULE_PASS("amdgp

[llvm-branch-commits] [llvm] NewPM/AMDGPU: Port AMDGPUPerfHintAnalysis to new pass manager (PR #102645)

2024-08-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -413,18 +439,57 @@ bool AMDGPUPerfHintAnalysis::runOnSCC(CallGraphSCC &SCC) { return Changed; } -bool AMDGPUPerfHintAnalysis::isMemoryBound(const Function *F) const { - auto FI = FIM.find(F); - if (FI == FIM.end()) -return false; +bool AMDGPUPerfHintAnalysis::run(co

[llvm-branch-commits] [llvm] AMDGPU/NewPM: Port AMDGPUAnnotateUniformValues to new pass manager (PR #102654)

2024-08-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102654

[llvm-branch-commits] [llvm] AMDGPU/NewPM: Port SILowerI1Copies to new pass manager (PR #102663)

2024-08-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102663

[llvm-branch-commits] [llvm] CodeGen/NewPM: Add ExpandLarge* passes to isel IR passes (PR #102815)

2024-08-12 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102815

[llvm-branch-commits] [llvm] AMDGPU/NewPM: Start implementing addCodeGenPrepare (PR #102816)

2024-08-12 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102816

[llvm-branch-commits] [llvm] AMDGPU: Declare pass control flags in header (PR #102865)

2024-08-12 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > I don't really like needing to expose these globally like this; maybe it > would be better to just move TargetPassConfig and the CodeGenPassBuilder into > one common file? Yep, I also do not like extern cl::opt. https://github.com/llvm/llvm-project/pull/102865

[llvm-branch-commits] [llvm] AMDGPU/NewPM: Fill out passes in addCodeGenPrepare (PR #102867)

2024-08-12 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102867

[llvm-branch-commits] [llvm] AMDGPU/NewPM: Start filling out addIRPasses (PR #102884)

2024-08-12 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/102884

[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-03 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/106977

[llvm-branch-commits] [llvm] 607bec0 - Change materializeFrameBaseRegister() to return register

2021-01-22 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2021-01-22T15:51:06-08:00 New Revision: 607bec0bb9f787acca95f53dabe6a5c227f6b6b2 URL: https://github.com/llvm/llvm-project/commit/607bec0bb9f787acca95f53dabe6a5c227f6b6b2 DIFF: https://github.com/llvm/llvm-project/commit/607bec0bb9f787acca95f53dabe6a5c227f6b6

[llvm-branch-commits] [llvm] ca904b8 - [AMDGPU] Fix FP materialization/resolve with flat scratch

2021-01-22 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2021-01-22T16:06:47-08:00 New Revision: ca904b81e6488b45cbfe846dc86f1406b8e9c03d URL: https://github.com/llvm/llvm-project/commit/ca904b81e6488b45cbfe846dc86f1406b8e9c03d DIFF: https://github.com/llvm/llvm-project/commit/ca904b81e6488b45cbfe846dc86f1406b8e9c0

[llvm-branch-commits] [llvm] eb66bf0 - [AMDGPU] Print SCRATCH_EN field after the kernel

2020-12-15 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-15T22:44:30-08:00 New Revision: eb66bf0802f96458b24a9c6eb9bd6451d8f90110 URL: https://github.com/llvm/llvm-project/commit/eb66bf0802f96458b24a9c6eb9bd6451d8f90110 DIFF: https://github.com/llvm/llvm-project/commit/eb66bf0802f96458b24a9c6eb9bd6451d8f901

[llvm-branch-commits] [llvm] ae8f4b2 - [AMDGPU] Folding of FI operand with flat scratch

2020-12-22 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-22T10:48:04-08:00 New Revision: ae8f4b2178c46da1f10eb9279c9b44fab8b85417 URL: https://github.com/llvm/llvm-project/commit/ae8f4b2178c46da1f10eb9279c9b44fab8b85417 DIFF: https://github.com/llvm/llvm-project/commit/ae8f4b2178c46da1f10eb9279c9b44fab8b854

[llvm-branch-commits] [llvm] ca4bf58 - [AMDGPU] Support unaligned flat scratch in TLI

2020-12-22 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-22T16:12:31-08:00 New Revision: ca4bf58e4ee5951473a861716193063c5ef83e9a URL: https://github.com/llvm/llvm-project/commit/ca4bf58e4ee5951473a861716193063c5ef83e9a DIFF: https://github.com/llvm/llvm-project/commit/ca4bf58e4ee5951473a861716193063c5ef83e

[llvm-branch-commits] [llvm] d15119a - [AMDGPU][GlobalISel] GlobalISel for flat scratch

2020-12-22 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-22T16:33:06-08:00 New Revision: d15119a02d92274cd7f779f4bb8485b1020110e0 URL: https://github.com/llvm/llvm-project/commit/d15119a02d92274cd7f779f4bb8485b1020110e0 DIFF: https://github.com/llvm/llvm-project/commit/d15119a02d92274cd7f779f4bb8485b1020110

[llvm-branch-commits] [llvm] 747f67e - [AMDGPU] Fix adjustWritemask subreg handling

2020-12-23 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-23T14:43:31-08:00 New Revision: 747f67e034a924cf308f4c0f1bb6b1fa46bd9fbe URL: https://github.com/llvm/llvm-project/commit/747f67e034a924cf308f4c0f1bb6b1fa46bd9fbe DIFF: https://github.com/llvm/llvm-project/commit/747f67e034a924cf308f4c0f1bb6b1fa46bd9f

[llvm-branch-commits] [llvm] dd89249 - [AMDGPU] Annotate vgpr<->agpr spills in asm

2020-12-07 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-07T11:25:25-08:00 New Revision: dd892494983a2e64d1e1eb3d05ce9577357336d2 URL: https://github.com/llvm/llvm-project/commit/dd892494983a2e64d1e1eb3d05ce9577357336d2 DIFF: https://github.com/llvm/llvm-project/commit/dd892494983a2e64d1e1eb3d05ce9577357336

[llvm-branch-commits] [llvm] 87d7757 - [SLP] Control maximum vectorization factor from TTI

2020-12-14 Thread Stanislav Mekhanoshin via llvm-branch-commits

Author: Stanislav Mekhanoshin Date: 2020-12-14T08:49:40-08:00 New Revision: 87d7757bbe14fed420092071ded3430072053316 URL: https://github.com/llvm/llvm-project/commit/87d7757bbe14fed420092071ded3430072053316 DIFF: https://github.com/llvm/llvm-project/commit/87d7757bbe14fed420092071ded34300720533

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Handle atomic sextload and zextload (PR #111721)

2024-10-09 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > > Missing test for buffer loads? > > Those are the gfx7 global cases. There aren't any atomic buffer load > intrinsics But patch adds several MUBUF_Pseudo_Load_Pats which are not covered by tests? https://github.com/llvm/llvm-project/pull/111721

[llvm-branch-commits] [llvm] AMDGPU: Fold more scalar operations on frame index to VALU (PR #115059)

2024-11-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/115059

[llvm-branch-commits] [llvm] AMDGPU: Default to selecting frame indexes to SGPRs (PR #115060)

2024-11-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/115060

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2024-11-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/115090

[llvm-branch-commits] [llvm] AMDGPU: Expand flat atomics that may access private memory (PR #109407)

2024-09-23 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > > Is it legal and defined behavior to target private memory with an atomic? > > In the IR it would have to be, and this is the expected behavior in OpenMP > and C++. It's UB in OpenCL, and UB in CUDA/HIP for old style atomics, but > defined for new std::atomic style cases Is

[llvm-branch-commits] [llvm] AMDGPU: Expand flat atomics that may access private memory (PR #109407)

2024-09-23 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. Thanks. Can this be landed after https://github.com/llvm/llvm-project/pull/102462? https://github.com/llvm/llvm-project/pull/109407

[llvm-branch-commits] [llvm] AMDGPU: Expand flat atomics that may access private memory (PR #109407)

2024-09-20 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: Is it legal and defined behavior to target private memory with an atomic? https://github.com/llvm/llvm-project/pull/109407

[llvm-branch-commits] [llvm] AMDGPU: Add baseline tests for cmpxchg custom expansion (PR #109408)

2024-09-20 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/109408

[llvm-branch-commits] [llvm] AMDGPU: Add baseline tests for flat-may-alias private atomic expansions (PR #109406)

2024-09-20 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -0,0 +1,6911 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py +; RUN: llc -mtriple=amdgcn -mcpu=bonaire < %s | FileCheck -check-prefix=GCN1 %s rampitec wrote: Why GCN1 and GCN2? GFX7 and GFX8 are easier to understand. https://

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2024-11-06 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/115090 >From f3d99e4ae92e407ebc2ef3f6b8e4017b397d34eb Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Mon, 4 Nov 2024 12:28:07 -0800 Subject: [PATCH] [AMDGPU] Simplify dpp builtin handling DPP intrinsics c

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2024-11-06 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/115090 >From 084e347f5fb6e9068313ad4dbc53b44c2d4cee69 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Mon, 4 Nov 2024 12:28:07 -0800 Subject: [PATCH] [AMDGPU] Simplify dpp builtin handling DPP intrinsics c

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2024-11-06 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/115090 >From 7ccac58706b2d7e54c8498818b560af490a70eac Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Mon, 4 Nov 2024 12:28:07 -0800 Subject: [PATCH] [AMDGPU] Simplify dpp builtin handling DPP intrinsics c

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2024-11-06 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > Should also teach instcombine to fold bitcast + app It still needs downstack change to handle i8: https://github.com/llvm/llvm-project/pull/114887 https://github.com/llvm/llvm-project/pull/115090

[llvm-branch-commits] [llvm] AMDGPU: Add baseline test for treating v_pk_mov_b32 like reg_sequence (PR #125656)

2025-02-04 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/125656

[llvm-branch-commits] [llvm] AMDGPU: Custom lower 32-bit element shuffles (PR #123711)

2025-01-21 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: Is there any way at all to test it? https://github.com/llvm/llvm-project/pull/123711

[llvm-branch-commits] [llvm] AMDGPU: Custom lower 32-bit element shuffles (PR #123711)

2025-01-21 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/123711

[llvm-branch-commits] [llvm] AMDGPU: Custom lower 32-bit element shuffles (PR #123711)

2025-01-21 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > > Is there any way at all to test it? > > Many shuffle tests were added in > [7786266](https://github.com/llvm/llvm-project/commit/7786266dc7b4e89feadcb01ff21f9e3cf2022a6b), > this shows they are a no-op. The expected test changes from this are in > #123711 OK, I see. LGTM.

[llvm-branch-commits] [llvm] [AMDGPU] Add test for VALU hoisiting from WWM region. NFC. (PR #123234)

2025-01-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -0,0 +1,43 @@ +# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5 +# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -run-pass=early-machinelicm,si-wqm -o - %s | FileCheck -check-prefix=GCN %s + rampitec wrot

[llvm-branch-commits] [llvm] [AMDGPU] Add test for VALU hoisiting from WWM region. NFC. (PR #123234)

2025-01-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/123234 >From 7501423b29230f37273094e1b15e8bca0fcc90bd Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 16 Jan 2025 10:49:05 -0800 Subject: [PATCH] [AMDGPU] Add test for VALU hoisiting from WWM region. N

[llvm-branch-commits] [llvm] [AMDGPU] Disable VALU sinking and hoisting with WWM (PR #123124)

2025-01-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -2773,6 +2773,9 @@ void AMDGPUDAGToDAGISel::SelectINTRINSIC_WO_CHAIN(SDNode *N) { case Intrinsic::amdgcn_wwm: case Intrinsic::amdgcn_strict_wwm: Opcode = AMDGPU::STRICT_WWM; +CurDAG->getMachineFunction() +.getInfo() +->setInitWholeWave(); ---

[llvm-branch-commits] [llvm] [AMDGPU] Disable VALU sinking and hoisting with WWM (PR #123124)

2025-01-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > I guess my concern is performance regressions if any use of WWM (e.g. atomic > optimizer) essentially turns off Machine LICM. I agree. But when moving the code llvm thinks it is something cheap, and it is not, which is also a performance problem. Things would be much easier

[llvm-branch-commits] [llvm] [AMDGPU] Disable VALU sinking and hoisting with WWM (PR #123124)

2025-01-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/123124

[llvm-branch-commits] [llvm] [AMDGPU] Add test for VALU hoisiting from WWM region. NFC. (PR #123234)

2025-01-16 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > [!WARNING] > This pull request is not mergeable via GitHub because a downstack PR is > open. Once all requirements are satisfied, merge this PR as a stack href="https://app.graphite.dev/github/pr/llvm/llvm-project/123234?utm_source=stack-comment-downstack-mergeability-warning"

[llvm-branch-commits] [llvm] [AMDGPU] Disable VALU sinking and hoisting with WWM (PR #123124)

2025-01-16 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/123124

[llvm-branch-commits] [llvm] [AMDGPU] Add test for VALU hoisiting from WWM region. NFC. (PR #123234)

2025-01-16 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/123234 The test demonstrates a suboptimal VALU hoisting from a WWM region. As a result we have 2 WWM regions instead of one. >From 263a43571303c16c3295cb0a88261504c4aef322 Mon Sep 17 00:00:00 2001 From: Stanislav Mekh

[llvm-branch-commits] [llvm] [AMDGPU] Add test for VALU hoisiting from WWM region. NFC. (PR #123234)

2025-01-16 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec ready_for_review https://github.com/llvm/llvm-project/pull/123234

[llvm-branch-commits] [llvm] [AMDGPU] Disable VALU sinking and hoisting with WWM (PR #123124)

2025-01-16 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > Missing new test? Tests added. https://github.com/llvm/llvm-project/pull/123124

[llvm-branch-commits] [llvm] [AMDGPU] Set inst_pref_size to maximum (PR #126981)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -199,3 +201,28 @@ const MCExpr *SIProgramInfo::getPGMRSrc2(CallingConv::ID CC, return MCConstantExpr::create(0, Ctx); } + +uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) { rampitec wrote: I wanted to look at this separately. Righ

[llvm-branch-commits] [llvm] [AMDGPU] Set inst_pref_size to maximum (PR #126981)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -199,3 +201,28 @@ const MCExpr *SIProgramInfo::getPGMRSrc2(CallingConv::ID CC, return MCConstantExpr::create(0, Ctx); } + +uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) { + if (!CodeSizeInBytes.has_value()) { +const GCNSubtarget &STM = MF.ge

[llvm-branch-commits] [llvm] [AMDGPU] Set inst_pref_size to maximum (PR #126981)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec edited https://github.com/llvm/llvm-project/pull/126981

[llvm-branch-commits] [clang] [AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (PR #126762)

2025-02-11 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/126762

[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)

2025-02-11 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: > Should just leave the subtarget feature name alone. It's not worth the > trouble, and this will now start spewing warnings on old IR (due to > unnecessary target-features spam clang should stop emitting). It really > should have been named 94-insts, but I think it's best to l

[llvm-branch-commits] [clang] [llvm] [AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (PR #126763)

2025-02-11 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -1619,28 +1613,6 @@ def FeatureISAVersion9_5_Common : FeatureSet< FeatureAtomicBufferPkAddBF16Inst ])>; -def FeatureISAVersion9_4_0 : FeatureSet< - !listconcat(FeatureISAVersion9_4_Common.Features, -[ - FeatureAddressableLocalMemorySize65536, - FeatureF

[llvm-branch-commits] [llvm] [AMDGPU] Remove the pass `AMDGPUPromoteKernelArguments` (PR #137655)

2025-04-28 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -11,11 +10,9 @@ define amdgpu_kernel void @ptr_nest_3(ptr addrspace(1) nocapture readonly %Arg) ; CHECK-NEXT: entry: ; CHECK-NEXT:[[I:%.*]] = tail call i32 @llvm.amdgcn.workitem.id.x() ; CHECK-NEXT:[[P1:%.*]] = getelementptr inbounds ptr, ptr addrspace(1) [[ARG:%.

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: Which one do you prefer, this or https://github.com/llvm/llvm-project/pull/127246? They are mutually exclusive. https://github.com/llvm/llvm-project/pull/127142

[llvm-branch-commits] [llvm] AMDGPU: Handle subregister uses in SIFoldOperands constant folding (PR #127485)

2025-02-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/127485

[llvm-branch-commits] [llvm] AMDGPU: Handle brev and not cases in getConstValDefinedInReg (PR #127483)

2025-02-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/127483

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-18 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/127142 >From b574a4b4afbf4cd0a6e128ea5d1e1579698124bc Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 13 Feb 2025 14:46:37 -0800 Subject: [PATCH] [AMDGPU] Respect MBB alignment in the getFunctionCodeSi

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote:
> > Which one do you prefer, this or #127246? They are mutually exclusive.
>
> They're not really. This one is the incremental step which adds the test, #127246 is the final form
The test is meaningless if we overestimate. https://github.com/llvm/llvm-project/pull/127142

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-17 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote: And in any case it is moot until the baseline change is accepted. https://github.com/llvm/llvm-project/pull/127142

[llvm-branch-commits] [llvm] AMDGPU: Stop introducing v_accvgpr_write_b32 for reg-to-reg copy (PR #129059)

2025-02-27 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/129059

[llvm-branch-commits] [clang] [AMDGPU] Simplify dpp builtin handling (PR #115090)

2025-03-01 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec updated https://github.com/llvm/llvm-project/pull/115090 >From f7e10b1e26159442945c2682ca1ed463bd152605 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Mon, 4 Nov 2024 12:28:07 -0800 Subject: [PATCH] [AMDGPU] Simplify dpp builtin handling DPP intrinsics c

[llvm-branch-commits] [llvm] AMDGPU: Replace amdgpu-no-agpr with amdgpu-num-agpr (PR #129893)

2025-03-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request. https://github.com/llvm/llvm-project/pull/129893

[llvm-branch-commits] [llvm] [AMDGPU] Early bail in getFunctionCodeSize for meta inst. NFC. (PR #127129)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack: https://app.graphite.dev/github/pr/llvm/llvm-project/127129?utm_source=stack-comment-downstack-mergeability-warning

[llvm-branch-commits] [llvm] [AMDGPU] Early bail in getFunctionCodeSize for meta inst. NFC. (PR #127129)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/127129 It does not change the estimate because getInstSizeInBytes() already returns 0 for meta instructions, but added a test and early bail. >From c0489545755c98dc2f87ffcd83af929816643074 Mon Sep 17 00:00:00 2001 Fro
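The PR description above says the early bail does not change the estimate because meta instructions already report a size of 0. A small sketch of why the two loops agree, using a hypothetical `Inst` struct and helper names rather than the LLVM `MachineInstr` API:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine instruction; not the LLVM MachineInstr API.
struct Inst {
  bool IsMeta;        // e.g. debug values or labels that emit no bytes
  uint64_t SizeBytes; // encoded size; assumed to be 0 for meta instructions
};

// Sums encoded sizes, skipping meta instructions up front.
uint64_t estimateCodeSize(const std::vector<Inst> &Insts) {
  uint64_t Size = 0;
  for (const Inst &I : Insts) {
    if (I.IsMeta)
      continue; // early bail: a meta instruction adds nothing either way
    Size += I.SizeBytes;
  }
  return Size;
}

// Same sum without the early bail, for comparison.
uint64_t estimateCodeSizeNoBail(const std::vector<Inst> &Insts) {
  uint64_t Size = 0;
  for (const Inst &I : Insts)
    Size += I.SizeBytes;
  return Size;
}
```

Under the stated assumption that meta instructions have size 0, both functions return the same total, so the early bail only avoids a size query per meta instruction.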

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -212,6 +212,8 @@ uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) { uint64_t CodeSize = 0; for (const MachineBasicBlock &MBB : MF) { +CodeSize = alignTo(CodeSize, MBB.getAlignment()); rampitec wrote: Pessimistic overestimate
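The added line in the diff above rounds the running size up to each block's alignment before counting that block's instructions. A rough sketch of that accumulation, with a hypothetical `Block` struct and a local `alignUp` helper standing in for LLVM's `alignTo` (alignments are assumed to be nonzero byte counts):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical basic block: a required alignment plus its instruction sizes.
struct Block {
  uint64_t AlignBytes;             // assumed nonzero, e.g. 1, 4, 16
  std::vector<uint64_t> InstSizes; // encoded size of each instruction in bytes
};

// Rounds Value up to the next multiple of Align (illustrative helper).
uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Accumulates code size, padding to each block's alignment first.
uint64_t estimateAlignedCodeSize(const std::vector<Block> &Blocks) {
  uint64_t CodeSize = 0;
  for (const Block &B : Blocks) {
    CodeSize = alignUp(CodeSize, B.AlignBytes); // padding before the block
    for (uint64_t S : B.InstSizes)
      CodeSize += S;
  }
  return CodeSize;
}
```

For example, a 12-byte block followed by a 16-byte-aligned block with one 4-byte instruction gives 16 + 4 = 20 bytes in this sketch; the padding is where an estimate of this kind can grow beyond the raw instruction bytes.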

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec created https://github.com/llvm/llvm-project/pull/127142 None >From d01d16815ade61a599b94bb18bc292e326767f15 Mon Sep 17 00:00:00 2001 From: Stanislav Mekhanoshin Date: Thu, 13 Feb 2025 14:46:37 -0800 Subject: [PATCH] [AMDGPU] Respect MBB alignment in the getFunction

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

rampitec wrote:
> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack: https://app.graphite.dev/github/pr/llvm/llvm-project/127142?utm_source=stack-comment-downstack-mergeability-warning

[llvm-branch-commits] [llvm] [AMDGPU] Set inst_pref_size to maximum (PR #126981)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

@@ -199,3 +201,28 @@ const MCExpr *SIProgramInfo::getPGMRSrc2(CallingConv::ID CC, return MCConstantExpr::create(0, Ctx); } + +uint64_t SIProgramInfo::getFunctionCodeSize(const MachineFunction &MF) { + if (!CodeSizeInBytes.has_value()) { +const GCNSubtarget &STM = MF.ge

[llvm-branch-commits] [llvm] [AMDGPU] Respect MBB alignment in the getFunctionCodeSize() (PR #127142)

2025-02-13 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec ready_for_review https://github.com/llvm/llvm-project/pull/127142
