from:"Joe Nash via llvm-branch-commits"

[llvm-branch-commits] [llvm] AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (PR #95591)

2024-06-14 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph edited https://github.com/llvm/llvm-project/pull/95591
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (PR #95591)

2024-06-14 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.


https://github.com/llvm/llvm-project/pull/95591

[llvm-branch-commits] [llvm] AMDGPU: Create pseudo to real mapping for flat/buffer atomic fmin/fmax (PR #95591)

2024-06-14 Thread Joe Nash via llvm-branch-commits



@@ -1608,14 +1598,14 @@ defm : FlatSignedAtomicIntrPat <"FLAT_ATOMIC_FMAX", "int_amdgcn_flat_atomic_fmax
 }
 
 let OtherPredicates = [isGFX10Only] in {
-defm : GlobalFLATAtomicPats <"GLOBAL_ATOMIC_FMIN_X2", "atomic_load_fmin_global", f64>;
-defm : GlobalFLATAtomicPats <"GLOBAL_ATOMIC_FMAX_X2", "atomic_load_fmax_global", f64>;
-defm : GlobalFLATAtomicIntrPats <"GLOBAL_ATOMIC_FMIN_X2", "int_amdgcn_global_atomic_fmin", f64>;
-defm : GlobalFLATAtomicIntrPats <"GLOBAL_ATOMIC_FMAX_X2", "int_amdgcn_global_atomic_fmax", f64>;
-defm : FlatSignedAtomicPat <"FLAT_ATOMIC_FMIN_X2", "atomic_load_fmin_flat", f64>;
-defm : FlatSignedAtomicPat <"FLAT_ATOMIC_FMAX_X2", "atomic_load_fmax_flat", f64>;
-defm : FlatSignedAtomicIntrPat <"FLAT_ATOMIC_FMIN_X2", "int_amdgcn_flat_atomic_fmin", f64>;
-defm : FlatSignedAtomicIntrPat <"FLAT_ATOMIC_FMAX_X2", "int_amdgcn_flat_atomic_fmax", f64>;
+defm : GlobalFLATAtomicPats <"GLOBAL_ATOMIC_MIN_F64", "atomic_load_fmin_global", f64>;

Sisyph wrote:

Can you deduplicate these somehow with the patterns at L1641? They look 
essentially the same, just with a different predicate. Otherwise LGTM

https://github.com/llvm/llvm-project/pull/95591

[llvm-branch-commits] [llvm] AMDGPU: Cleanup immediate selection patterns (PR #100787)

2024-07-29 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.


https://github.com/llvm/llvm-project/pull/100787

[llvm-branch-commits] [llvm] AMDGPU/GlobalISel: Select all constants in tablegen (PR #100788)

2024-07-29 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.


https://github.com/llvm/llvm-project/pull/100788

[llvm-branch-commits] [llvm] 60466fa - [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10

2021-01-05 Thread Joe Nash via llvm-branch-commits


Author: Joe Nash
Date: 2021-01-05T11:59:57-05:00
New Revision: 60466fad2dc155329cc870ea733d4f41561bd46d

URL: https://github.com/llvm/llvm-project/commit/60466fad2dc155329cc870ea733d4f41561bd46d
DIFF: https://github.com/llvm/llvm-project/commit/60466fad2dc155329cc870ea733d4f41561bd46d.diff

LOG: [AMDGPU] Remove deprecated V_MUL_LO_I32 from GFX10

It was removed in GFX10 GPUs, but LLVM could
generate it.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D94020

Change-Id: Id1c716d71313edcfb768b2b175a6789ef9b01f3c
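
The new isGFX10Before1030 predicate in this commit combines a generation check with the absence of the GFX10.3 instruction feature. A minimal C++ sketch of that condition follows; Generation and SubtargetModel are hypothetical stand-ins invented for illustration, not LLVM's subtarget classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real subtarget query; not LLVM's API.
enum class Generation { GFX9, GFX10, GFX11 };

struct SubtargetModel {
  Generation Gen;
  bool HasGFX10_3Insts; // assumed true for gfx1030-class parts
};

// Mirrors the predicate string in the diff:
//   getGeneration() == GFX10 && !hasGFX10_3Insts()
constexpr bool isGFX10Before1030(const SubtargetModel &ST) {
  return ST.Gen == Generation::GFX10 && !ST.HasGFX10_3Insts;
}

// Compile-time spot checks of the two GFX10 cases.
static_assert(isGFX10Before1030(SubtargetModel{Generation::GFX10, false}),
              "GFX10 without GFX10.3 instructions keeps V_MUL_LO_I32");
static_assert(!isGFX10Before1030(SubtargetModel{Generation::GFX10, true}),
              "GFX10.3 parts no longer accept V_MUL_LO_I32");
```

With this shape, the GFX10 real encoding of V_MUL_LO_I32 is only selectable when the predicate holds, which is what the gfx1030_unsupported.s test checks from the assembler side.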

Added: 


Modified: 
llvm/lib/Target/AMDGPU/AMDGPU.td
llvm/lib/Target/AMDGPU/VOP3Instructions.td
llvm/test/MC/AMDGPU/gfx1030_unsupported.s
llvm/test/MC/AMDGPU/gfx10_asm_vop3.s

Removed: 




diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 42d134de9229..0a212a41ab6a 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -1131,6 +1131,11 @@ def isGFX10Plus :
   Predicate<"Subtarget->getGeneration() >= AMDGPUSubtarget::GFX10">,
   AssemblerPredicate<(all_of FeatureGFX10Insts)>;
 
+def isGFX10Before1030 :
+  Predicate<"Subtarget->getGeneration() == AMDGPUSubtarget::GFX10 &&"
+"!Subtarget->hasGFX10_3Insts()">,
+  AssemblerPredicate<(all_of FeatureGFX10Insts,(not FeatureGFX10_3Insts))>;
+
 def HasFlatAddressSpace : Predicate<"Subtarget->hasFlatAddressSpace()">,
   AssemblerPredicate<(all_of FeatureFlatAddressSpace)>;
 

diff --git a/llvm/lib/Target/AMDGPU/VOP3Instructions.td b/llvm/lib/Target/AMDGPU/VOP3Instructions.td
index 28e4a09069a8..f349a0f54fa7 100644
--- a/llvm/lib/Target/AMDGPU/VOP3Instructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP3Instructions.td
@@ -867,6 +867,10 @@ let InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in)
   defm V_WRITELANE_B32 : VOP3_Real_gfx10<0x361>;
 } // End InOperandList = (ins SSrcOrLds_b32:$src0, SCSrc_b32:$src1, VGPR_32:$vdst_in)
 
+let SubtargetPredicate = isGFX10Before1030 in {
+  defm V_MUL_LO_I32  : VOP3_Real_gfx10<0x16b>;
+}
+
 defm V_XOR3_B32   : VOP3_Real_gfx10<0x178>;
 defm V_LSHLREV_B64: VOP3_Real_gfx10<0x2ff>;
 defm V_LSHRREV_B64: VOP3_Real_gfx10<0x300>;
@@ -992,6 +996,7 @@ multiclass VOP3be_Real_gfx6_gfx7_gfx10 op> :
 defm V_LSHL_B64: VOP3_Real_gfx6_gfx7<0x161>;
 defm V_LSHR_B64: VOP3_Real_gfx6_gfx7<0x162>;
 defm V_ASHR_I64: VOP3_Real_gfx6_gfx7<0x163>;
+defm V_MUL_LO_I32  : VOP3_Real_gfx6_gfx7<0x16b>;
 
 defm V_MAD_LEGACY_F32  : VOP3_Real_gfx6_gfx7_gfx10<0x140>;
 defm V_MAD_F32 : VOP3_Real_gfx6_gfx7_gfx10<0x141>;
@@ -1033,7 +1038,6 @@ defm V_MAX_F64 : VOP3_Real_gfx6_gfx7_gfx10<0x167>;
 defm V_LDEXP_F64   : VOP3_Real_gfx6_gfx7_gfx10<0x168>;
 defm V_MUL_LO_U32  : VOP3_Real_gfx6_gfx7_gfx10<0x169>;
 defm V_MUL_HI_U32  : VOP3_Real_gfx6_gfx7_gfx10<0x16a>;
-defm V_MUL_LO_I32  : VOP3_Real_gfx6_gfx7_gfx10<0x16b>;
 defm V_MUL_HI_I32  : VOP3_Real_gfx6_gfx7_gfx10<0x16c>;
 defm V_DIV_FMAS_F32: VOP3_Real_gfx6_gfx7_gfx10<0x16f>;
 defm V_DIV_FMAS_F64: VOP3_Real_gfx6_gfx7_gfx10<0x170>;

diff --git a/llvm/test/MC/AMDGPU/gfx1030_unsupported.s b/llvm/test/MC/AMDGPU/gfx1030_unsupported.s
index b3660d66f21d..57cfb2f2514c 100644
--- a/llvm/test/MC/AMDGPU/gfx1030_unsupported.s
+++ b/llvm/test/MC/AMDGPU/gfx1030_unsupported.s
@@ -1,6 +1,9 @@
 // RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1030 -mattr=+wavefrontsize32,-wavefrontsize64 %s 2>&1 | FileCheck --implicit-check-not=error: %s
 // RUN: not llvm-mc -arch=amdgcn -mcpu=gfx1030 -mattr=-wavefrontsize32,+wavefrontsize64 %s 2>&1 | FileCheck --implicit-check-not=error: %s
 
+v_mul_lo_i32 v0, v1, v2
+// CHECK: :[[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU
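+
 //===----------------------------------------------------------------------===//
 // Unsupported dpp variants.
 //===----------------------------------------------------------------------===//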

diff --git a/llvm/test/MC/AMDGPU/gfx10_asm_vop3.s b/llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
index a4f77a4bbaad..be5b3d4a7cf3 100644
--- a/llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
+++ b/llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
@@ -6685,6 +6685,30 @@ v_mul_hi_u32 v5, v1, 0.5
 v_mul_hi_u32 v5, v1, -4.0
 // GFX10: encoding: [0x05,0x00,0x6a,0xd5,0x01,0xef,0x01,0x00]
 
+v_mul_lo_i32 v5, v1, v2
+// GFX10: encoding: [0x05,0x00,0x6b,0xd5,0x01,0x05,0x02,0x00]
+
+v_mul_lo_i32 v255, v1, v2
+// GFX10: encoding: [0xff,0x00,0x6b,0xd5,0x01,0x05,0x02,0x00]
+
+v_mul_lo_i32 v5, v255, v2
+// GFX10: encoding: [0x05,0x00,0x6b,0xd5,0xff,0x05,0x02,0x00]
+
+v_mul_lo_i32 v5, s1, v2
+// GFX10: encoding: [0x05,0x00,0x6b,0xd5,0x01,0x04,0x02,0x00]
+
+v_mul_lo_i32 v5, s103, v2
+// GFX10: encoding: [0x05,0x00,0x6b,0xd5,0x67,0x04,0x02,0x00]
+
+v_mul_lo_i32 v5, vcc_lo, v2
+// GFX10: encoding: [0x05,0x00,0x6b,0xd5,0x6a,0x04,0x02,0x00]
+
+v_mul_lo_i32 v5, vcc_hi,
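
The expected bytes in the gfx10_asm_vop3.s checks above can be reproduced by hand. The sketch below packs a two-source VOP3 instruction with no modifiers, assuming the GFX10 VOP3 field layout as I understand it (vdst in bits 7:0, a 10-bit opcode in bits 25:16, the fixed 0x35 prefix in bits 31:26, then 9-bit src0 and src1 fields in the second dword). The helper names sgpr, vgpr and encodeVOP3 are made up for this example and are not part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Source operand field values (assumed): SGPR n encodes as n, VGPR n as 256 + n.
constexpr uint32_t sgpr(uint32_t N) { return N; }
constexpr uint32_t vgpr(uint32_t N) { return 256 + N; }

// Packs a two-source GFX10 VOP3 instruction without modifiers and returns
// the 8 encoding bytes in little-endian order, as llvm-mc prints them.
std::array<uint8_t, 8> encodeVOP3(uint32_t Op, uint32_t VDst, uint32_t Src0,
                                  uint32_t Src1) {
  const uint32_t Lo = (0x35u << 26) | ((Op & 0x3ffu) << 16) | (VDst & 0xffu);
  const uint32_t Hi = (Src0 & 0x1ffu) | ((Src1 & 0x1ffu) << 9);
  std::array<uint8_t, 8> Bytes{};
  for (int I = 0; I < 4; ++I) {
    Bytes[I] = static_cast<uint8_t>(Lo >> (8 * I));
    Bytes[4 + I] = static_cast<uint8_t>(Hi >> (8 * I));
  }
  return Bytes;
}
```

For example, encodeVOP3(0x16b, 5, vgpr(1), vgpr(2)) corresponds to "v_mul_lo_i32 v5, v1, v2", and replacing vgpr(1) with sgpr(1) corresponds to the "v5, s1, v2" case, where only the fifth and sixth bytes change.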

[llvm-branch-commits] [llvm] bcec0f2 - [AMDGPU] Deduplicate VOP tablegen asm & ins

2021-01-11 Thread Joe Nash via llvm-branch-commits


Author: Joe Nash
Date: 2021-01-11T13:49:26-05:00
New Revision: bcec0f27a2c37b64d5e8b84bbbfa563edae6affe

URL: https://github.com/llvm/llvm-project/commit/bcec0f27a2c37b64d5e8b84bbbfa563edae6affe
DIFF: https://github.com/llvm/llvm-project/commit/bcec0f27a2c37b64d5e8b84bbbfa563edae6affe.diff

LOG: [AMDGPU] Deduplicate VOP tablegen asm & ins

VOP3 and VOP DPP subroutines to generate input
operands and asm strings were essentially copy
pasted several times. They are deduplicated to
reduce the maintenance burden and allow faster
development.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D94102

Change-Id: I76225eed3c33239d9573351e0c8a0abfad0146ea
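
The deduplication described in this log replaces several hand-written operand lists with one base list plus optional groups that are concatenated on demand (the !con and !if calls in the diff). The C++ sketch below models that idea only; OperandList, append and buildVOP3Operands are hypothetical names, and because the template arguments in the archived diff were stripped, the two flags are an assumption about how the shared class is parameterized.

```cpp
#include <cassert>
#include <string>
#include <vector>

using OperandList = std::vector<std::string>;

// Appends Extra to List; a rough analogue of TableGen's !con on (ins ...) dags.
static void append(OperandList &List, const OperandList &Extra) {
  List.insert(List.end(), Extra.begin(), Extra.end());
}

// Hypothetical model of the shared base class: start from the common
// operands, then add each optional group only when its flag is set.
OperandList buildVOP3Operands(const OperandList &Base, bool HasOpSel,
                              bool IsVOP3P) {
  OperandList Ret = Base;
  if (HasOpSel)
    append(Ret, {"op_sel"});
  if (IsVOP3P)
    append(Ret, {"op_sel_hi", "neg_lo", "neg_hi"});
  return Ret;
}
```

Under this model, the OpSel variant would be the base plus op_sel, and the VOP3P variant would be the base plus op_sel, op_sel_hi, neg_lo and neg_hi, which matches the operand order in the removed hand-written lists.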

Added: 


Modified: 
llvm/lib/Target/AMDGPU/SIInstrInfo.td
llvm/lib/Target/AMDGPU/VOP3Instructions.td

Removed: 




diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index e48138e56d71..78600bebdad2 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -1587,7 +1587,7 @@ class getIns32  {
 // Returns the input arguments for VOP3 instructions for the given SrcVT.
 class getIns64  {
 
   dag ret =
@@ -1602,7 +1602,7 @@ class getIns64  {
+  // getInst64 handles clamp and omod. implicit mutex between vop3p and omod
+  dag base = getIns64 .ret;
+  dag opsel = (ins op_sel0:$op_sel);
+  dag vop3pFields = (ins op_sel_hi0:$op_sel_hi, neg_lo0:$neg_lo, neg_hi0:$neg_hi);
+  dag ret = !con(base,
+ !if(HasOpSel, opsel,(ins)),
+ !if(IsVOP3P, vop3pFields,(ins)));
+}
 
-// The modifiers (except clamp) are dummy operands for the benefit of
-// printing and parsing. They defer their values to looking at the
-// srcN_modifiers for what to print.
 class getInsVOP3P  {
-  dag ret = !if (!eq(NumSrcArgs, 2),
-!if (HasClamp,
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   clampmod0:$clamp,
-   op_sel0:$op_sel, op_sel_hi0:$op_sel_hi,
-   neg_lo0:$neg_lo, neg_hi0:$neg_hi),
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   op_sel0:$op_sel, op_sel_hi0:$op_sel_hi,
-   neg_lo0:$neg_lo, neg_hi0:$neg_hi)),
-// else NumSrcArgs == 3
-!if (HasClamp,
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   Src2Mod:$src2_modifiers, Src2RC:$src2,
-   clampmod0:$clamp,
-   op_sel0:$op_sel, op_sel_hi0:$op_sel_hi,
-   neg_lo0:$neg_lo, neg_hi0:$neg_hi),
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   Src2Mod:$src2_modifiers, Src2RC:$src2,
-   op_sel0:$op_sel, op_sel_hi0:$op_sel_hi,
-   neg_lo0:$neg_lo, neg_hi0:$neg_hi))
-  );
+  dag ret = getInsVOP3Base.ret;
 }
 
-class getInsVOP3OpSel  {
-  dag ret = !if (!eq(NumSrcArgs, 2),
-!if (HasClamp,
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   clampmod0:$clamp,
-   op_sel0:$op_sel),
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   op_sel0:$op_sel)),
-// else NumSrcArgs == 3
-!if (HasClamp,
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   Src2Mod:$src2_modifiers, Src2RC:$src2,
-   clampmod0:$clamp,
-   op_sel0:$op_sel),
-  (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-   Src1Mod:$src1_modifiers, Src1RC:$src1,
-   Src2Mod:$src2_modifiers, Src2RC:$src2,
-   op_sel0:$op_sel))
-  );
+class getInsVOP3OpSel  {
+  dag ret = getInsVOP3Base.ret;
 }
 
-class getInsDPP  {
 
   dag ret = !if (!eq(NumSrcArgs, 0),
 // VOP1 without input operands (V_NOP)
-(ins dpp_ctrl:$dpp_ctrl, row_mask:$row_mask,
- bank_mask:$bank_mask, bound_ctrl:$bound_ctrl),
+(ins ),
 !if (!eq(NumSrcArgs, 1),
   !if (HasModifiers,
 // VOP1_DPP with modifiers
 (ins DstRC:$old, Src0Mod:$src0_modifiers,
- Src0RC:$src0, dpp_ctrl:$dpp_ctrl, row_mask:$row_mask,
- bank_mask:$bank_mask, bound_ctrl:$bound_ctrl)
+ Src0RC:$src0)
   /* else */,
 // VOP1_DPP without modifiers
-(ins DstRC:$old, Src0RC:$src0,
- dpp_ctrl:$dpp_ctrl, row_mask:$row_mask,
- bank_mask:$bank_mask, bound_ctrl:$bound_ctrl)
-  /* endif */)
-  /* NumSrcArgs == 2 */,
+(ins DstRC:$old, Src0RC:$src0)
+  /* endif */),
   !if (HasModifiers,
 // VOP2_DPP with modifiers
 (ins DstRC:$old,

[llvm-branch-commits] [clang] [llvm] AMDGPU: Add first gfx950 mfma instructions (PR #116312)

2024-11-15 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/116312

[llvm-branch-commits] [llvm] AMDGPU: Make vector_shuffle legal for v2i32 with v_pk_mov_b32 (PR #123684)

2025-01-21 Thread Joe Nash via llvm-branch-commits



@@ -489,6 +489,90 @@ void AMDGPUDAGToDAGISel::SelectBuildVector(SDNode *N, unsigned RegClassID) {
   CurDAG->SelectNodeTo(N, AMDGPU::REG_SEQUENCE, N->getVTList(), RegSeqArgs);
 }
 
+void AMDGPUDAGToDAGISel::SelectVectorShuffle(SDNode *N) {
+  EVT VT = N->getValueType(0);
+  EVT EltVT = VT.getVectorElementType();
+
+  // TODO: Handle 16-bit element vectors with even aligned masks.
+  if (!Subtarget->hasPkMovB32() || !EltVT.bitsEq(MVT::i32) ||
+  VT.getVectorNumElements() != 2) {
+SelectCode(N);
+return;
+  }
+
+  auto *SVN = cast<ShuffleVectorSDNode>(N);
+
+  SDValue Src0 = SVN->getOperand(0);
+  SDValue Src1 = SVN->getOperand(1);
+  ArrayRef<int> Mask = SVN->getMask();
+  SDLoc DL(N);
+
+  assert(Src0.getValueType().getVectorNumElements() == 2 && Mask.size() == 2 &&
+ Mask[0] < 4 && Mask[1] < 4);
+
+  SDValue VSrc0 = Mask[0] < 2 ? Src0 : Src1;
+  SDValue VSrc1 = Mask[1] < 2 ? Src0 : Src1;
+  unsigned Src0SubReg = Mask[0] & 1 ? AMDGPU::sub1 : AMDGPU::sub0;
+  unsigned Src1SubReg = Mask[1] & 1 ? AMDGPU::sub1 : AMDGPU::sub0;
+
+  if (Mask[0] < 0) {
+Src0SubReg = Src1SubReg;
+MachineSDNode *ImpDef =
+CurDAG->getMachineNode(TargetOpcode::IMPLICIT_DEF, DL, VT);
+VSrc0 = SDValue(ImpDef, 0);
+  }
+
+  if (Mask[1] < 0) {
+Src1SubReg = Src0SubReg;
+MachineSDNode *ImpDef =
+CurDAG->getMachineNode(TargetOpcode::IMPLICIT_DEF, DL, VT);
+VSrc1 = SDValue(ImpDef, 0);
+  }
+
+  // SGPR case needs to lower to copies.
+  //
+  // Also use subregister extract when we can directly blend the registers with
+  // a simple subregister copy.
+  //
+  // TODO: Maybe we should fold this out earlier
+  if (N->isDivergent() && Src0SubReg == AMDGPU::sub1 &&
+  Src1SubReg == AMDGPU::sub0) {
+// The low element of the result always comes from src0.
+// The high element of the result always comes from src1.
+// op_sel selects the high half of src0.
+// op_sel_hi selects the high half of src1.
+
+unsigned Src0OpSel =
+Src0SubReg == AMDGPU::sub1 ? SISrcMods::OP_SEL_0 : SISrcMods::NONE;
+unsigned Src1OpSel =
+Src1SubReg == AMDGPU::sub1 ? SISrcMods::OP_SEL_0 : SISrcMods::NONE;

Sisyph wrote:

It is written in a very confusing way in the docs, but I think you have it 
correct in the code. Out of the 6 bits (op_sel[0-2] and op_sel_hi[0-2]) only 
op_sel[0] and op_sel[1] do anything iiuc. 
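
The mask handling in the quoted hunk can be followed in isolation. The sketch below restates it with plain enums: each of the two mask entries picks an input vector and a 32-bit half, and a negative (undefined) entry borrows the other lane's sub-register and uses an undefined source. Source, SubReg, LaneSelect and decodeShuffleMask are hypothetical names for illustration; this models the selection logic only and does not build any DAG nodes.

```cpp
#include <array>
#include <cassert>

enum class Source { Src0, Src1, Undef };
enum class SubReg { Sub0, Sub1 };

struct LaneSelect {
  Source Src;
  SubReg Sub;
};

// Mask entries 0-1 pick from the first input, 2-3 from the second;
// a negative entry means the lane is undefined.
std::array<LaneSelect, 2> decodeShuffleMask(int M0, int M1) {
  assert(M0 < 4 && M1 < 4);
  LaneSelect L0{M0 < 2 ? Source::Src0 : Source::Src1,
                (M0 & 1) ? SubReg::Sub1 : SubReg::Sub0};
  LaneSelect L1{M1 < 2 ? Source::Src0 : Source::Src1,
                (M1 & 1) ? SubReg::Sub1 : SubReg::Sub0};
  // Same order as the patch: an undefined lane copies the other lane's
  // sub-register choice and reads from an undefined value.
  if (M0 < 0) {
    L0.Sub = L1.Sub;
    L0.Src = Source::Undef;
  }
  if (M1 < 0) {
    L1.Sub = L0.Sub;
    L1.Src = Source::Undef;
  }
  return {{L0, L1}};
}
```

A mask of (1, 2), for instance, takes the high half of the first input for the low result lane and the low half of the second input for the high result lane.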

https://github.com/llvm/llvm-project/pull/123684

[llvm-branch-commits] [llvm] AMDGPU: Implement isExtractVecEltCheap (PR #122460)

2025-01-14 Thread Joe Nash via llvm-branch-commits



@@ -1949,6 +1949,13 @@ bool SITargetLowering::isExtractSubvectorCheap(EVT ResVT, EVT SrcVT,
   return Index == 0;
 }
 
+bool SITargetLowering::isExtractVecEltCheap(EVT VT, unsigned Index) const {
+  // TODO: This should be more aggressive, particular for 16-bit element
+  // vectors. However there are some mixed improvements and regressions.
+  EVT EltTy = VT.getVectorElementType();
+  return EltTy.getSizeInBits() % 32 == 0;

Sisyph wrote:

Yes I would think EltTy.getSizeInBits() * Index % 16 == 0 for True16 would be 
the way to go.
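
To compare the two conditions side by side, here is a small sketch that takes the element size in bits directly instead of an EVT. The function names are hypothetical; the first condition is the one in the patch, and the second is the expression suggested in this comment, read as (size * index) % 16 since * and % have equal precedence in C++ and associate left to right.

```cpp
#include <cassert>

// Condition from the patch: cheap only when the element size is a
// multiple of 32 bits. The index is not consulted.
constexpr bool isExtractCheap32(unsigned EltBits) {
  return EltBits % 32 == 0;
}

// Suggested True16 variant: cheap when the element's bit offset within
// the vector (size * index) lands on a 16-bit boundary.
constexpr bool isExtractCheapTrue16(unsigned EltBits, unsigned Index) {
  return (EltBits * Index) % 16 == 0;
}

static_assert(isExtractCheap32(32) && !isExtractCheap32(16),
              "32-bit elements pass the current check, 16-bit ones do not");
static_assert(isExtractCheapTrue16(16, 1) && !isExtractCheapTrue16(8, 1),
              "16-bit elements always pass; 8-bit ones only at even indices");
```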

https://github.com/llvm/llvm-project/pull/122460

[llvm-branch-commits] [llvm] AMDGPU: Remove redundant operand folding checks (PR #140587)

2025-05-20 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.

Your logic makes sense to me. Handling cases uniformly is good.

https://github.com/llvm/llvm-project/pull/140587

[llvm-branch-commits] [llvm] [AMDGPU] Negative gfx1250 v_dual_cndmask_b32 tests. NFC. (PR #148057)

2025-07-10 Thread Joe Nash via llvm-branch-commits


https://github.com/Sisyph approved this pull request.


https://github.com/llvm/llvm-project/pull/148057
