[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-04-30 Thread Alan Li via cfe-commits


@@ -444,17 +444,40 @@ def ROCDL_ds_read_tr6_b96 : ROCDL_LDS_Read_Tr_IntrOp<"ds.read.tr6.b96">;
 def ROCDL_ds_read_tr16_b64 : ROCDL_LDS_Read_Tr_IntrOp<"ds.read.tr16.b64">;
 
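 //===----------------------------------------------------------------------===//
-// Global load to LDS intrinsic (available in GFX950)
+// Load to LDS intrinsic (available in GFX9 and GFX10)
+//===----------------------------------------------------------------------===//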
+
+def ROCDL_LoadToLDSOp :
+  ROCDL_IntrOp<"load.to.lds", [], [0], [], 0, 0, 1, [2, 3, 4], ["size", "offset", "aux"]> {
+  dag args = (ins Arg:$globalPtr,
+ Arg:$ldsPtr,
+ I32Attr:$size,
+ I32Attr:$offset,
+ I32Attr:$aux);
+  let arguments = !con(args, aliasAttrs);
+  let assemblyFormat = [{
+$globalPtr `,`  $ldsPtr `,` $size `,` $offset `,` $aux
+attr-dict `:` type($globalPtr)
+  }];
+  let extraClassDefinition = [{
+::llvm::SmallVector<::mlir::Value> $cppClass::getAccessedOperands() {
+  return {getGlobalPtr(), getLdsPtr()};
+}
+  }];
+}
 
 def ROCDL_GlobalLoadLDSOp :
-  ROCDL_IntrOp<"global.load.lds", [], [], [], 0, 0, 1> {
+  ROCDL_IntrOp<"global.load.lds", [], [], [], 0, 0, 1, [2, 3, 4], ["size", "offset", "aux"]> {

lialan wrote:

@krzysz00 should we simply remove this op?

https://github.com/llvm/llvm-project/pull/137425
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [llvm] [mlir] [AMDGPU] Add a new amdgcn.load.to.lds intrinsic (PR #137425)

2025-04-29 Thread Alan Li via cfe-commits

lialan wrote:

> > I still think we need an intrinsic here because a load + an addtid store 
> > can be scheduled much different from the asynchronous "gather to LDS" - and 
> > because we don't want this load/store to not be optimized
> 
> IMO the intrinsic should only be added as a last resort if we really can't 
> get the pattern based codegen to work well enough.

I beg to differ in this particular case. In our downstream application, I want 
fine-grained control over the use of this particular instruction, so that it is 
propagated down to LLVM IR without being changed or modified along the way.

Well, actual reason: we need this instruction now. :-p

https://github.com/llvm/llvm-project/pull/137425