[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Guray Ozen via lldb-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

grypp wrote:

Regarding `host_shared`, I noticed this code in the examples:

```
%memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64>
```

Can SYCL's runtime allocate `host_shared` data asynchronously? It might be a 
good idea to prevent the use of `host_shared` and `async` together. FWIW, CUDA 
and HIP cannot do that. As far as I can see from the PR, the queue is not used 
when allocating `host_shared`.

Nonetheless, having `async` on `gpu.alloc` is perfectly acceptable. CUDA does 
support asynchronous device memory allocation.

https://github.com/llvm/llvm-project/pull/65539
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-07 Thread Guray Ozen via lldb-commits

https://github.com/grypp edited https://github.com/llvm/llvm-project/pull/65539
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)

2023-09-12 Thread Guray Ozen via lldb-commits


@@ -811,8 +812,13 @@ LogicalResult 
ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create(
+  loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+  allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+  .getResult();

grypp wrote:

> the upstream GPUToLLVMConversion lowering does not support lowering of 
> gpu.alloc which is not async.

Would that work if omit that check when `host_shared` is present? 




https://github.com/llvm/llvm-project/pull/65539
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits