[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite( // descriptor. Type elementPtrType = this->getElementPtrType(memRefType); auto stream = adaptor.getAsyncDependencies().front(); + + auto isHostShared = rewriter.create( + loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared)); + Value allocatedPtr = - allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult(); + allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared}) + .getResult(); grypp wrote: Regarding `host_shared`, I noticed this code in the examples: ``` %memref, %asyncToken = gpu.alloc async [%0] host_shared (): memref<3x3xi64> ``` Can SYCL's runtime allocate `host_shared` data asynchronously? It might be a good idea to prevent the use of `host_shared` and `async` together. FWIW, CUDA and HIP cannot do that. As far as I can see from the PR, the queue is not used when allocating `host_shared`. Nonetheless, having `async` on `gpu.alloc` is perfectly acceptable. CUDA does support asynchronous device memory allocation. https://github.com/llvm/llvm-project/pull/65539 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)
https://github.com/grypp edited https://github.com/llvm/llvm-project/pull/65539 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
[Lldb-commits] [lldb] [MLIR] Enabling Intel GPU Integration. (PR #65539)
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite( // descriptor. Type elementPtrType = this->getElementPtrType(memRefType); auto stream = adaptor.getAsyncDependencies().front(); + + auto isHostShared = rewriter.create( + loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared)); + Value allocatedPtr = - allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult(); + allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared}) + .getResult(); grypp wrote: > the upstream GPUToLLVMConversion lowering does not support lowering of > gpu.alloc which is not async. Would that work if omit that check when `host_shared` is present? https://github.com/llvm/llvm-project/pull/65539 ___ lldb-commits mailing list lldb-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits