================
@@ -811,8 +812,13 @@ LogicalResult ConvertAllocOpToGpuRuntimeCallPattern::matchAndRewrite(
   // descriptor.
   Type elementPtrType = this->getElementPtrType(memRefType);
   auto stream = adaptor.getAsyncDependencies().front();
+
+  auto isHostShared = rewriter.create<mlir::LLVM::ConstantOp>(
+      loc, llvmInt64Type, rewriter.getI64IntegerAttr(isShared));
+
   Value allocatedPtr =
-      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream}).getResult();
+      allocCallBuilder.create(loc, rewriter, {sizeBytes, stream, isHostShared})
+          .getResult();
----------------
grypp wrote:
Regarding `host_shared`, I noticed this code in the examples:

```mlir
%memref, %asyncToken = gpu.alloc async [%0] host_shared () : memref<3x3xi64>
```

Can SYCL's runtime allocate `host_shared` data asynchronously? It might be a good idea to prevent `host_shared` and `async` from being used together; FWIW, CUDA and HIP cannot do that. As far as I can see from the PR, the queue is not used when allocating `host_shared` memory anyway. Nonetheless, having `async` on `gpu.alloc` by itself is perfectly acceptable: CUDA does support asynchronous device memory allocation.

https://github.com/llvm/llvm-project/pull/65539
_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits
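The restriction suggested in the review (reject `host_shared` combined with `async`) could be sketched roughly as follows. This is a minimal standalone C++ model of the check only; `AllocOpModel` and `verifyAllocOp` are hypothetical names for illustration, not the actual MLIR GPU dialect verifier API:

```cpp
#include <vector>

// Hypothetical stand-in for a gpu.alloc-like op: `hostShared` models the
// host_shared keyword, and a nonempty `asyncDependencies` list models the
// async form of the op. Both names are illustrative only.
struct AllocOpModel {
  bool hostShared = false;
  std::vector<int> asyncDependencies;
};

// Proposed rule: host_shared allocations are synchronous (as in CUDA and
// HIP), so the async form is rejected whenever host_shared is set.
bool verifyAllocOp(const AllocOpModel &op) {
  if (op.hostShared && !op.asyncDependencies.empty())
    return false; // host_shared + async together: invalid
  return true;    // either feature alone (or neither) is fine
}
```

In the real dialect this would live in the op's verifier so that IR like the `gpu.alloc async [...] host_shared` example above is diagnosed at verification time rather than failing at runtime.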