tra added a comment.

In D73979#1857536 <https://reviews.llvm.org/D73979#1857536>, @yaxunl wrote:

> Based on CUDA usage of extern shared var 
> (https://devblogs.nvidia.com/using-shared-memory-cuda-cc/), it seems CUDA 
> also assumes all extern shared vars have the same address, therefore HIP and 
> CUDA have similar behavior.


Yes. Because of the `CUDA also assumes all extern shared vars have the same 
address` I think that not allowing additional types for `extern __shared__` 
makes sense for HIP, too. I'd rather not give users more ways to do a wrong 
thing.

Can you elaborate on why you want to allow this feature? While it would be 
convenient for someone who uses exactly *one* such extern object, in practice 
the most common use case is that users declare a single `extern __shared__` 
array to serve as the memory pool and then manually allocate chunks within it 
and assign the addresses to appropriately typed pointers. I guess they could 
define an `extern __shared__ struct` with fields representing the objects, but 
that seems sort of pointless considering that the only reason to use `extern 
__shared__` is to allocate shared memory dynamically.

In general, the concept of `extern __shared__` with *all* such extern items 
occupying the same space is broken by design. It's not composable (every 
function using one needs to coordinate with every other function doig the 
same). It introduces failure modes not obvious from the source code (access an 
object, fail with invalid memory access). It does not fit the conventional 
meaning of what `extern something` means in C++ (different objects have 
different addresses). IMO, it should not have existed and the shared 
memory/pointer should've been exposed via explicit API. I.e. CUDA could've used 
the same mechanism which provides threads with threadIdx and blockIdx.

As things stand right now,  `extern __shared__` is something I want gone, not 
added more features to. AFAICT, the limitations clang places on it right now 
have not been an issue for the CUDA code we compile.

Is there a pressing need for this feature for HIP? Perhaps it would make more 
sense to introduce a more sensible API and port existing HIP code to use it.

WDYT?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73979/new/

https://reviews.llvm.org/D73979



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to