tra added a comment.

A better description for the change would be helpful.

For what it's worth, NVCC accepts 'all of the above' for `extern __shared__`:
https://godbolt.org/z/8cBsXv. Whether that makes sense or not is another
question.
IIRC, `extern __shared__` can effectively only be a pointer to an opaque type,
because at compile time we know neither the address nor the size of the memory
it will point to. I believe that was the reason why we've limited the accepted
types to size-less arrays only. Otherwise, we end up with situations where at
the source level we declare an object (i.e. nobody expects a pointer to be
involved), but end up failing with an invalid memory access because no memory
was ever allocated for it. An incomplete array seems to be a reasonable
trade-off.
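
To make the point above concrete, here is a minimal sketch of the incomplete-array form. The names `scratch`, `reverse_block` and `reverse_on_device` are made up for illustration and do not come from the patch; the sketch assumes a single block of at most 1024 threads. No storage is reserved for `scratch` at compile time; its size is whatever the launch passes as the third `<<<...>>>` parameter.

```cuda
#include <cassert>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Incomplete array: neither address nor size is known at compile time.
// The storage is supplied by the kernel launch (hypothetical name).
extern __shared__ int scratch[];

// Reverses n ints in place, staging them through dynamic shared memory.
__global__ void reverse_block(int *data, int n) {
  int i = threadIdx.x;
  if (i < n) scratch[i] = data[i];
  __syncthreads();  // every thread in the block reaches this barrier
  if (i < n) data[i] = scratch[n - 1 - i];
}

// Host helper (illustrative): runs the kernel on a copy of the input.
std::vector<int> reverse_on_device(std::vector<int> host) {
  int n = static_cast<int>(host.size());
  size_t bytes = n * sizeof(int);
  int *dev = nullptr;
  cudaError_t err = cudaMalloc(&dev, bytes);
  assert(err == cudaSuccess);
  err = cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);
  assert(err == cudaSuccess);
  // Third launch parameter: dynamic shared memory size in bytes.
  // Passing 0 here would leave scratch[] with no backing memory.
  reverse_block<<<1, n, bytes>>>(dev, n);
  err = cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
  assert(err == cudaSuccess);
  cudaFree(dev);
  return host;
}

int main() {
  std::vector<int> out = reverse_on_device({1, 2, 3, 4});
  assert((out == std::vector<int>{4, 3, 2, 1}));
  std::printf("ok\n");
  return 0;
}
```

If `scratch` were instead declared as a complete object, e.g. `extern __shared__ int scratch;`, nothing in the source would hint that the launch has to provide memory for it, which is the failure mode described above.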

What's your proposed use case for this change? Does `extern __shared__` work in 
HIP the same way it works in CUDA?



================
Comment at: clang/test/CodeGenCUDA/extern-shared.cu:2-3
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -emit-llvm -o - -fcuda-is-device \
+// RUN:   -target-cpu gfx906 -x hip %s | FileCheck %s
+
----------------
I'd add a CUDA test run, too, to illustrate what we expect CUDA to handle.
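
Something along these lines might do, as a rough sketch only. It assumes the existing CHECK lines also hold for NVPTX; if the generated IR differs between the two targets, separate FileCheck prefixes would be needed, and the NVPTX backend would have to be required separately from the AMDGPU one.

```cuda
// Hypothetical extra RUN line for the CUDA/NVPTX side (not part of the patch).
// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -emit-llvm -o - -fcuda-is-device \
// RUN:   %s | FileCheck %s
```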


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73979/new/

https://reviews.llvm.org/D73979



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
