[PATCH] D112053: [cuda] Add address space predicate funuctions.

Michael Liao via Phabricator via cfe-commits Tue, 19 Oct 2021 15:30:45 -0700

hliao added inline comments.


================
Comment at: clang/include/clang/Basic/BuiltinsNVPTX.def:691-694
+BUILTIN(__nvvm_isspacep_const, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_global, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_local, "bvC*", "nc")
+BUILTIN(__nvvm_isspacep_shared, "bvC*", "nc")
----------------
tra wrote:
> hliao wrote:
> > hliao wrote:
> > > tra wrote:
> > > > CUDA appears to be using `__nv_isGlobal_impl` for the AS predicates. 
> > > > Perhaps we want to add those, too, forwarding them to the `__nvvm_...` 
> > > > implementations above. I've already added a few other AS-related 
> > > > `__nv_*` builtins in `lib/Headers/__clang_cuda_intrinsics.h`.
> > > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in 
> > > CUDA SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline 
> > > asm. If possible, we should avoid defining unofficial or undocumented 
> > > interfaces. `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but 
> > > there is no documentation on it.
> > > 
> > >   // This function returns 1 if generic address "ptr" is in global memory 
> > > space.
> > >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> > >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> > >   {
> > >     unsigned int ret;
> > >     asm volatile ("{ \n\t"
> > >                   "    .reg .pred p; \n\t"
> > >                   "    isspacep.global p, %1; \n\t"
> > >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> > >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || 
> > > defined(__CUDACC_RTC__)
> > >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> > >   #else
> > >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> > >   #endif
> > > 
> > >     return ret;
> > >   }
> > > 
> > typo, `__nv_isGlobal_impl` is *not* exposed as an official interface.
> > 
> > > `__nv_isGlobal_impl` is exposed as an official interface. In fact, in 
> > > CUDA SDK 10.0 or earlier, `__isGlobal` is directly implemented as inline 
> > > asm. If possible, we should avoid defining unofficial or undocumented 
> > > interfaces. `__nv_isGlobal_impl` was introduced from CUDA SDK 10.1 but 
> > > there is no documentation on it.
> > > 
> > >   // This function returns 1 if generic address "ptr" is in global memory 
> > > space.
> > >   // It returns 0 if "ptr" is in shared, local or constant memory space.
> > >   __SM_20_INTRINSICS_DECL__ unsigned int __isGlobal(const void *ptr)
> > >   {
> > >     unsigned int ret;
> > >     asm volatile ("{ \n\t"
> > >                   "    .reg .pred p; \n\t"
> > >                   "    isspacep.global p, %1; \n\t"
> > >                   "    selp.u32 %0, 1, 0, p;  \n\t"
> > >   #if (defined(_MSC_VER) && defined(_WIN64)) || defined(__LP64__) || 
> > > defined(__CUDACC_RTC__)
> > >                   "} \n\t" : "=r"(ret) : "l"(ptr));
> > >   #else
> > >                   "} \n\t" : "=r"(ret) : "r"(ptr));
> > >   #endif
> > > 
> > >     return ret;
> > >   }
> > > 
> > 
> > 
> > `__nv_isGlobal_impl` is *not* exposed as an official interface.
> > If possible, we should avoid defining unofficial or undocumented 
> > interfaces. __nv_isGlobal_impl was introduced from CUDA SDK 10.1 but there 
> > is no documentation on it.
> 
> In general, I agree. I just wish NVIDIA would stop using undocumented APIs in 
> the public headers they ship. By necessity, clang either has to rely on 
> preprocessor hacks to edit out uncompileable code, or guess what the 
> undocumented APIs do and implement them.
> 
> In this case I do not think `__nv_is*_impl`  are used anywhere other than in 
> the functions you have renamed, so we're fine without them. They would be 
> needed if the patch didn't have to rename `__isGlobal` and friends.
> 
> 
> > `__nv_isGlobal_impl` is *not* exposed as an official interface.
> > If possible, we should avoid defining unofficial or undocumented 
> > interfaces. __nv_isGlobal_impl was introduced from CUDA SDK 10.1 but there 
> > is no documentation on it.
> 
> In general, I agree. I just wish NVIDIA would stop using undocumented APIs in 
> the public headers they ship. By necessity, clang either has to rely on 
> preprocessor hacks to edit out uncompileable code, or guess what the 
> undocumented APIs do and implement them.
> 
> In this case I do not think `__nv_is*_impl`  are used anywhere other than in 
> the functions you have renamed, so we're fine without them. They would be 
> needed if the patch didn't have to rename `__isGlobal` and friends.
> 
> 

another motivation we have to redefine them is that we need to add const (or 
pure) attributes for them.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112053/new/

https://reviews.llvm.org/D112053

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D112053: [cuda] Add address space predicate funuctions.

Reply via email to