hliao marked 6 inline comments as done.
hliao added inline comments.
================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:701-713
+ if (getLangOpts().CUDAIsDevice) {
+ // As CUDA builtin surface/texture types are replaced, skip generating TBAA
+ // access info.
+ if (AccessType->isCUDADeviceBuiltinSurfaceType()) {
+ if (getTargetCodeGenInfo().getCUDADeviceBuiltinSurfaceDeviceType() !=
+ nullptr)
+ return TBAAAccessInfo();
----------------
tra wrote:
> hliao wrote:
> > tra wrote:
> > > Would `isCUDADeviceBuiltinTextureType()` be sufficient criteria for
> > > skipping TBAA regeneration?
> > > Or does it need to be 'it is the texture type and it will be replaced
> > > with something else'? What is 'something else' is the same type?
> > >
> > >
> > The replacement only happens in the device compilation. On the host-side,
> > the original type is still used.
> But you've already checked CUDAIsDevice so you already know that you want to
> replace the type.
> `if (getTargetCodeGenInfo().getCUDADeviceBuiltinTextureDeviceType() !=
> nullptr)` appears to be redundant and can probably be dropped.
That check is a target-specific one, which may choose very different
implementation on how to handle these builtin surface/texture types. If they
don't want to change those types on the device side and, instead, use very
different different `textureReference`. Their
`getCUDADeviceBuiltinTextureDeviceType()` may return `nullptr` to keep use the
same reference type on both host- and device-side compilation.
================
Comment at: clang/lib/CodeGen/CodeGenModule.cpp:4101-4127
+ if (const ClassTemplateSpecializationDecl *TD =
+ dyn_cast<ClassTemplateSpecializationDecl>(RD)) {
+ Linkage = llvm::GlobalValue::InternalLinkage;
+ const TemplateArgumentList &Args =
TD->getTemplateInstantiationArgs();
+ if (RD->hasAttr<CUDADeviceBuiltinSurfaceTypeAttr>()) {
+ assert(Args.size() == 2 &&
+ "Unexpcted number of template arguments of CUDA device "
----------------
tra wrote:
> hliao wrote:
> > tra wrote:
> > > This is the part I'm not comfortable with.
> > > It's possible for the user to use the attribute on other types that do
> > > not match the expectations encoded here.
> > > We should not be failing with an assert here because that's *user* error,
> > > not a compiler bug.
> > >
> > > Expectations we have for the types should be enforced by Sema and
> > > compiler should produce proper diagnostics.
> > >
> > `device_builtin_surface_type` and `device_builtin_texture_type` should only
> > be used internally. Regular users of either CUDA or HIP must not use them
> > as they need special internal handling and coordination beyond the compiler
> > itself.
> I agree that it's probably not something that should be used by users.
> Still, such use should be reported as an error and should *not* crash the
> compiler. Asserts are for clang/llvm developers to catch the bugs in the
> compiler itself, not for the end users misusing something they should not.
>
addressed in the latest revision
================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:6471-6472
+ // Lookup `addrspacecast` through the constant pointer if any.
+ if (auto *ASC = llvm::dyn_cast_or_null<llvm::AddrSpaceCastOperator>(C))
+ C = llvm::cast<llvm::Constant>(ASC->getPointerOperand());
+ if (auto *GV = llvm::dyn_cast_or_null<llvm::GlobalVariable>(C)) {
----------------
tra wrote:
> hliao wrote:
> > tra wrote:
> > > What's the expectation here? Do we care which address spaces we're
> > > casting to/from?
> > We need to check whether we copy from that global variable directly. As all
> > pointers are generic ones, the code here is to look through the
> > `addrspacecast` constant expression for the original global variable.
> I'm still not sure what exactly you want to do here.
> If the assumption is that all `addrspacecast` ops you may see are from global
> to generic AS, this assumption is not always valid. I can [[
> https://clang.llvm.org/docs/LanguageExtensions.html#memory-references-to-specified-segments
> | annotate any pointer with an arbitrary address space ]] which may then be
> cast to generic. Or something else.
>
> If you accept Src as is, without special-casing addrspacecast, what's going
> to happen?
> AFAICT `nvvm_texsurf_handle_internal` does not really care about specific AS.
the backend needs a GlobalVariable as the argument for that intrinsic. The
lookup through `addrspacecast` to check a global variable, which is created in
the global address space and casted into a generic pointer.
================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:82-94
#undef __CUDACC__
#if CUDA_VERSION < 9000
#define __CUDABE__
#else
+#define __CUDACC__
#define __CUDA_LIBDEVICE__
#endif
----------------
tra wrote:
> hliao wrote:
> > tra wrote:
> > > Please add comments on why __CUDACC__ is needed for driver_types.h here?
> > > AFAICT, driver_types.h does not have any conditionals that depend on
> > > __CUDACC__. What happens if it's not defined.
> > >
> > >
> > `driver_types.h` includes `host_defines.h`, where macros
> > `__device_builtin_surface_type__` and `__device_builtin_texture_type__` are
> > conditional defined if `__CUDACC__`.
> >
> > The following is extracted from `cuda/crt/host_defines.h`
> >
> > ```
> > #if !defined(__CUDACC__)
> > #define __device_builtin__
> > #define __device_builtin_texture_type__
> > #define __device_builtin_surface_type__
> > #define __cudart_builtin__
> > #else /* defined(__CUDACC__) */
> > #define __device_builtin__ \
> > __location__(device_builtin)
> > #define __device_builtin_texture_type__ \
> > __location__(device_builtin_texture_type)
> > #define __device_builtin_surface_type__ \
> > __location__(device_builtin_surface_type)
> > #define __cudart_builtin__ \
> > __location__(cudart_builtin)
> > #endif /* !defined(__CUDACC__) */
> > ```
> My concern is -- what else is going to get defined? There are ~60 references
> to __CUDACC__ in CUDA-10.1 headers. The wrappers are fragile enough that
> there's a good chance something may break. It does not help that my CUDA
> build bot decided to die just after we switched to work-from-home, so there
> will be no early warning if something goes wrong.
>
> If all we need are the macros above, we may just define them.
Let me check all CUDA SDK through their dockers. Redefining sounds good me as
wll.
================
Comment at: clang/test/CodeGenCUDA/surface.cu:12-14
+template<typename T, int type = 1>
+struct __attribute__((device_builtin_surface_type)) surface : public
surfaceReference {
+};
----------------
tra wrote:
> Please add a test for applying the attribute to a wrong type. I.e. a
> non-template or a template with different number or kinds of parameters. We
> should have a proper syntax error and not a compiler crash or silent failure.
addressed in refined tests in the latest revision
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D76365/new/
https://reviews.llvm.org/D76365
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits