https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/168711
>From aa7482a2b04c5f5fa9fdf93477d7134f64406379 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" <[email protected]> Date: Wed, 19 Nov 2025 00:16:46 -0500 Subject: [PATCH] [CUDA][HIP] Fix CTAD for host/device constructors Clang currently does not allow using CTAD in CUDA/HIP device functions since deduction guides are treated as host-only. This patch fixes that by treating deduction guides as host+device. The rationale is that deduction guides do not actually generate code in IR, and there is an existing check for device/host correctness for constructors. The patch also suppresses duplicate implicit deduction guides from host/device constructors with identical signatures to prevent ambiguity. For CUDA/HIP, deduction guides are now always implicitly enabled for both host and device, which matches nvcc's effective behavior. Unlike nvcc, which silently ignores explicit CUDA/HIP target attributes on deduction guides, Clang diagnoses such attributes as errors to keep the syntax clean and avoid confusion. This ensures CTAD works correctly in CUDA/HIP for constructors with different target attributes and provides clearer diagnostics when users attempt to annotate deduction guides with CUDA/HIP target attributes. Example: ``` #include <tuple> __host__ __device__ void func() { std::tuple<int, int> t = std::tuple(1, 1); } ``` This compiles with nvcc but fails with clang for CUDA/HIP without this fix. Reference: https://godbolt.org/z/WhT1GrhWE Fixes: https://github.com/ROCm/ROCm/issues/5646 Fixes: https://github.com/llvm/llvm-project/issues/146646 --- clang/docs/HIPSupport.rst | 45 ++++++++++++++++++ clang/docs/ReleaseNotes.rst | 9 ++++ .../clang/Basic/DiagnosticSemaKinds.td | 3 ++ clang/lib/Sema/SemaCUDA.cpp | 12 +++++ clang/lib/Sema/SemaDeclAttr.cpp | 13 +++++ clang/lib/Sema/SemaTemplateDeductionGuide.cpp | 26 +++++++++- clang/test/SemaCUDA/deduction-guide-attrs.cu | 24 ++++++++++ clang/test/SemaCUDA/deduction-guide.cu | 47 +++++++++++++++++++ 8 files changed, 177 insertions(+), 2 deletions(-) create mode 100644 clang/test/SemaCUDA/deduction-guide-attrs.cu create mode 100644 clang/test/SemaCUDA/deduction-guide.cu diff --git a/clang/docs/HIPSupport.rst b/clang/docs/HIPSupport.rst index 92ea07974373e..4c477cc1e1634 100644 --- a/clang/docs/HIPSupport.rst +++ b/clang/docs/HIPSupport.rst @@ -376,6 +376,51 @@ Example Usage basePtr->virtualFunction(); // Allowed since obj is constructed in device code } +C++17 Class Template Argument Deduction (CTAD) Support +====================================================== + +Clang supports C++17 Class Template Argument Deduction (CTAD) in both host and device code for HIP. +This allows you to omit template arguments when creating class template instances, letting the compiler +deduce them from constructor arguments. + +.. code-block:: c++ + + #include <tuple> + + __host__ __device__ void func() { + std::tuple<int, int> t = std::tuple(1, 1); + } + +In the above example, ``std::tuple(1, 1)`` automatically deduces the type to be ``std::tuple<int, int>``. + +Deduction Guides +---------------- + +User-defined deduction guides are also supported. Since deduction guides are not executable code and only +participate in type deduction, they semantically behave as ``__host__ __device__``. This ensures they are +available for deduction in both host and device contexts. + +.. code-block:: c++ + + template <typename T> + struct MyType { + T value; + __device__ MyType(T v) : value(v) {} + }; + + MyType(float) -> MyType<double>; + + __device__ void deviceFunc() { + MyType m(1.0f); // Deduces MyType<double> + } + +.. note:: + + Explicit HIP target attributes such as ``__host__`` or ``__device__`` + are not allowed on deduction guides. Clang treats all deduction guides + as if they were ``__host__ __device__`` and diagnoses any explicit + target attributes on them as errors. + Host and Device Attributes of Default Destructors =================================================== diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst index c4d968bd01b65..1c66c261c5644 100644 --- a/clang/docs/ReleaseNotes.rst +++ b/clang/docs/ReleaseNotes.rst @@ -641,6 +641,15 @@ RISC-V Support CUDA/HIP Language Changes ^^^^^^^^^^^^^^^^^^^^^^^^^ +- Clang now supports C++17 Class Template Argument Deduction (CTAD) in CUDA/HIP + device code by treating deduction guides as if they were ``__host__ __device__``. + +- Clang avoids ambiguous CTAD in CUDA/HIP by not synthesizing duplicate implicit + deduction guides when ``__host__`` and ``__device__`` constructors share a signature. + +- Clang diagnoses CUDA/HIP target attributes written on deduction guides as errors, + since deduction guides do not participate in code generation. + CUDA Support ^^^^^^^^^^^^ diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td index 4a145fd71eedd..959f9cbfda7c3 100644 --- a/clang/include/clang/Basic/DiagnosticSemaKinds.td +++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td @@ -2769,6 +2769,9 @@ def err_deduction_guide_name_not_class_template : Error< "cannot specify deduction guide for " "%select{<error>|function template|variable template|alias template|" "template template parameter|concept|dependent template name}0 %1">; +def err_deduction_guide_target_attr : Error< + "in CUDA/HIP, target attributes are not allowed on deduction guides; " + "deduction guides are implicitly enabled for both host and device">; def err_deduction_guide_wrong_scope : Error< "deduction guide must be declared in the same scope as template %q0">; def err_deduction_guide_defines_function : Error< diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp index 31735a0f5feb3..8d1e03c8bc571 100644 --- a/clang/lib/Sema/SemaCUDA.cpp +++ b/clang/lib/Sema/SemaCUDA.cpp @@ -137,6 +137,12 @@ CUDAFunctionTarget SemaCUDA::IdentifyTarget(const FunctionDecl *D, if (D == nullptr) return CurCUDATargetCtx.Target; + // C++ deduction guides are never codegen'ed and only participate in template + // argument deduction. Treat them as if they were always host+device so that + // CUDA/HIP target checking never rejects their use based solely on target. + if (isa<CXXDeductionGuideDecl>(D)) + return CUDAFunctionTarget::HostDevice; + if (D->hasAttr<CUDAInvalidTargetAttr>()) return CUDAFunctionTarget::InvalidTarget; @@ -907,6 +913,12 @@ bool SemaCUDA::CheckCall(SourceLocation Loc, FunctionDecl *Callee) { if (ExprEvalCtx.isUnevaluated() || ExprEvalCtx.isConstantEvaluated()) return true; + // C++ deduction guides participate in overload resolution but are not + // callable functions and are never codegen'ed. Treat them as always + // allowed for CUDA/HIP compatibility checking. + if (isa<CXXDeductionGuideDecl>(Callee)) + return true; + // FIXME: Is bailing out early correct here? Should we instead assume that // the caller is a global initializer? FunctionDecl *Caller = SemaRef.getCurFunctionDecl(/*AllowLambda=*/true); diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp index e3af5023c74d0..f174bbd9de312 100644 --- a/clang/lib/Sema/SemaDeclAttr.cpp +++ b/clang/lib/Sema/SemaDeclAttr.cpp @@ -7984,6 +7984,19 @@ void Sema::ProcessDeclAttributeList( } } + // CUDA/HIP: disallow explicit CUDA target attributes on deduction guides. + // Deduction guides are not callable functions and never participate in + // codegen; they are always treated as host+device for CUDA/HIP semantic + // checks, so explicit target attributes on them would be misleading noise. + if (getLangOpts().CUDA) + if (auto *Guide = dyn_cast<CXXDeductionGuideDecl>(D); + Guide && + (Guide->hasAttr<CUDAHostAttr>() || Guide->hasAttr<CUDADeviceAttr>() || + Guide->hasAttr<CUDAGlobalAttr>())) { + Diag(Guide->getLocation(), diag::err_deduction_guide_target_attr); + Guide->setInvalidDecl(); + } + // Do not permit 'constructor' or 'destructor' attributes on __device__ code. if (getLangOpts().CUDAIsDevice && D->hasAttr<CUDADeviceAttr>() && (D->hasAttr<ConstructorAttr>() || D->hasAttr<DestructorAttr>()) && diff --git a/clang/lib/Sema/SemaTemplateDeductionGuide.cpp b/clang/lib/Sema/SemaTemplateDeductionGuide.cpp index bfb10665c25b1..6949cec0dc141 100644 --- a/clang/lib/Sema/SemaTemplateDeductionGuide.cpp +++ b/clang/lib/Sema/SemaTemplateDeductionGuide.cpp @@ -218,9 +218,31 @@ buildDeductionGuide(Sema &SemaRef, TemplateDecl *OriginalTemplate, TInfo->getTypeLoc().castAs<FunctionProtoTypeLoc>().getParams(); // Build the implicit deduction guide template. + QualType GuideType = TInfo->getType(); + + // In CUDA/HIP mode, avoid creating duplicate implicit deduction guides with + // identical function types. This can happen when there are separate + // __host__ and __device__ constructors with the same signature; each would + // otherwise synthesize its own implicit deduction guide, leading to + // ambiguous CTAD purely due to target attributes. For such cases we keep the + // first guide we created and skip building another one. + if (IsImplicit && Ctor && SemaRef.getLangOpts().CUDA) + for (NamedDecl *Existing : DC->lookup(DeductionGuideName)) { + auto *ExistingFT = dyn_cast<FunctionTemplateDecl>(Existing); + auto *ExistingGuide = + ExistingFT + ? dyn_cast<CXXDeductionGuideDecl>(ExistingFT->getTemplatedDecl()) + : dyn_cast<CXXDeductionGuideDecl>(Existing); + if (!ExistingGuide) + continue; + + if (SemaRef.Context.hasSameType(ExistingGuide->getType(), GuideType)) + return Existing; + } + auto *Guide = CXXDeductionGuideDecl::Create( - SemaRef.Context, DC, LocStart, ES, Name, TInfo->getType(), TInfo, LocEnd, - Ctor, DeductionCandidate::Normal, FunctionTrailingRC); + SemaRef.Context, DC, LocStart, ES, Name, GuideType, TInfo, LocEnd, Ctor, + DeductionCandidate::Normal, FunctionTrailingRC); Guide->setImplicit(IsImplicit); Guide->setParams(Params); diff --git a/clang/test/SemaCUDA/deduction-guide-attrs.cu b/clang/test/SemaCUDA/deduction-guide-attrs.cu new file mode 100644 index 0000000000000..c706a013a5eb8 --- /dev/null +++ b/clang/test/SemaCUDA/deduction-guide-attrs.cu @@ -0,0 +1,24 @@ +// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -fcuda-is-device -verify %s +// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -verify %s + +#include "Inputs/cuda.h" + +template <typename T> +struct S { + __host__ __device__ S(T); +}; + +template <typename T> +S(T) -> S<T>; + +// CUDA/HIP target attributes on deduction guides are rejected. +template <typename U> +__host__ S(U) -> S<U>; // expected-error {{in CUDA/HIP, target attributes are not allowed on deduction guides; deduction guides are implicitly enabled for both host and device}} + +template <typename V> +__device__ S(V) -> S<V>; // expected-error {{in CUDA/HIP, target attributes are not allowed on deduction guides; deduction guides are implicitly enabled for both host and device}} + +template <typename W> +__global__ S(W) -> S<W>; // expected-error {{in CUDA/HIP, target attributes are not allowed on deduction guides; deduction guides are implicitly enabled for both host and device}} diff --git a/clang/test/SemaCUDA/deduction-guide.cu b/clang/test/SemaCUDA/deduction-guide.cu new file mode 100644 index 0000000000000..30e02f7518053 --- /dev/null +++ b/clang/test/SemaCUDA/deduction-guide.cu @@ -0,0 +1,47 @@ +// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -fcuda-is-device -verify=expected,dev %s +// RUN: %clang_cc1 -std=c++17 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -verify %s + +#include "Inputs/cuda.h" + +template <class T> +struct CTADType { // expected-note 2{{candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 3 were provided}} + // expected-note@-1 2{{candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 3 were provided}} + T first; + T second; + + CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}} + __device__ CTADType(T x) : first(x), second(x) {} // expected-note 2{{candidate constructor not viable: requires single argument 'x', but 3 arguments were provided}} + __host__ __device__ CTADType(T x, T y) : first(x), second(y) {} // expected-note 2{{candidate constructor not viable: requires 2 arguments, but 3 were provided}} + CTADType(T x, T y, T z) : first(x), second(z) {} // dev-note {{'CTADType' declared here}} + // expected-note@-1 {{candidate constructor not viable: call to __host__ function from __device__ function}} + // expected-note@-2 {{candidate constructor not viable: call to __host__ function from __global__ function}} +}; + +template <class T> +CTADType(T, T) -> CTADType<T>; + +__host__ __device__ void use_ctad_host_device() { + CTADType ctad_from_two_args(1, 1); + CTADType ctad_from_one_arg(1); + CTADType ctad_from_three_args(1, 2, 3); // dev-error {{reference to __host__ function 'CTADType' in __host__ __device__ function}} +} + +__host__ void use_ctad_host() { + CTADType ctad_from_two_args(1, 1); + CTADType ctad_from_one_arg(1); + CTADType ctad_from_three_args(1, 2, 3); +} + +__device__ void use_ctad_device() { + CTADType ctad_from_two_args(1, 1); + CTADType ctad_from_one_arg(1); + CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}} +} + +__global__ void use_ctad_global() { + CTADType ctad_from_two_args(1, 1); + CTADType ctad_from_one_arg(1); + CTADType<int> ctad_from_three_args(1, 2, 3); // expected-error {{no matching constructor for initialization of 'CTADType<int>'}} +} _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
