[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B approved this pull request.


https://github.com/llvm/llvm-project/pull/102010
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Artem Belevich via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::is_constant: {
+Value *Ptr = II->getArgOperand(0);
+if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+  appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+   Visited);
+}

Artem-B wrote:

> It should never be wrong to include braces.

Having to deal with both Google style, which demands braces everywhere, and LLVM
style, which does not want them, my personal choice is "whatever the style guide
says". It may not always be the perfect choice, but it's not worth anyone's time
to argue over specific instances where the right choice is ambiguous or a matter
of personal preference. I wish we could delegate the braces/no-braces decision to
clang-format, too, but I don't think it currently handles that.

I'd stick with the style guide defaults and either remove the braces or add a
comment to the body. Perhaps making the function name shorter and avoiding the
line wrap would address your readability concerns about braces/no-braces here,
too.
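To make the two alternatives concrete, a minimal self-contained sketch (generic
code, not the hunk under review):

```
#include <vector>

// Option 1: default LLVM style -- no braces around a single statement.
void collectPositive(std::vector<int> &Out, int V) {
  if (V > 0)
    Out.push_back(V);
}

// Option 2: keep the braces, but give the body a comment so the braces have
// something extra to carry.
void collectNegative(std::vector<int> &Out, int V) {
  if (V < 0) {
    // Record negative values as-is; the comment is what justifies the braces.
    Out.push_back(V);
  }
}
```

Which of the two reads better in any given spot is exactly the judgment call the
style guide leaves open.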

https://github.com/llvm/llvm-project/pull/102010
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B approved this pull request.


https://github.com/llvm/llvm-project/pull/102007
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

> From the clang-format documentation: 
> https://clang.llvm.org/docs/ClangFormatStyleOptions.html#removebracesllvm

Thank you for the pointer. Unfortunately, it looks like there's a good reason
it's not on by default:

> Setting this option to true could lead to incorrect code formatting due to 
> clang-format’s lack of complete semantic information.


https://github.com/llvm/llvm-project/pull/102010
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (PR #102462)

2024-08-08 Thread Artem Belevich via llvm-branch-commits


@@ -647,6 +647,14 @@ class LangOptions : public LangOptionsBase {
 return ConvergentFunctions;
   }
 
+  /// Return true if atomicrmw operations targeting allocations in private
+  /// memory are undefined.
+  bool threadPrivateMemoryAtomicsAreUndefined() const {
+// Should be false for OpenMP.
+// TODO: Should this be true for SYCL?
+return OpenCL || CUDA;

Artem-B wrote:

@gonzalobg -- Does NVIDIA define what happens if atomics are used on the local
address space?

@arsenm The atomics/AS relationship seems to be a property of the target, not
the language. I.e., we could potentially have different answers for HIP on
AMDGPU and CUDA on NVPTX, even though both would have `LangOpts.CUDA=true`.
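For concreteness, a hedged CUDA sketch (a hypothetical kernel, not from the
patch) of the case in question -- an atomic RMW whose pointer operand refers to
thread-private (stack) memory:

```
// Illustration only: whether this is defined behavior is exactly the
// language/target question raised above. Under
// threadPrivateMemoryAtomicsAreUndefined() it would be treated as undefined.
__global__ void atomic_on_private(int *out) {
  int local = 0;         // lives in private (AMDGPU) / local (NVPTX) memory
  atomicAdd(&local, 1);  // atomicrmw on a thread-private address
  *out = local;
}
```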

https://github.com/llvm/llvm-project/pull/102462
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (PR #102462)

2024-08-08 Thread Artem Belevich via llvm-branch-commits


@@ -550,6 +551,16 @@ AMDGPUTargetCodeGenInfo::getLLVMSyncScopeID(const LangOptions &LangOpts,
 
 void AMDGPUTargetCodeGenInfo::setTargetAtomicMetadata(
 CodeGenFunction &CGF, llvm::AtomicRMWInst &RMW) const {
+
+  if (RMW.getPointerAddressSpace() == llvm::AMDGPUAS::FLAT_ADDRESS &&
+  CGF.CGM.getLangOpts().threadPrivateMemoryAtomicsAreUndefined()) {
+llvm::MDBuilder MDHelper(CGF.getLLVMContext());
+llvm::MDNode *ASRange = MDHelper.createRange(
+llvm::APInt(32, llvm::AMDGPUAS::PRIVATE_ADDRESS),
+llvm::APInt(32, llvm::AMDGPUAS::PRIVATE_ADDRESS + 1));
+RMW.setMetadata(llvm::LLVMContext::MD_noalias_addrspace, ASRange);

Artem-B wrote:

What's the plan for this metadata? It does not seem to exist or be used for
anything yet. I do not see it in
https://github.com/llvm/llvm-project/blob/6b5308b7924108d63149d7c521f21c5e90da7a09/llvm/include/llvm/IR/FixedMetadataKinds.def#L21

Does this patch miss some changes, or does it depend on other patches?
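(For reference, a sketch of the IR the quoted hunk would produce, assuming
AMDGPU's private address space is 5, so the attached range [5, 6) covers exactly
the private AS:)

```
; Hedged illustration only -- not output produced by the patch.
define i32 @flat_rmw(ptr %p, i32 %v) {
  %old = atomicrmw add ptr %p, i32 %v seq_cst, !noalias.addrspace !0
  ret i32 %old
}
!0 = !{i32 5, i32 6}
```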

https://github.com/llvm/llvm-project/pull/102462
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 127091b - [CUDA] Normalize handling of defaulted dtor.

2021-01-21 Thread Artem Belevich via llvm-branch-commits

Author: Artem Belevich
Date: 2021-01-21T10:48:07-08:00
New Revision: 127091bfd5edf10495fee4724fd21c666e5d79c1

URL: 
https://github.com/llvm/llvm-project/commit/127091bfd5edf10495fee4724fd21c666e5d79c1
DIFF: 
https://github.com/llvm/llvm-project/commit/127091bfd5edf10495fee4724fd21c666e5d79c1.diff

LOG: [CUDA] Normalize handling of defaulted dtor.

The defaulted destructor was treated inconsistently, compared to other
compiler-generated functions.

When Sema::IdentifyCUDATarget() was called on a just-created dtor which didn't
have the implicit __host__ __device__ attributes applied yet, it would treat it
as a host function. That happened to (sometimes) hide the error when the dtor
referred to host-only functions.

Even when we had identified the defaulted dtor as an HD function, we still
treated it inconsistently during the selection of usual deallocators, where we
did not allow referring to wrong-side functions, while that is allowed for other
HD functions.

This change brings handling of defaulted dtors in line with other HD functions.

Differential Revision: https://reviews.llvm.org/D94732
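A hedged CUDA sketch (hypothetical types, not part of this patch) of the first
inconsistency described above -- a defaulted dtor that ends up referring to a
host-only function:

```
extern "C" __host__ void host_only_cleanup();

struct Member {
  ~Member() { host_only_cleanup(); }  // plain, host-only destructor
};

struct S {
  Member M;
  ~S() = default;  // compiler-generated body; now identified as __host__ __device__
};

__global__ void kernel() {
  S Local;  // device-side destruction of S would now be expected to diagnose
            // the reference to the host-only Member::~Member instead of hiding it
}
```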

Added: 


Modified: 
clang/lib/Sema/SemaCUDA.cpp
clang/lib/Sema/SemaExprCXX.cpp
clang/test/CodeGenCUDA/usual-deallocators.cu
clang/test/SemaCUDA/usual-deallocators.cu

Removed: 




diff  --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index 0f06adf38f7a..ee91eb4c5deb 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -123,7 +123,8 @@ Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D,
 return CFT_Device;
   } else if (hasAttr(D, IgnoreImplicitHDAttr)) {
 return CFT_Host;
-  } else if (D->isImplicit() && !IgnoreImplicitHDAttr) {
+  } else if ((D->isImplicit() || !D->isUserProvided()) &&
+ !IgnoreImplicitHDAttr) {
 // Some implicit declarations (like intrinsic functions) are not marked.
 // Set the most lenient target on them for maximal flexibility.
 return CFT_HostDevice;

diff  --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index 1ee52107c3da..d91db60f17a0 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -1527,9 +1527,24 @@ Sema::BuildCXXTypeConstructExpr(TypeSourceInfo *TInfo,
 bool Sema::isUsualDeallocationFunction(const CXXMethodDecl *Method) {
   // [CUDA] Ignore this function, if we can't call it.
  const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext);
-  if (getLangOpts().CUDA &&
-  IdentifyCUDAPreference(Caller, Method) <= CFP_WrongSide)
-return false;
+  if (getLangOpts().CUDA) {
+    auto CallPreference = IdentifyCUDAPreference(Caller, Method);
+    // If it's not callable at all, it's not the right function.
+    if (CallPreference < CFP_WrongSide)
+      return false;
+    if (CallPreference == CFP_WrongSide) {
+      // Maybe. We have to check if there are better alternatives.
+      DeclContext::lookup_result R =
+          Method->getDeclContext()->lookup(Method->getDeclName());
+      for (const auto *D : R) {
+        if (const auto *FD = dyn_cast<FunctionDecl>(D)) {
+          if (IdentifyCUDAPreference(Caller, FD) > CFP_WrongSide)
+            return false;
+        }
+      }
+      // We've found no better variants.
+    }
+  }
 
  SmallVector<const FunctionDecl*, 4> PreventedBy;
   bool Result = Method->isUsualDeallocationFunction(PreventedBy);

diff  --git a/clang/test/CodeGenCUDA/usual-deallocators.cu 
b/clang/test/CodeGenCUDA/usual-deallocators.cu
index 7e7752497f34..6f4cc267a23f 100644
--- a/clang/test/CodeGenCUDA/usual-deallocators.cu
+++ b/clang/test/CodeGenCUDA/usual-deallocators.cu
@@ -12,6 +12,19 @@ extern "C" __host__ void host_fn();
 extern "C" __device__ void dev_fn();
 extern "C" __host__ __device__ void hd_fn();
 
+// Destructors are handled a bit differently, compared to regular functions.
+// Make sure we do trigger kernel generation on the GPU side even if it's only
+// referenced by the destructor.
+template <typename T> __global__ void f(T) {}
+template <typename T> struct A {
+  ~A() { f<<<1, 1>>>(T()); }
+};
+
+// HOST-LABEL: @a
+A<int> a;
+// HOST-LABEL: define linkonce_odr void @_ZN1AIiED1Ev
+// search further down for the device-side checks for @_Z1fIiEvT_
+
 struct H1D1 {
   __host__ void operator delete(void *) { host_fn(); };
   __device__ void operator delete(void *) { dev_fn(); };
@@ -95,6 +108,9 @@ __host__ __device__ void tests_hd(void *t) {
   test_hd(t);
 }
 
+// Make sure that we've generated the kernel used by A::~A.
+// DEVICE-LABEL: define dso_local void @_Z1fIiEvT_
+
 // Make sure we've picked deallocator for the correct side of compilation.
 
 // COMMON-LABEL: define  linkonce_odr void @_ZN4H1D1dlEPv(i8* %0)
@@ -131,3 +147,5 @@ __host__ __device__ void tests_hd(void *t) {
 // COMMON-LABEL: define  linkonce_odr void @_ZN8H1H2D1D2dlEPv(i8* %0)
 // DEVICE: call void @dev_fn()
 // HOST: call void @host_fn()
+
+// DEVICE: !0 = !{void (i32)* @_Z1fIiEvT_, !"kernel", i32 1}


[llvm-branch-commits] [clang] 0936655 - [CUDA] Do not diagnose host/device variable access in dependent types.

2020-12-14 Thread Artem Belevich via llvm-branch-commits

Author: Artem Belevich
Date: 2020-12-14T11:53:18-08:00
New Revision: 0936655bac78f6e9cb84dc3feb30c32012100839

URL: 
https://github.com/llvm/llvm-project/commit/0936655bac78f6e9cb84dc3feb30c32012100839
DIFF: 
https://github.com/llvm/llvm-project/commit/0936655bac78f6e9cb84dc3feb30c32012100839.diff

LOG: [CUDA] Do not diagnose host/device variable access in dependent types.

`isCUDADeviceBuiltinSurfaceType()`/`isCUDADeviceBuiltinTextureType()` do not
work on dependent types as they rely on specific type attributes.

Differential Revision: https://reviews.llvm.org/D92893

Added: 


Modified: 
clang/include/clang/Basic/Attr.td
clang/test/SemaCUDA/device-use-host-var.cu

Removed: 




diff  --git a/clang/include/clang/Basic/Attr.td 
b/clang/include/clang/Basic/Attr.td
index 51f654fc7613..79902c8f5b89 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1079,6 +1079,7 @@ def CUDADeviceBuiltinSurfaceType : InheritableAttr {
   let LangOpts = [CUDA];
   let Subjects = SubjectList<[CXXRecord]>;
   let Documentation = [CUDADeviceBuiltinSurfaceTypeDocs];
+  let MeaningfulToClassTemplateDefinition = 1;
 }
 
 def CUDADeviceBuiltinTextureType : InheritableAttr {
@@ -1087,6 +1088,7 @@ def CUDADeviceBuiltinTextureType : InheritableAttr {
   let LangOpts = [CUDA];
   let Subjects = SubjectList<[CXXRecord]>;
   let Documentation = [CUDADeviceBuiltinTextureTypeDocs];
+  let MeaningfulToClassTemplateDefinition = 1;
 }
 
 def CUDAGlobal : InheritableAttr {

diff  --git a/clang/test/SemaCUDA/device-use-host-var.cu 
b/clang/test/SemaCUDA/device-use-host-var.cu
index cf5514610a42..c8ef7dbbb18d 100644
--- a/clang/test/SemaCUDA/device-use-host-var.cu
+++ b/clang/test/SemaCUDA/device-use-host-var.cu
@@ -158,3 +158,23 @@ void dev_lambda_capture_by_copy(int *out) {
   });
 }
 
+// Texture references are special. As far as C++ is concerned they are host
+// variables that are referenced from device code. However, they are handled
+// very differently by the compiler under the hood and such references are
+// allowed. Compiler should produce no warning here, but it should diagnose the
+// same case without the device_builtin_texture_type attribute.
+template <typename T>
+struct __attribute__((device_builtin_texture_type)) texture {
+  static texture ref;
+  __device__ int c() {
+    auto &x = ref;
+  }
+};
+
+template <typename T>
+struct not_a_texture {
+  static not_a_texture ref;
+  __device__ int c() {
+    auto &x = ref; // dev-error {{reference to __host__ variable 'ref' in __device__ function}}
+  }
+};



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 4326792 - [CUDA] Another attempt to fix early inclusion of <new> from libstdc++

2020-12-04 Thread Artem Belevich via llvm-branch-commits

Author: Artem Belevich
Date: 2020-12-04T12:03:35-08:00
New Revision: 43267929423bf768bbbcc65e47a07e37af7f4e22

URL: 
https://github.com/llvm/llvm-project/commit/43267929423bf768bbbcc65e47a07e37af7f4e22
DIFF: 
https://github.com/llvm/llvm-project/commit/43267929423bf768bbbcc65e47a07e37af7f4e22.diff

LOG: [CUDA] Another attempt to fix early inclusion of <new> from libstdc++

Previous patch (9a465057a64dba) did not fix the problem.
https://bugs.llvm.org/show_bug.cgi?id=48228

If <new> is included too early, before CUDA-specific defines are available,
just include-next the standard <new> and undo the include guard. CUDA-specific
variants of operator new/delete will be declared if/when <new> is used from the
CUDA source itself, when all CUDA-related macros are available.

Differential Revision: https://reviews.llvm.org/D91807
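Condensed, the approach reads roughly like this (a sketch of the pattern; the
real wrapper in the diff below also handles CUDA_NOEXCEPT and the full set of
operator declarations):

```
#ifndef __CLANG_CUDA_WRAPPERS_NEW
#define __CLANG_CUDA_WRAPPERS_NEW

#include_next <new>

#if !defined(__device__)
// Included too early from the standard library; undo the include guard so a
// later, properly-set-up #include can try again.
#undef __CLANG_CUDA_WRAPPERS_NEW
#else
// ... CUDA-specific operator new/delete overloads are declared here ...
#endif

#endif // include guard
```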

Added: 


Modified: 
clang/lib/Headers/cuda_wrappers/new

Removed: 




diff  --git a/clang/lib/Headers/cuda_wrappers/new 
b/clang/lib/Headers/cuda_wrappers/new
index 47690f1152fe..7f255314056a 100644
--- a/clang/lib/Headers/cuda_wrappers/new
+++ b/clang/lib/Headers/cuda_wrappers/new
@@ -26,6 +26,13 @@
 
 #include_next <new>
 
+#if !defined(__device__)
+// The header has been included too early from the standard C++ library
+// and CUDA-specific macros are not available yet.
+// Undo the include guard and try again later.
+#undef __CLANG_CUDA_WRAPPERS_NEW
+#else
+
 #pragma push_macro("CUDA_NOEXCEPT")
 #if __cplusplus >= 201103L
 #define CUDA_NOEXCEPT noexcept
@@ -33,76 +40,67 @@
 #define CUDA_NOEXCEPT
 #endif
 
-#pragma push_macro("__DEVICE__")
-#if defined __device__
-#define __DEVICE__ __device__
-#else
-// <new> has been included too early from the standard libc++ headers and the
-// standard CUDA macros are not available yet. We have to define our own.
-#define __DEVICE__ __attribute__((device))
-#endif
-
 // Device overrides for non-placement new and delete.
-__DEVICE__ inline void *operator new(__SIZE_TYPE__ size) {
+__device__ inline void *operator new(__SIZE_TYPE__ size) {
   if (size == 0) {
 size = 1;
   }
   return ::malloc(size);
 }
-__DEVICE__ inline void *operator new(__SIZE_TYPE__ size,
+__device__ inline void *operator new(__SIZE_TYPE__ size,
  const std::nothrow_t &) CUDA_NOEXCEPT {
   return ::operator new(size);
 }
 
-__DEVICE__ inline void *operator new[](__SIZE_TYPE__ size) {
+__device__ inline void *operator new[](__SIZE_TYPE__ size) {
   return ::operator new(size);
 }
-__DEVICE__ inline void *operator new[](__SIZE_TYPE__ size,
+__device__ inline void *operator new[](__SIZE_TYPE__ size,
const std::nothrow_t &) {
   return ::operator new(size);
 }
 
-__DEVICE__ inline void operator delete(void* ptr) CUDA_NOEXCEPT {
+__device__ inline void operator delete(void* ptr) CUDA_NOEXCEPT {
   if (ptr) {
 ::free(ptr);
   }
 }
-__DEVICE__ inline void operator delete(void *ptr,
+__device__ inline void operator delete(void *ptr,
const std::nothrow_t &) CUDA_NOEXCEPT {
   ::operator delete(ptr);
 }
 
-__DEVICE__ inline void operator delete[](void* ptr) CUDA_NOEXCEPT {
+__device__ inline void operator delete[](void* ptr) CUDA_NOEXCEPT {
   ::operator delete(ptr);
 }
-__DEVICE__ inline void operator delete[](void *ptr,
+__device__ inline void operator delete[](void *ptr,
  const std::nothrow_t &) CUDA_NOEXCEPT 
{
   ::operator delete(ptr);
 }
 
 // Sized delete, C++14 only.
 #if __cplusplus >= 201402L
-__DEVICE__ inline void operator delete(void *ptr,
+__device__ inline void operator delete(void *ptr,
__SIZE_TYPE__ size) CUDA_NOEXCEPT {
   ::operator delete(ptr);
 }
-__DEVICE__ inline void operator delete[](void *ptr,
+__device__ inline void operator delete[](void *ptr,
  __SIZE_TYPE__ size) CUDA_NOEXCEPT {
   ::operator delete(ptr);
 }
 #endif
 
 // Device overrides for placement new and delete.
-__DEVICE__ inline void *operator new(__SIZE_TYPE__, void *__ptr) CUDA_NOEXCEPT 
{
+__device__ inline void *operator new(__SIZE_TYPE__, void *__ptr) CUDA_NOEXCEPT 
{
   return __ptr;
 }
-__DEVICE__ inline void *operator new[](__SIZE_TYPE__, void *__ptr) 
CUDA_NOEXCEPT {
+__device__ inline void *operator new[](__SIZE_TYPE__, void *__ptr) 
CUDA_NOEXCEPT {
   return __ptr;
 }
-__DEVICE__ inline void operator delete(void *, void *) CUDA_NOEXCEPT {}
-__DEVICE__ inline void operator delete[](void *, void *) CUDA_NOEXCEPT {}
+__device__ inline void operator delete(void *, void *) CUDA_NOEXCEPT {}
+__device__ inline void operator delete[](void *, void *) CUDA_NOEXCEPT {}
 
-#pragma pop_macro("__DEVICE__")
 #pragma pop_macro("CUDA_NOEXCEPT")
 
+#endif // __device__
 #endif // include guard



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [clang] 016e4eb - [DWARF] Allow toolchain to adjust specified DWARF version.

2020-12-09 Thread Artem Belevich via llvm-branch-commits

Author: Artem Belevich
Date: 2020-12-09T16:34:34-08:00
New Revision: 016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd

URL: 
https://github.com/llvm/llvm-project/commit/016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd
DIFF: 
https://github.com/llvm/llvm-project/commit/016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd.diff

LOG: [DWARF] Allow toolchain to adjust specified DWARF version.

This is needed for CUDA compilation where NVPTX back-end only supports DWARF2,
but host compilation should be allowed to use newer DWARF versions.

Differential Revision: https://reviews.llvm.org/D92617
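A hedged, self-contained model (not the actual clang driver classes) of the
mechanism this adds: toolchains report an upper bound on the DWARF version they
can handle, and the driver clamps the user's request to it.

```
#include <algorithm>
#include <climits>

struct ToolChainModel {
  virtual ~ToolChainModel() = default;
  // Default: no limit, mirroring the new ToolChain::getMaxDwarfVersion().
  virtual unsigned getMaxDwarfVersion() const { return UINT_MAX; }
};

// Assumption from the log above: the NVPTX/CUDA toolchain would clamp to DWARF2.
struct NVPTXModel : ToolChainModel {
  unsigned getMaxDwarfVersion() const override { return 2; }
};

unsigned effectiveDwarfVersion(const ToolChainModel &TC, unsigned Requested) {
  return std::min(Requested, TC.getMaxDwarfVersion());
}

// e.g. effectiveDwarfVersion(NVPTXModel{}, 5) == 2, while a host toolchain keeps 5.
```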

Added: 
clang/test/Driver/cuda-omp-unsupported-debug-options.cu
clang/test/Driver/dwarf-target-version-clamp.cu

Modified: 
clang/include/clang/Basic/DiagnosticDriverKinds.td
clang/include/clang/Driver/ToolChain.h
clang/lib/Driver/ToolChains/Clang.cpp
clang/lib/Driver/ToolChains/Cuda.h

Removed: 
clang/test/Driver/cuda-unsupported-debug-options.cu
clang/test/Driver/openmp-unsupported-debug-options.c



diff  --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td 
b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 8fd7a805589d..8ca176d3bb43 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -292,6 +292,9 @@ def warn_drv_unsupported_opt_for_target : Warning<
 def warn_drv_unsupported_debug_info_opt_for_target : Warning<
   "debug information option '%0' is not supported for target '%1'">,
   InGroup;
+def warn_drv_dwarf_version_limited_by_target : Warning<
+  "debug information option '%0' is not supported. It needs DWARF-%2 but 
target '%1' only provides DWARF-%3.">,
+  InGroup;
 def warn_c_kext : Warning<
   "ignoring -fapple-kext which is valid for C++ and Objective-C++ only">;
 def warn_ignoring_fdiscard_for_bitcode : Warning<

diff  --git a/clang/include/clang/Driver/ToolChain.h 
b/clang/include/clang/Driver/ToolChain.h
index 7aa8ba7b1da9..28c37a44e1eb 100644
--- a/clang/include/clang/Driver/ToolChain.h
+++ b/clang/include/clang/Driver/ToolChain.h
@@ -27,6 +27,7 @@
 #include "llvm/Support/VersionTuple.h"
 #include "llvm/Target/TargetOptions.h"
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -489,6 +490,11 @@ class ToolChain {
   // to the contrary.
   virtual unsigned GetDefaultDwarfVersion() const { return 4; }
 
+  // Some toolchains may have different restrictions on the DWARF version and
+  // may need to adjust it. E.g. NVPTX may need to enforce DWARF2 even when host
+  // compilation uses DWARF5.
+  virtual unsigned getMaxDwarfVersion() const { return UINT_MAX; }
+
   // True if the driver should assume "-fstandalone-debug"
   // in the absence of an option specifying otherwise,
   // provided that debugging was requested in the first place.

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 6c78a5d9555c..b092252791ff 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -3821,26 +3821,33 @@ static void RenderDebugOptions(const ToolChain &TC, 
const Driver &D,
 }
   }
 
-  unsigned DWARFVersion = 0;
+  unsigned RequestedDWARFVersion = 0; // DWARF version requested by the user
+  unsigned EffectiveDWARFVersion = 0; // DWARF version TC can generate. It may
+  // be lower than what the user wanted.
   unsigned DefaultDWARFVersion = ParseDebugDefaultVersion(TC, Args);
   if (EmitDwarf) {
 // Start with the platform default DWARF version
-DWARFVersion = TC.GetDefaultDwarfVersion();
-assert(DWARFVersion && "toolchain default DWARF version must be nonzero");
+RequestedDWARFVersion = TC.GetDefaultDwarfVersion();
+assert(RequestedDWARFVersion &&
+   "toolchain default DWARF version must be nonzero");
 
 // If the user specified a default DWARF version, that takes precedence
 // over the platform default.
 if (DefaultDWARFVersion)
-  DWARFVersion = DefaultDWARFVersion;
+  RequestedDWARFVersion = DefaultDWARFVersion;
 
 // Override with a user-specified DWARF version
 if (GDwarfN)
   if (auto ExplicitVersion = DwarfVersionNum(GDwarfN->getSpelling()))
-DWARFVersion = ExplicitVersion;
+RequestedDWARFVersion = ExplicitVersion;
+// Clamp effective DWARF version to the max supported by the toolchain.
+EffectiveDWARFVersion =
+std::min(RequestedDWARFVersion, TC.getMaxDwarfVersion());
   }
 
   // -gline-directives-only supported only for the DWARF debug info.
-  if (DWARFVersion == 0 && DebugInfoKind == 
codegenoptions::DebugDirectivesOnly)
+  if (RequestedDWARFVersion == 0 &&
+  DebugInfoKind == codegenoptions::DebugDirectivesOnly)
 DebugInfoKind = codegenoptions::NoDebugInfo;
 
   // We ignore flag -gstrict-dwarf for now.
 

[llvm-branch-commits] [clang] 0df1362 - [CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>).

2020-01-24 Thread Artem Belevich via llvm-branch-commits

Author: Artem Belevich
Date: 2020-01-24T15:07:22-08:00
New Revision: 0df13627c6a4006de39e5f01d81a338793b0e82b

URL: 
https://github.com/llvm/llvm-project/commit/0df13627c6a4006de39e5f01d81a338793b0e82b
DIFF: 
https://github.com/llvm/llvm-project/commit/0df13627c6a4006de39e5f01d81a338793b0e82b.diff

LOG: [CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>).

Wrong argument order resulted in broken shfl ops for 64-bit types.

(cherry picked from commit cc14de88da27a8178976972bdc8211c31f7ca9ae)
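A minimal host-side illustration (not the device macro itself) of the bug class
being fixed: memcpy's signature is memcpy(dst, src, size), so splitting a 64-bit
value into the two-int helper must copy into the temporary, not out of it.

```
#include <cstring>

struct Bits { int a, b; };
static_assert(sizeof(Bits) == sizeof(long long), "two ints cover a 64-bit value");

long long shuffle64(long long val) {
  Bits tmp;
  std::memcpy(&tmp, &val, sizeof(val));  // fixed order: copy *into* the temporary
  // (the real macro shuffles tmp.a and tmp.b across lanes here)
  long long ret;
  std::memcpy(&ret, &tmp, sizeof(ret));
  return ret;
}
```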

Added: 


Modified: 
clang/lib/Headers/__clang_cuda_intrinsics.h

Removed: 




diff  --git a/clang/lib/Headers/__clang_cuda_intrinsics.h 
b/clang/lib/Headers/__clang_cuda_intrinsics.h
index b67461a146fc..c7bff6a9d8fe 100644
--- a/clang/lib/Headers/__clang_cuda_intrinsics.h
+++ b/clang/lib/Headers/__clang_cuda_intrinsics.h
@@ -45,7 +45,7 @@
 _Static_assert(sizeof(__val) == sizeof(__Bits));                  \
 _Static_assert(sizeof(__Bits) == 2 * sizeof(int));                \
 __Bits __tmp;                                                     \
-memcpy(&__val, &__tmp, sizeof(__val));                            \
+memcpy(&__tmp, &__val, sizeof(__val));                            \
 __tmp.__a = ::__FnName(__tmp.__a, __offset, __width);             \
 __tmp.__b = ::__FnName(__tmp.__b, __offset, __width);             \
 long long __ret;                                                  \
@@ -129,7 +129,7 @@ __MAKE_SHUFFLES(__shfl_xor, __nvvm_shfl_bfly_i32, __nvvm_shfl_bfly_f32, 0x1f,
 _Static_assert(sizeof(__val) == sizeof(__Bits));                  \
 _Static_assert(sizeof(__Bits) == 2 * sizeof(int));                \
 __Bits __tmp;                                                     \
-memcpy(&__val, &__tmp, sizeof(__val));                            \
+memcpy(&__tmp, &__val, sizeof(__val));                            \
 __tmp.__a = ::__FnName(__mask, __tmp.__a, __offset, __width);     \
 __tmp.__b = ::__FnName(__mask, __tmp.__b, __offset, __width);     \
 long long __ret;                                                  \



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-05 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B edited 
https://github.com/llvm/llvm-project/pull/115081
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-12 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B updated 
https://github.com/llvm/llvm-project/pull/115081

>From ed0fe30e7d4da94b13018e563971524e013c512f Mon Sep 17 00:00:00 2001
From: Manasij Mukherjee 
Date: Fri, 4 Oct 2024 15:15:30 -0600
Subject: [PATCH] [NVPTX] Promote v2i8 to v2i16 (#89)

Promote v2i8 to v2i16, fixes a crash.
Re-enable a test in NVPTX/vector-returns.ll

Partial cherry-pick of fda2fea w/o the test which does not exist in release/19.x
https://github.com/llvm/llvm-project/issues/104864
---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6975412ce5d35b..b2153a7afe7365 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -229,6 +229,10 @@ static void ComputePTXValueVTs(const TargetLowering &TLI, 
const DataLayout &DL,
 // v*i8 are formally lowered as v4i8
 EltVT = MVT::v4i8;
 NumElts = (NumElts + 3) / 4;
+  } else if (EltVT.getSimpleVT() == MVT::i8 && NumElts == 2) {
+// v2i8 is promoted to v2i16
+NumElts = 1;
+EltVT = MVT::v2i16;
   }
   for (unsigned j = 0; j != NumElts; ++j) {
 ValueVTs.push_back(EltVT);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-12 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

The CI check was unhappy because the PR was based on the tree before 19.1.4.
I've rebased it on top of the most recent commit.

https://github.com/llvm/llvm-project/pull/115081
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-25 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

Thank you for merging it. I do not think the fix is interesting enough for 
that. 

https://github.com/llvm/llvm-project/pull/115081
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-19 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

Done.

https://github.com/llvm/llvm-project/pull/115081
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-19 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B updated 
https://github.com/llvm/llvm-project/pull/115081

>From e0a9e99b9efbeae7eb975cc4ebaf1805566195f6 Mon Sep 17 00:00:00 2001
From: Manasij Mukherjee 
Date: Fri, 4 Oct 2024 15:15:30 -0600
Subject: [PATCH] [NVPTX] Promote v2i8 to v2i16 (#89)

Promote v2i8 to v2i16, fixes a crash.
Re-enable a test in NVPTX/vector-returns.ll

Partial cherry-pick of fda2fea w/o the test which does not exist in release/19.x
https://github.com/llvm/llvm-project/issues/104864
---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6975412ce5d35b..b2153a7afe7365 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -229,6 +229,10 @@ static void ComputePTXValueVTs(const TargetLowering &TLI, 
const DataLayout &DL,
 // v*i8 are formally lowered as v4i8
 EltVT = MVT::v4i8;
 NumElts = (NumElts + 3) / 4;
+  } else if (EltVT.getSimpleVT() == MVT::i8 && NumElts == 2) {
+// v2i8 is promoted to v2i16
+NumElts = 1;
+EltVT = MVT::v2i16;
   }
   for (unsigned j = 0; j != NumElts; ++j) {
 ValueVTs.push_back(EltVT);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-19 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B updated 
https://github.com/llvm/llvm-project/pull/115081

>From 02930b87faeb490505b22f588757a18744248b6f Mon Sep 17 00:00:00 2001
From: Manasij Mukherjee 
Date: Fri, 4 Oct 2024 15:15:30 -0600
Subject: [PATCH] [NVPTX] Promote v2i8 to v2i16 (#89)

Promote v2i8 to v2i16, fixes a crash.
Re-enable a test in NVPTX/vector-returns.ll

Partial cherry-pick of fda2fea w/o the test which does not exist in release/19.x
https://github.com/llvm/llvm-project/issues/104864
---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6975412ce5d35b..b2153a7afe7365 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -229,6 +229,10 @@ static void ComputePTXValueVTs(const TargetLowering &TLI, 
const DataLayout &DL,
 // v*i8 are formally lowered as v4i8
 EltVT = MVT::v4i8;
 NumElts = (NumElts + 3) / 4;
+  } else if (EltVT.getSimpleVT() == MVT::i8 && NumElts == 2) {
+// v2i8 is promoted to v2i16
+NumElts = 1;
+EltVT = MVT::v2i16;
   }
   for (unsigned j = 0; j != NumElts; ++j) {
 ValueVTs.push_back(EltVT);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-15 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B updated 
https://github.com/llvm/llvm-project/pull/115081

>From ed0fe30e7d4da94b13018e563971524e013c512f Mon Sep 17 00:00:00 2001
From: Manasij Mukherjee 
Date: Fri, 4 Oct 2024 15:15:30 -0600
Subject: [PATCH] [NVPTX] Promote v2i8 to v2i16 (#89)

Promote v2i8 to v2i16, fixes a crash.
Re-enable a test in NVPTX/vector-returns.ll

Partial cherry-pick of fda2fea w/o the test which does not exist in release/19.x
https://github.com/llvm/llvm-project/issues/104864
---
 llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp | 4 
 1 file changed, 4 insertions(+)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6975412ce5d35b..b2153a7afe7365 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -229,6 +229,10 @@ static void ComputePTXValueVTs(const TargetLowering &TLI, 
const DataLayout &DL,
 // v*i8 are formally lowered as v4i8
 EltVT = MVT::v4i8;
 NumElts = (NumElts + 3) / 4;
+  } else if (EltVT.getSimpleVT() == MVT::i8 && NumElts == 2) {
+// v2i8 is promoted to v2i16
+NumElts = 1;
+EltVT = MVT::v2i16;
   }
   for (unsigned j = 0; j != NumElts; ++j) {
 ValueVTs.push_back(EltVT);

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [NVPTX] Promote v2i8 to v2i16 (#111189) (PR #115081)

2024-11-15 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

That would be me -- I did the review of the original patch and just doing the 
legwork here to cherry-pick it into the 19.x branch.

If you need somebody else, I'd ask @AlexMaclean 

https://github.com/llvm/llvm-project/pull/115081
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 02930b8 - [NVPTX] Promote v2i8 to v2i16 (#111189)

2024-11-25 Thread Artem Belevich via llvm-branch-commits

Author: Manasij Mukherjee
Date: 2024-11-19T14:15:13-08:00
New Revision: 02930b87faeb490505b22f588757a18744248b6f

URL: 
https://github.com/llvm/llvm-project/commit/02930b87faeb490505b22f588757a18744248b6f
DIFF: 
https://github.com/llvm/llvm-project/commit/02930b87faeb490505b22f588757a18744248b6f.diff

LOG: [NVPTX] Promote v2i8 to v2i16 (#89)

Promote v2i8 to v2i16, fixes a crash.
Re-enable a test in NVPTX/vector-returns.ll

Partial cherry-pick of fda2fea w/o the test which does not exist in release/19.x
https://github.com/llvm/llvm-project/issues/104864

Added: 


Modified: 
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp

Removed: 




diff  --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index 6975412ce5d35b..b2153a7afe7365 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -229,6 +229,10 @@ static void ComputePTXValueVTs(const TargetLowering &TLI, 
const DataLayout &DL,
 // v*i8 are formally lowered as v4i8
 EltVT = MVT::v4i8;
 NumElts = (NumElts + 3) / 4;
+  } else if (EltVT.getSimpleVT() == MVT::i8 && NumElts == 2) {
+// v2i8 is promoted to v2i16
+NumElts = 1;
+EltVT = MVT::v2i16;
   }
   for (unsigned j = 0; j != NumElts; ++j) {
 ValueVTs.push_back(EltVT);



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [CUDA] Add support for sm101 and sm120 target architectures (#127187) (PR #127918)

2025-02-21 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B approved this pull request.

I was the one proposing to merge this change, so I assumed that it's the 
release maintainers who'd need to stamp it.

I am all for merging it.

https://github.com/llvm/llvm-project/pull/127918
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [CUDA] Add support for sm101 and sm120 target architectures (#127187) (PR #127918)

2025-02-21 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

> patch is first reviewed by someone familiar with the code. 

That would be me, as I am the maintainer of CUDA code and had reviewed the 
original PR.

> They approve the patch, and describe how the fix meets the release branch 
> patch requirements 
> (https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules).

This patch fits item #3 on the rule list "or completion of features that were 
started before the branch was created. "

These changes allow clang users to compile CUDA code with the just-released
CUDA 12.8, which adds these new GPU variants.


https://github.com/llvm/llvm-project/pull/127918
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [CUDA] Add support for sm101 and sm120 target architectures (#127187) (PR #127918)

2025-02-21 Thread Artem Belevich via llvm-branch-commits

Artem-B wrote:

```
# CUDA
- Clang now supports CUDA compilation with CUDA SDK up to v12.8
- Clang can now target sm_100, sm_101, and sm_120 GPUs (Blackwell)
```

https://github.com/llvm/llvm-project/pull/127918
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] clang: Read the address space from the ABIArgInfo (PR #138865)

2025-05-07 Thread Artem Belevich via llvm-branch-commits


@@ -5384,16 +5384,16 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
 if (!NeedCopy) {
   // Skip the extra memcpy call.
   llvm::Value *V = getAsNaturalPointerTo(Addr, I->Ty);
-  auto *T = llvm::PointerType::get(
-  CGM.getLLVMContext(), CGM.getDataLayout().getAllocaAddrSpace());
+  auto *T = llvm::PointerType::get(CGM.getLLVMContext(),
+   ArgInfo.getIndirectAddrSpace());
 
   // FIXME: This should not depend on the language address spaces, and
   // only the contextual values. If the address space mismatches, see if
   // we can look through a cast to a compatible address space value,
   // otherwise emit a copy.

Artem-B wrote:

Thank you for the details. It makes sense now. 

https://github.com/llvm/llvm-project/pull/138865
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits