[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-02 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I don't think we should rely on these on the host at all, the addition was a > design mistake initially, we probably should not double down on it. I agree with it in principle. However, removing things that already exist should be done with consideration for the existing user

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-02 Thread Artem Belevich via cfe-commits
Artem-B wrote: @ritter-x2a That's an outline of a strawman plan in case one does have a nontrivial amount of existing code that depends on this macro, and assuming that we still want to have a host-side macro for the wavefront size. If the end goal is not to have the host-side macro at all, then

[clang] [Driver] Pass `--no-cuda-version-check` to test (PR #117415)

2024-11-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: I am surprised that this is needed. I suspect clang picks the default CUDA version on your machine. While `--no-cuda-version-check` will make the test work, it will still pick a CUDA installation outside of the source tree, and then suppress the error. A better way to fix this ma

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-20 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/117074 We do not have support for the threadsafe statics on the GPU side. However, we do sometimes end up with empty local static initializers, and those happen to trigger calls to `__cxa_guard*`, which breaks compila
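For readers unfamiliar with the mechanism mentioned above, here is a minimal host-side sketch of a function-local static whose initializer has to run at run time. It illustrates the general guarded-initialization pattern, not the empty-initializer case from the PR; the names `runtime_seed` and `next_id` are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical helper: reading a volatile keeps the result from being a
// constant expression, so the initializer below must run at run time.
inline int runtime_seed() {
  volatile int seed = 42;
  return seed;
}

// `counter` is dynamically initialized on the first call. With thread-safe
// statics enabled (the usual host default), compilers that follow the
// Itanium C++ ABI typically wrap this initialization in
// __cxa_guard_acquire/__cxa_guard_release calls. With
// -fno-threadsafe-statics the guard becomes a plain, unsynchronized check.
inline int next_id() {
  static int counter = runtime_seed();
  assert(counter >= 42);  // the initializer ran exactly once
  return counter++;
}
```

Compiling this for the host with and without `-fno-threadsafe-statics` and comparing the emitted assembly is one way to see whether the guard calls are present.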

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-20 Thread Artem Belevich via cfe-commits
Artem-B wrote: @yxsamliu -- should I add it for HIP, too? https://github.com/llvm/llvm-project/pull/117074 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Remove device override for operator new when the C++ standard >= 26 (PR #114056)

2024-11-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/114056

[clang] Remove device override for operator new when the C++ standard >= 26 (PR #114056)

2024-11-15 Thread Artem Belevich via cfe-commits
Artem-B wrote: The pull request has already been approved, and nothing in the patch seems to have changed since then. Are you asking for help landing the patch? https://github.com/llvm/llvm-project/pull/114056

[clang] [Driver] Pass `--no-cuda-version-check` to test (PR #117415)

2024-11-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/117415

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-07 Thread Artem Belevich via cfe-commits
@@ -1595,8 +1606,21 @@ static bool IsOverloadOrOverrideImpl(Sema &SemaRef, FunctionDecl *New, // Allow overloading of functions with same signature and different CUDA // target attributes. -if (NewTarget != OldTarget) +if (NewTarget != OldTarg

[clang] [llvm] [mlir] [NVPTX] Switch front-ends and tests to ptx_kernel cc (PR #120806)

2025-01-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: > A 2% improvement is an excellent result! LLVM caches these attributes these days. Before we implemented that, nvvm metadata lookups did take a lot of time. That said, CC and attributes are a better way to represent the info, and getting rid of the metadata is worthwhile ev

[clang] [llvm] [mlir] [NVPTX] Switch front-ends and tests to ptx_kernel cc (PR #120806)

2025-01-07 Thread Artem Belevich via cfe-commits
@@ -10,8 +10,14 @@ // CHECK-NEXT:[[TMP0:%.*]] = load ptr, ptr [[RET_ADDR]], align 8 // CHECK-NEXT:store i32 1, ptr [[TMP0]], align 4 // CHECK-NEXT:ret void +// __attribute__((nvptx_kernel)) void foo(int *ret) { *ret = 1; } -// CHECK: !0 = !{ptr @foo, !"kernel

[clang] [llvm] [mlir] [NVPTX] Switch front-ends and tests to ptx_kernel cc (PR #120806)

2025-01-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/120806

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,31 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s Artem-B wrote: >

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,31 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s Artem-B wrote: M

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -1697,19 +1721,17 @@ llvm::opt::DerivedArgList *ToolChain::TranslateXarchArgs( } else if (A->getOption().matches(options::OPT_Xarch_host)) { NeedTrans = !IsDevice; Skip = IsDevice; -} else if (A->getOption().matches(options::OPT_Xarch__) && IsDevice) { -

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125421

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: LGTM overall, with a few nits. https://github.com/llvm/llvm-project/pull/125421

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Summary: Currently, `-Xarch_` is used to forward argument specially to > certain toolchains. Currently, this is only supported by the Darwin > toolchain. We want to be able to use this generically, and for offloading > too. This patch moves the handling out of the Darwin Toolc

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,31 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s Artem-B wrote: C

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -1697,19 +1721,17 @@ llvm::opt::DerivedArgList *ToolChain::TranslateXarchArgs( } else if (A->getOption().matches(options::OPT_Xarch_host)) { NeedTrans = !IsDevice; Skip = IsDevice; -} else if (A->getOption().matches(options::OPT_Xarch__) && IsDevice) { -

[clang] [clang][X86] Support __attribute__((model("small"/"large"))) (PR #124834)

2025-02-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/124834

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"], "fno-convergent-functions">, // Common offloading options let Group = offload_Group in { -def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>, Artem-B wrote: Also, `

[clang] [llvm] [NVPTX] Add tcgen05 alloc/dealloc intrinsics (PR #124961)

2025-02-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. Nice! LGTM https://github.com/llvm/llvm-project/pull/124961

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-03 Thread Artem Belevich via cfe-commits
Artem-B wrote: Also see: https://github.com/llvm/llvm-project/issues/110325 https://github.com/llvm/llvm-project/pull/125421

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-02-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/125127

[clang] [llvm] [NVPTX] Add tcgen05 alloc/dealloc intrinsics (PR #124961)

2025-01-30 Thread Artem Belevich via cfe-commits
@@ -962,6 +962,109 @@ The ``griddepcontrol`` intrinsics allows the dependent grids and prerequisite gr For more information, refer `PTX ISA `__. +

[clang] [CUDA] Increment VTable index for device thunks (PR #124989)

2025-02-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: I'm out of my depth here and will leave it up to @yxsamliu. https://github.com/llvm/llvm-project/pull/124989

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-01-31 Thread Artem Belevich via cfe-commits
@@ -14,12 +16,19 @@ void kcall(void (*kp)()) { __global__ void kern() { } +__host__ int overloaded_func(); Artem-B wrote: Done https://github.com/llvm/llvm-project/pull/125127

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-01-31 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/125127 >From d6ff7dc3dcc5ca677a811888f5e67cc5aaad9d8f Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 30 Jan 2025 14:30:22 -0800 Subject: [PATCH 1/3] [PCH, CUDA] Take CUDA attributes into account During deser

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Right now it works as I'd expect, it passes --offload-arch=sm_52 to the sm_52 > compilation, but no other architecture. What happens with that `--offload-arch=sm_52` when cc1 sees it? Ideally there should be either an unused argument warning, or an error if the option is not a

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Right now if someone passes -Xarch_foo --offload-arch=gfx1030 and foo doesn't > match it's not passed and it will print something like this. I figured that's > good enough. This part SGTM, too. However, I don't think I've seen the answer to what happens when we do pass --offloa

[clang] [Clang][NFC] Clean up fetching the offloading toolchain (PR #125095)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -6601,6 +6573,72 @@ std::string Driver::GetClPchPath(Compilation &C, StringRef BaseName) const { return std::string(Output); } +const ToolChain &Driver::getOffloadToolChain( +const llvm::opt::ArgList &Args, const Action::OffloadKind Kind, +const llvm::Triple &Tar

[clang] [Clang][NFC] Clean up fetching the offloading toolchain (PR #125095)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -6601,6 +6573,72 @@ std::string Driver::GetClPchPath(Compilation &C, StringRef BaseName) const { return std::string(Output); } +const ToolChain &Driver::getOffloadToolChain( +const llvm::opt::ArgList &Args, const Action::OffloadKind Kind, +const llvm::Triple &Tar

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: > --offload-arch= isn't an accepted -cc1 argument so it won't be forwarded at > all. Silently? That would be wrong, imo. It should be diagnosed somewhere. https://github.com/llvm/llvm-project/pull/125421

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: > > > --offload-arch= isn't an accepted -cc1 argument so it won't be forwarded > > > at all. > > > > > > Silently? That would be wrong, imo. It should be diagnosed somewhere. > > It's already an error if you pass it directly via -`Xclang` because it's not

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125421

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125421

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,44 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [clang][X86] Support __attribute__((model("small"/"large"))) (PR #124834)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -2949,15 +2950,32 @@ static void handleSectionAttr(Sema &S, Decl *D, const ParsedAttr &AL) { } } +static bool isValidCodeModelAttr(Sema &S, StringRef Str) { + if (S.Context.getTargetInfo().getTriple().isLoongArch()) { +return Str == "normal" || Str == "medium" || St

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I don't think there's any use of --offload-arch outside of the driver. I agree. Yet we do need to deal with such nonsensical input in a consistent manner. We do not control what the users give us, but we control how we respond. https://github.com/llvm/llvm-project/pull/125421

[clang] [llvm] [Offload] Unify offloading entries into a single section (PR #125731)

2025-02-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM syntax-wise. Don't have much of an opinion on the strategy. Are there existing users for this? Should we worry about providing backward compatibility with the "omp" sections in the existing binaries? https://github.com/llvm/llvm-proje

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,34 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,34 @@ +// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x cuda %s -Xarch_device -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s +// RUN: %clang -x hip %s -Xarch_

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
@@ -932,7 +932,9 @@ def W_Joined : Joined<["-"], "W">, Group, def Xanalyzer : Separate<["-"], "Xanalyzer">, HelpText<"Pass to the static analyzer">, MetaVarName<"">, Group; -def Xarch__ : JoinedAndSeparate<["-"], "Xarch_">, Flags<[NoXarchOption]>; +def Xarch__ : JoinedAndS

[clang] [Clang] Make `-Xarch_` handling generic for all toolchains (PR #125421)

2025-02-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/125421

[clang] [LinkerWrapper] Clean up options after proper forwarding (PR #126297)

2025-02-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/126297

[clang] [Clang][NFC] Introduce `gpulib` positive flag for `nogpulib` (PR #126567)

2025-02-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: I agree that we should be able to negate the option (and perhaps `--nocudainc` as well, while we're at it). That said, perhaps we should reconsider how we implement it. Originally `-nogpulib` was modeled after `-nostdlib`. So we've got a single-dash opti

[clang] [libc] [Clang] Add width handling for shuffle helper (PR #125896)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -149,22 +149,23 @@ _DEFAULT_FN_ATTRS static __inline__ void __gpu_sync_lane(uint64_t __lane_mask) { // Shuffles the the lanes inside the warp according to the given index. _DEFAULT_FN_ATTRS static __inline__ uint32_t -__gpu_shuffle_idx_u32(uint64_t __lane_mask, uint32_t __

[clang] [libc] [Clang] Add width handling for shuffle helper (PR #125896)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -145,17 +145,21 @@ _DEFAULT_FN_ATTRS static __inline__ void __gpu_sync_lane(uint64_t __lane_mask) { // Shuffles the the lanes inside the wavefront according to the given index. _DEFAULT_FN_ATTRS static __inline__ uint32_t -__gpu_shuffle_idx_u32(uint64_t __lane_mask, uint32

[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/125908

[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -179,6 +179,13 @@ static bool argHasNVVMAnnotation(const Value &Val, return false; } +static std::optional getFnAttrParsedIntOrNull(const Function &F, Artem-B wrote: Nit: `OrNull` is kind of implied by return type being optional. https://github.com/llvm
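The naming nit above can be illustrated with a small standalone sketch: when a helper returns `std::optional`, the "or null" case is already visible in the signature, so the name does not need to repeat it. This is not the LLVM helper from the patch; `AttrMap` and `getAttrParsedInt` are hypothetical stand-ins that use a plain `std::map` instead of LLVM's `Function` attributes.

```cpp
#include <charconv>
#include <map>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical stand-in for a function's string attributes (not LLVM's API).
using AttrMap = std::map<std::string, std::string>;

// Returns the attribute parsed as an unsigned integer, or std::nullopt when
// the attribute is missing or its value is not a complete decimal number.
// The std::optional return type already conveys the "may be absent" case.
inline std::optional<unsigned> getAttrParsedInt(const AttrMap &attrs,
                                                const std::string &name) {
  auto it = attrs.find(name);
  if (it == attrs.end())
    return std::nullopt;

  const std::string &text = it->second;
  const char *begin = text.data();
  const char *end = begin + text.size();
  unsigned value = 0;
  auto result = std::from_chars(begin, end, value);
  // Reject parse errors and trailing garbage such as "12abc".
  if (result.ec != std::errc() || result.ptr != end)
    return std::nullopt;
  return value;
}
```

A caller can then write `if (auto n = getAttrParsedInt(attrs, "some-attr"))` and use `*n` inside the branch, which keeps the absent case explicit at the call site.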

[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -179,6 +179,13 @@ static bool argHasNVVMAnnotation(const Value &Val, return false; } +static std::optional getFnAttrParsedIntOrNull(const Function &F, +StringRef Attr) { + if (F.hasFnAttribute(Attr)) +return F.g

[clang] [libc] [Clang] Add width handling for shuffle helper (PR #125896)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -149,22 +149,23 @@ _DEFAULT_FN_ATTRS static __inline__ void __gpu_sync_lane(uint64_t __lane_mask) { // Shuffles the the lanes inside the warp according to the given index. _DEFAULT_FN_ATTRS static __inline__ uint32_t -__gpu_shuffle_idx_u32(uint64_t __lane_mask, uint32_t __

[clang] [llvm] [mlir] [NVPTX] Convert scalar function nvvm.annotations to attributes (PR #125908)

2025-02-05 Thread Artem Belevich via cfe-commits
@@ -179,6 +179,13 @@ static bool argHasNVVMAnnotation(const Value &Val, return false; } +static std::optional getFnAttrParsedInt(const Function &F, + StringRef Attr) { + return F.hasFnAttribute(Attr) + ? std::opti

[clang] [Clang][NFC] Clean up fetching the offloading toolchain (PR #125095)

2025-02-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/125095

[clang] [Clang][NFC] Introduce `gpulib` positive flag for `nogpulib` (PR #126567)

2025-02-10 Thread Artem Belevich via cfe-commits
Artem-B wrote: If we're adding new options, I'd prefer those to be the standard double-dash options, rather than adding a special case to the legacy-style option. Also, we still have the original `-nocudalib` being aliased to `-nogpulib`. We should probably take care of that, too. https://

[clang] [Clang][NFC] Introduce `--offloadlib` positive flag for `nogpulib` and alias to `--no-offloadlib` (PR #126567)

2025-02-10 Thread Artem Belevich via cfe-commits
@@ -5618,9 +5618,17 @@ def nogpuinc : Flag<["-"], "nogpuinc">, Group, def nohipwrapperinc : Flag<["-"], "nohipwrapperinc">, Group, HelpText<"Do not include the default HIP wrapper headers and include paths">; def : Flag<["-"], "nocudainc">, Alias; -def nogpulib : Flag<["-"],

[clang] [libc] [libcxx] [llvm] [NVPTX] Make ctor/dtor lowering always enabled in NVPTX (PR #126544)

2025-02-10 Thread Artem Belevich via cfe-commits
@@ -7484,6 +7484,17 @@ void Sema::ProcessDeclAttributeList( } } + // Do not permit 'constructor' or 'destructor' attributes on __device__ code. + if (getLangOpts().CUDAIsDevice && !getLangOpts().GPUAllowDeviceInit) { +if (D->hasAttr() && D->hasAttr()) { ---

[clang] [libc] [libcxx] [llvm] [NVPTX] Make ctor/dtor lowering always enabled in NVPTX (PR #126544)

2025-02-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a nit https://github.com/llvm/llvm-project/pull/126544

[clang] [libc] [libcxx] [llvm] [NVPTX] Make ctor/dtor lowering always enabled in NVPTX (PR #126544)

2025-02-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/126544

[clang] [Clang] Add __has_target_builtin macro (PR #126324)

2025-02-10 Thread Artem Belevich via cfe-commits
@@ -96,6 +101,37 @@ the header file to conditionally make a function constexpr whenever the constant evaluation of the corresponding builtin (for example, ``std::fmax`` calls ``__builtin_fmax``) is supported in Clang. +``__has_target_builtin`` + +

[clang] [Clang] Add __has_target_builtin macro (PR #126324)

2025-02-10 Thread Artem Belevich via cfe-commits
@@ -357,6 +357,7 @@ void Preprocessor::RegisterBuiltinMacros() { Ident__has_builtin = RegisterBuiltinMacro("__has_builtin"); Ident__has_constexpr_builtin = RegisterBuiltinMacro("__has_constexpr_builtin"); + Ident__has_target_builtin = RegisterBuiltinMacro("__has_targ

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-01-30 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125127

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-01-30 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/125127 During deserialization of CUDA AST we must consider CUDA target attributes to distinguish overloads from redeclarations. >From d6ff7dc3dcc5ca677a811888f5e67cc5aaad9d8f Mon Sep 17 00:00:00 2001 From: Artem Belev
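The distinction drawn in the PR description can be pictured with a deliberately simplified toy model. It assumes that two declarations with the same signature are the same entity only when their target attributes also match; Clang's real rules are more nuanced, and the types `Target` and `Decl` and the function `isRedeclaration` are hypothetical, not Clang data structures.

```cpp
#include <string>

// Toy model of a CUDA target attribute (not Clang's representation).
enum class Target { Host, Device, Global };

// Toy model of a function declaration.
struct Decl {
  std::string signature;  // e.g. "int overloaded_func()"
  Target target;
};

// Simplified rule assumed for this sketch:
//   same signature + same target      -> redeclaration of one function
//   same signature + different target -> two separate overloads
inline bool isRedeclaration(const Decl &a, const Decl &b) {
  return a.signature == b.signature && a.target == b.target;
}
```

Under this model, a deserializer that compared signatures alone would merge a `__host__` and a `__device__` declaration of the same function, which is the kind of mix-up the description says the target attributes must prevent.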

[clang] [clang][X86] Support __attribute__((model("small"/"large"))) (PR #124834)

2025-01-31 Thread Artem Belevich via cfe-commits
Artem-B wrote: Issuing the warning suppression every time a TU happens to see a perfectly valid use of the attribute in the host code (even transitively, in someone else's headers), would be counterproductive, IMO. The only response to the warning is to suppress it (i.e. there's nothing for th

[clang] [llvm] [NVPTX] Add tcgen05 alloc/dealloc intrinsics (PR #124961)

2025-01-30 Thread Artem Belevich via cfe-commits
@@ -962,6 +962,109 @@ The ``griddepcontrol`` intrinsics allows the dependent grids and prerequisite gr For more information, refer `PTX ISA `__. +

[clang] [clang][X86] Support __attribute__((model("small"/"large"))) (PR #124834)

2025-01-30 Thread Artem Belevich via cfe-commits
@@ -1,64 +1,40 @@ -// RUN: %clang_cc1 -triple aarch64 -verify=expected,aarch64 -fsyntax-only %s +// RUN: %clang_cc1 -triple aarch64 -verify=expected,unsupported -fsyntax-only %s // RUN: %clang_cc1 -triple loongarch64 -verify=expected,loongarch64 -fsyntax-only %s -// RUN: %clang

[clang] [PCH, CUDA] Take CUDA attributes into account (PR #125127)

2025-01-30 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/125127 >From d6ff7dc3dcc5ca677a811888f5e67cc5aaad9d8f Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 30 Jan 2025 14:30:22 -0800 Subject: [PATCH 1/2] [PCH, CUDA] Take CUDA attributes into account During deser

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -302,6 +299,19 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata( llvm::ConstantAsMetadata::get(GV), llvm::MDString::get(Ctx, Name), llvm::ConstantAsMetadata::get( llvm::ConstantInt::get(llvm::Type::getInt32Ty(Ctx), Operand))}; + // Append metadata to nv

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5022,6 +5022,69 @@ bool llvm::UpgradeDebugInfo(Module &M) { return Modified; } +bool static upgradeSingleNVVMAnnotation(GlobalValue *GV, StringRef K, +const Metadata *V) { + if (K == "kernel") { +assert(mdconst::extract(V)->ge

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -1270,77 +1270,21 @@ exit: ; MODULE: attributes #[[ATTR1:[0-9]+]] = { convergent nocallback nounwind } ; MODULE: attributes #[[ATTR2:[0-9]+]] = { convergent nocallback nofree nounwind willreturn } ; MODULE: attributes #[[ATTR3:[0-9]+]] = { nocallback nofree nosync nounwind

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5022,6 +5022,69 @@ bool llvm::UpgradeDebugInfo(Module &M) { return Modified; } +bool static upgradeSingleNVVMAnnotation(GlobalValue *GV, StringRef K, +const Metadata *V) { + if (K == "kernel") { +assert(mdconst::extract(V)->ge

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5022,6 +5022,69 @@ bool llvm::UpgradeDebugInfo(Module &M) { return Modified; } +bool static upgradeSingleNVVMAnnotation(GlobalValue *GV, StringRef K, +const Metadata *V) { + if (K == "kernel") { +assert(mdconst::extract(V)->ge

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -324,14 +326,17 @@ MaybeAlign getAlign(const Function &F, unsigned Index) { F.getAttributes().getAttributes(Index).getStackAlignment()) return StackAlign; - // If that is missing, check the legacy nvvm metadata - std::vector Vs; - bool retval = findAllNVVMA

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -10,7 +10,7 @@ extern "C" __device__ void device_function() {} -// CHECK-LABEL: define{{.*}} void @global_function +// CHECK: define{{.*}} void @global_function{{.*}} #[[ATTR0:[0-9]+]] Artem-B wrote: It should still be `CHECK-LABEL` You could split attribu

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5022,6 +5022,69 @@ bool llvm::UpgradeDebugInfo(Module &M) { return Modified; } +bool static upgradeSingleNVVMAnnotation(GlobalValue *GV, StringRef K, +const Metadata *V) { + if (K == "kernel") { +assert(mdconst::extract(V)->ge

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/119261

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5911,31 +5911,21 @@ bool llvm::omp::isOpenMPKernel(Function &Fn) { KernelSet llvm::omp::getDeviceKernels(Module &M) { // TODO: Create a more cross-platform way of determining device kernels. - NamedMDNode *MD = M.getNamedMetadata("nvvm.annotations"); KernelSet Kernel

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -5022,6 +5022,69 @@ bool llvm::UpgradeDebugInfo(Module &M) { return Modified; } +bool static upgradeSingleNVVMAnnotation(GlobalValue *GV, StringRef K, +const Metadata *V) { + if (K == "kernel") { +assert(mdconst::extract(V)->ge

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Few nits. LGTM overall, except for the "kernel/nvvm.kernel" distinction question. https://github.com/llvm/llvm-project/pull/119261

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/119261

[clang] [llvm] [NVPTX] Add NVVMUpgradeAnnotations pass to cleanup legacy annotations (PR #119261)

2024-12-10 Thread Artem Belevich via cfe-commits
@@ -324,14 +326,15 @@ MaybeAlign getAlign(const Function &F, unsigned Index) { F.getAttributes().getAttributes(Index).getStackAlignment()) return StackAlign; - // If that is missing, check the legacy nvvm metadata - std::vector Vs; - bool retval = findAllNVVMA

[clang] [clang][Darwin] Remove legacy framework search path logic in the frontend (PR #120149)

2024-12-16 Thread Artem Belevich via cfe-commits
@@ -1,13 +1,6 @@ -// RUN: %clang -cc1 -fcuda-is-device -isysroot /var/empty \ -// RUN: -triple nvptx-nvidia-cuda -aux-triple i386-apple-macosx \ -// RUN: -E -fcuda-is-device -v -o /dev/null -x cuda %s 2>&1 | FileCheck %s - -// RUN: %clang -cc1 -isysroot /var/empty \ -// RUN:

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-22 Thread Artem Belevich via cfe-commits
Artem-B wrote: Darn. I've missed additional HIP tests. I'll fix the test failures shortly. https://github.com/llvm/llvm-project/pull/117074

[clang] [HIP] Fix tests broken by #117074 / 689c532 (PR #117361)

2024-11-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/117361 None From 394ef51560731ae1f42fd049655dde1ce2d11a1e Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 22 Nov 2024 10:50:50 -0800 Subject: [PATCH] [HIP] Fix tests broken by #117074 / 689c532 --- clang/t

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/117074

[clang] [HIP] Fix tests broken by #117074 / 689c532 (PR #117361)

2024-11-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/117361

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-21 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/117074 From 1c8829a1defa6dd06aacb9a2047e7f79db238e2b Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Wed, 20 Nov 2024 14:24:00 -0800 Subject: [PATCH 1/2] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilation

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-21 Thread Artem Belevich via cfe-commits
Artem-B wrote: Done. https://github.com/llvm/llvm-project/pull/117074

[clang] [llvm] [mlir] [Offload] Rework offloading entry type to be more generic (PR #124018)

2025-01-22 Thread Artem Belevich via cfe-commits
@@ -55,29 +68,30 @@ enum OffloadEntryKindFlag : uint32_t { /// globals that will be registered with the offloading runtime. StructType *getEntryTy(Module &M); -/// Returns the struct type we store the two pointers for CUDA / HIP managed -/// variables in. Necessary until we wi

[clang] [llvm] [Offload] Rework offloading entry type to be more generic (PR #124018)

2025-01-22 Thread Artem Belevich via cfe-commits
@@ -55,29 +68,30 @@ enum OffloadEntryKindFlag : uint32_t { /// globals that will be registered with the offloading runtime. StructType *getEntryTy(Module &M); -/// Returns the struct type we store the two pointers for CUDA / HIP managed -/// variables in. Necessary until we wi

[clang] [llvm] [Offload] Rework offloading entry type to be more generic (PR #124018)

2025-01-22 Thread Artem Belevich via cfe-commits
@@ -160,54 +160,30 @@ // CHECK-NTARGET-NOT: private unnamed_addr constant [1 x i // CHECK-DAG: [[NAMEPTR1:@.+]] = internal unnamed_addr constant [{{.*}} x i8] c"[[NAME1:__omp_offloading_[0-9a-f]+_[0-9a-f]+__Z.+_l[0-9]+]]\00" -// CHECK-DAG: [[ENTRY1:@.+]] = weak{{.*}} constant

[clang] [clang][test] Add .cuh as a recognized extension for lit test files (PR #124080)

2025-01-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/124080

[clang] [llvm] [NVPTX] Add support for PTX 8.6 and CUDA 12.6 (12.8) (PR #123398)

2025-01-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/123398

[clang] [Clang] __has_builtin should return false for aux triple builtins (PR #121839)

2025-01-27 Thread Artem Belevich via cfe-commits
Artem-B wrote: This breaks CUDA compilation on ARM, because `__has_builtin()` now returns false for the host-side builtins and that causes some clang headers on ARM to try defining their own replacement for the builtin they consider to be missing, but which is actually still there: https://god
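For readers unfamiliar with the header pattern described in the message above, here is a minimal sketch of a `__has_builtin` guard that falls back to a hand-written replacement. It is not taken from the affected clang headers: `my_popcount` and `HAVE_BUILTIN_POPCOUNT` are hypothetical names, and `__builtin_popcount` is used only as a convenient, widely available builtin.

```cpp
// Illustrative sketch only: the names below are hypothetical and are not
// the ARM headers discussed in the thread.

// Compilers without __has_builtin treat every builtin as missing.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_popcount)
// The compiler reports the builtin as available: forward to it.
#define HAVE_BUILTIN_POPCOUNT 1
static inline int my_popcount(unsigned v) { return __builtin_popcount(v); }
#else
// The compiler reports the builtin as missing: provide a replacement.
// If __has_builtin wrongly returns false while the builtin is still
// present, a header that reuses the builtin's own name on this branch
// can end up clashing with the real builtin.
#define HAVE_BUILTIN_POPCOUNT 0
static inline int my_popcount(unsigned v) {
  int n = 0;
  for (; v != 0; v &= v - 1) // clear the lowest set bit each iteration
    ++n;
  return n;
}
#endif
```

Both branches compute the same result, so the guard is harmless when `__has_builtin` is accurate. The trouble reported above arises only when the feature test and the actual set of builtins disagree.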
