from:"Artem Belevich via cfe-commits"

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-18 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. LGTM syntax/style-wise. Looks reasonable on the functionality side, but we could use a second opinion on that. https://github.com/llvm/llvm-project/pull/145131

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-18 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/145131 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

@@ -14,14 +14,14 @@ // RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR // RUN: not %clang -### --target=x86_64-unknown-linux-gnu -nogpulib --offload-new-driver --offload-arch=native --amdgpu-arch-tool=%t/amdgpu_arch_fail -x hip %s 2>&1 \ // RUN: | FileCheck %s --chec

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

@@ -951,221 +931,262 @@ static bool addSYCLDefaultTriple(Compilation &C, return true; } -void Driver::CreateOffloadingDeviceToolChains(Compilation &C, - InputList &Inputs) { - - // - // CUDA/HIP - // - // We need to generate a

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

@@ -3441,91 +3455,25 @@ class OffloadingActionBuilder final { return true; } - ToolChains.push_back( - AssociatedOffloadKind == Action::OFK_Cuda - ? C.getSingleOffloadToolChain() - : C.getSingleOffloadToolChain()); - -

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B commented: Mostly a drive-by style/syntax review. LGTM overall, with a few nits. https://github.com/llvm/llvm-project/pull/125556

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

@@ -4,7 +4,7 @@ // RUN: --rocm-path=%S/Inputs/rocm \ // RUN: %s 2>&1 | FileCheck -check-prefix=NOPLUS %s -// NOPLUS: error: invalid target ID 'gfx908xnack' +// NOPLUS: error: unsupported HIP gpu architecture: gfx908xnack Artem-B wrote: "HIP compilation c

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125556

[clang] [CUDA] add wrapper header for libc++'s __utlility/declval.h (PR #148918)

2025-07-15 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/148918 >From ea1949d13608ac948ab34d1eeb073decdd11e2a3 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Tue, 15 Jul 2025 11:10:40 -0700 Subject: [PATCH 1/2] [CUDA] add wrapper header for libc++'s __utlility/declval.

[clang] [CUDA] add wrapper header for libc++'s __utlility/declval.h (PR #148918)

2025-07-15 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/148918 Since #116709 more libc++ code relies on std::declval() and it broke some CUDA compilations. The new wrapper adds GPU-side overloads for the declval() helper functions which allows it to continue working when
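For context, here is a minimal host-only sketch of why library code leans on `std::declval()`: it lets a trait name the result type of an expression without constructing an object. This is plain standard C++, not the wrapper header from the PR; `Widget` and `size_result_t` are hypothetical names used only for illustration.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type with no default constructor.
struct Widget {
  explicit Widget(int) {}
  long size() const { return 0; }
};

// std::declval<T>() may only appear in unevaluated contexts such as decltype.
// It produces a reference to T without requiring T to be constructible.
template <typename T>
using size_result_t = decltype(std::declval<const T &>().size());

static_assert(std::is_same<size_result_t<Widget>, long>::value,
              "size() on a const Widget returns long");
```

The GPU-side overloads mentioned in the PR description are not reproduced here; the sketch only shows the kind of expression that has to keep compiling.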

[clang] [HIP] Add warning for -mwavefrontsize64 on gfx10+ architectures (PR #140185)

2025-07-14 Thread Artem Belevich via cfe-commits

@@ -67,6 +67,12 @@ // DUP-NOT: "-target-feature" "{{.*}}wavefrontsize64" // DUP: {{.*}}lld{{.*}} "-plugin-opt=-mattr=+cumode" +// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \ +// RUN: -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \ +// RUN:

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,141 @@ +//===- llvm/Support/Jobserver.h - Jobserver Client --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B commented: A few comments on syntax/style. I didn't look at the job management logic itself. https://github.com/llvm/llvm-project/pull/145131

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits

@@ -1420,12 +1420,18 @@ int main(int Argc, char **Argv) { parallel::strategy = hardware_concurrency(1); if (auto *Arg = Args.getLastArg(OPT_wrapper_jobs)) { -unsigned Threads = 0; -if (!llvm::to_integer(Arg->getValue(), Threads) || Threads == 0) - reportError(

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/145131

[clang] [Clang] Extract offloading code from static libs with 'offload-arch=' (PR #147823)

2025-07-09 Thread Artem Belevich via cfe-commits

Artem-B wrote: > but we now assume that if the user specified --offload-arch= on the link job, > they definitely want that architecture to be used if it exists. That would be my assumption, too. Do we currently just ignore `--offload-arch=` for the linking phase? With the patch, what's expe

[clang] [llvm] [NVPTX] Consolidate and cleanup various NVPTXISD nodes (NFC) (PR #145581)

2025-06-25 Thread Artem Belevich via cfe-commits

@@ -457,3 +457,25 @@ void NVPTXInstPrinter::printCTAGroup(const MCInst *MI, int OpNum, } llvm_unreachable("Invalid cta_group in printCTAGroup"); } + +void NVPTXInstPrinter::printCallOperand(const MCInst *MI, int OpNum, +raw_ostream &

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-18 Thread Artem Belevich via cfe-commits

Artem-B wrote: It's a C++11 feature. Tests still include C++98. We do not intend to keep everything working with C++98 (we already use C++11 in other headers), but we should not break it either. In this case, you can just enable the new stuff for C++11 or newer standards. https://github.com/
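The suggestion to enable the new code only for C++11 or newer can be sketched with a `__cplusplus` guard. This is an illustrative fragment, not the PR's header: `newIntrinsicsEnabled` is a hypothetical name, and `static_assert` merely stands in for whichever C++11 construct the new code relies on.

```cpp
// Hypothetical header fragment; names are illustrative, not from the PR.
#if __cplusplus >= 201103L
// C++11 and newer: constructs that C++98 would reject are available here.
static_assert(sizeof(char) == 1, "sizeof(char) is always 1");
inline bool newIntrinsicsEnabled() { return true; }
#else
// C++98: skip the new declarations so older test configurations still build.
inline bool newIntrinsicsEnabled() { return false; }
#endif
```

With a guard of this shape, a C++98 translation unit never sees the C++11-only declarations, so the existing C++98 test configurations keep compiling.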

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-06-18 Thread Artem Belevich via cfe-commits

Artem-B wrote: @jmmartinez It appears that CUDA tests are broken by this change: https://lab.llvm.org/buildbot/#/builders/69/builds/22562/steps/8/logs/stdio ``` FAILED: External/CUDA/CMakeFiles/algorithm-cuda-11.8-c++98-libstdc++-10.dir/algorithm.cu.o /buildbot/cuda-t4-0/work/clang-cuda-t4/c

[clang] Revert "Add missing intrinsics to cuda headers" (PR #144755)

2025-06-18 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/144755

[clang] Revert "Add missing intrinsics to cuda headers" (PR #144755)

2025-06-18 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/144755 Reverts llvm/llvm-project#143664 as it breaks CUDA compilation. >From 2ed0932a540bb1a692fe442ab590d51674645f6c Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Wed, 18 Jun 2025 10:06:56 -0700 Subject: [PATCH

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-18 Thread Artem Belevich via cfe-commits

Artem-B wrote: It appears to be breaking CUDA tests: https://lab.llvm.org/buildbot/#/builders/69/builds/22559 I'll revert it for now and we'll try again later. ``` [29/988] Building CXX object External/CUDA/CMakeFiles/math_h-cuda-11.8-c++98-libstdc++-10.dir/math_h.cu.o FAILED: External/CUDA/

[clang] [CUDA][HIP] add options `--[no-]offload-inc` (PR #140106)

2025-06-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/140106

[clang] [CUDA][HIP] add options `--[no-]offload-inc` (PR #140106)

2025-06-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/140106

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-13 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. LGTM with one last nit. https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-13 Thread Artem Belevich via cfe-commits

@@ -479,7 +479,291 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } -#endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 320 +#pragma push_macro("__INTRINSIC_LOAD") +#define __INTRINSIC_LOAD(__FnName, __AsmOp, __DeclType, __Tmp

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits

@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } +#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \ Artem-B wrote: We have to be careful with the names used in th

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B requested changes to this pull request. Nice. I like this approach better. There are a few more things to polish up, but it looks good overall. https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits

@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } +#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \ Artem-B wrote: Can we merge `INTRINSIC*` and `MINTRINSIC*` mac

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-11 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/143664

[clang] [NVPTX] Enable OpenCL 3d_image_writes support (PR #143331)

2025-06-09 Thread Artem Belevich via cfe-commits

Artem-B wrote: @svenvh appears to be the current maintainer of OpenCL in LLVM. https://github.com/llvm/llvm-project/pull/143331

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-09 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/142857

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-06 Thread Artem Belevich via cfe-commits

Artem-B wrote: @yxsamliu Sam, do you have any thoughts on this? https://github.com/llvm/llvm-project/pull/142857

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-04 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/142857 The variables have implicit host-side shadow instances, and an explicit address space attribute breaks them on the host. From e2e8da0271ae11711dbd54f6e8d9ff498f3226d4 Mon Sep 17 00:00:00 2001 From: Artem Belevich

[clang] [clang] Move opt level in clang toolchain to clang::ConstructJob start (PR #141036)

2025-05-27 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/141036

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits

@@ -177,6 +177,7 @@ let Attributes = [NoReturn] in { } let Attributes = [NoThrow] in { def __nvvm_nanosleep : NVPTXBuiltinSMAndPTX<"void(unsigned int)", SM_70, PTX63>; + def __nvvm_pm_event_mask : NVPTXBuiltin<"void(unsigned short)">; Artem-B wrote: The ar

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. Builtin signature needs a fix, but LGTM otherwise. https://github.com/llvm/llvm-project/pull/141278

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/141278

[clang] [llvm] [mlir] Reland "[NVPTX] Unify and extend barrier{.cta} intrinsic support" (PR #141143)

2025-05-22 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/141143

[clang] [llvm] [mlir] [NVPTX] Unify and extend barrier{.cta} intrinsic support (PR #140615)

2025-05-19 Thread Artem Belevich via cfe-commits

@@ -1349,6 +1349,10 @@ static bool upgradeIntrinsicFunction1(Function *F, Function *&NewFn, else if (Name == "clz.ll" || Name == "popc.ll" || Name == "h2f" || Name == "swap.lo.hi.b64") Expand = true; + else if (Name == "barrier0" || Name == "b

[clang] [NVPTX] Support the OpenCL generic addrspace feature by default (PR #137940)

2025-05-19 Thread Artem Belevich via cfe-commits

@@ -170,6 +170,8 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo { Opts["cl_khr_global_int32_extended_atomics"] = true; Opts["cl_khr_local_int32_base_atomics"] = true; Opts["cl_khr_local_int32_extended_atomics"] = true; + +Opts["__opencl_c_

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-19 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/138706

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-19 Thread Artem Belevich via cfe-commits

@@ -2927,6 +2928,20 @@ void Verifier::visitFunction(const Function &F) { "Calling convention does not support varargs or " "perfect forwarding!", &F); +if (F.getCallingConv() == CallingConv::PTX_Kernel && +TT.getOS() == Triple::CUDA) {

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits

@@ -5734,6 +5734,9 @@ def nobuiltininc : Flag<["-"], "nobuiltininc">, def nogpuinc : Flag<["-"], "nogpuinc">, Group, HelpText<"Do not add include paths for CUDA/HIP and" " do not include the default CUDA/HIP wrapper headers">; +def gpuinc : Flag<["-"], "gpuinc">, Group, +

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/140106

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B commented: Being able to override a flag is a good thing to have, IMO. There are builds where the owners of the leaf targets do not have much control over which options are set by the "default" compilation, so they need to rely on being able to override preceding opti
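The "a later option overrides an earlier one" behaviour discussed in this comment can be sketched as a last-match scan over the command line. This is a standalone illustration, not clang's driver code: `gpuIncludesEnabled` is a hypothetical helper, and the default of `true` is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: reports whether GPU include paths are enabled, with the
// last occurrence of -gpuinc / -nogpuinc on the command line taking effect.
inline bool gpuIncludesEnabled(const std::vector<std::string> &Args,
                               bool Default = true) {
  bool Enabled = Default;
  for (const std::string &A : Args) {
    if (A == "-gpuinc")
      Enabled = true;   // positive form re-enables the include paths
    else if (A == "-nogpuinc")
      Enabled = false;  // negative form disables them
  }
  return Enabled;
}
```

Under this scheme a leaf target that appends `-gpuinc` after an inherited `-nogpuinc` gets the include paths back without having to edit the inherited options.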

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-13 Thread Artem Belevich via cfe-commits

@@ -1399,19 +1399,27 @@ void NVPTXAsmPrinter::emitFunctionParamList(const Function *F, raw_ostream &O) { if (PTy) { O << "\t.param .u" << PTySizeInBits << " .ptr"; +bool IsCUDA = static_cast(TM).getDrvInterface() == + NVPTX::CUDA;

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

Artem-B wrote: No wrappers -- no problems. :-) https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/139164 >From a1d60feed11174b9d2106b57ee15ff6d9bc56fa4 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 8 May 2025 14:43:47 -0700 Subject: [PATCH] [CUDA] remove obsolete GPU-side __constexpr* wrappers libc++ no

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

Artem-B wrote: > Right now this checks for `libc++` less than 14. Is that still relevant > following that change? That's a very good point. Looks like those `__constexpr_fmin/fmax` are gone now and we do not need them any more. https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits

Artem-B wrote: @jhuber6 @ldionne One concern I have for this change is that it will break folks who use an older libc++ with the new Clang + wrapper headers. Is an older libc++ expected to work with a non-matching clang version? If the expectation is that libc++ and clang are from the same versio

[clang] [llvm] [NVPTX] Add intrinsics and clang builtins for conversions of f4x2 type (PR #139244)

2025-05-09 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/139244

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Artem Belevich via cfe-commits

Artem-B wrote: @cgmb > I would suggest that we should either (a) change the default GPU target to > native and make the failure to detect the user’s GPU into a hard compiler > error, or (b) change the default GPU target to SPIR-V so that it works on > every machine. The thing is that the se

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Artem Belevich via cfe-commits

Artem-B wrote: @jhuber6 do you think we can use `native` instead? I think it would be a somewhat better option here. If we have to choose a GPU variant by default, we may as well choose the actual GPU, rather than a conditional choice between generic SPIR-V or an old GPU, which has the disadva

[clang] [CUDA][HIP] Fix host/device attribute of builtin (PR #138162)

2025-05-07 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/138162

[clang] clang/OpenCL: Add baseline test showing broken codegen (PR #138862)

2025-05-07 Thread Artem Belevich via cfe-commits

@@ -109,3 +109,48 @@ void func2(void) { void func3(void) { float a[16][1] = {{0.}}; } + +// CL12-LABEL: define dso_local void @wrong_store_type_private_pointer_alloca( +// CL12-SAME: ) #[[ATTR0]] { +// CL12-NEXT: [[ENTRY:.*:]] +// CL12-NEXT:[[PLONG:%.*]] = alloca i64, al

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits

@@ -2376,9 +2376,14 @@ NamedDecl *Sema::LazilyCreateBuiltin(IdentifierInfo *II, unsigned ID, return nullptr; } + // Warn for implicit uses of header dependent libraries, + // except in system headers. if (!ForRedeclaration && (Context.BuiltinInfo.isPredefine

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits

Artem-B wrote: OK. This makes sense. > sorry this change is so drawn out :) What matters is that you're making progress, and I appreciate your work on getting this issue sorted out the right way. https://github.com/llvm/llvm-project/pull/138205

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits

Artem-B wrote: Something does not add up here. AFAICT, using builtins w/o explicitly declaring them is something that's done all the time. https://godbolt.org/z/ha47W53dh In that sense, we should not be needing to filter out the diagnostics coming from the system headers only. There should not

[clang] [CUDA][HIP] Fix implicit attribute of builtin (PR #138162)

2025-05-01 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,23 @@ +// expected-no-diagnostics + +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -aux-triple amdgcn-amd-amdhsa -fsyntax-only -verify -xhip %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsyntax-only -fcuda-is-device -verify -xhip %s + +#include "Inputs/cuda

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-30 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B commented: LGTM in principle. Now the question is -- how do we test it? There are multiple libstdc++ library versions in the wild and we must not break any of them. We do have some testing on CUDA test bots (which I've just discovered to be silently broken for a whil

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-30 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,35 @@ +// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail() +// to trigger compilation errors when the __glibcxx_assert(cond) macro +// is used in a constexpr context. +// Compilation fails when using code from the libstdc++ (such as std::array) on

[clang] [CUDA][HIP] capture possible ODR-used var (PR #136645)

2025-04-22 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. LGTM w/ a nit. https://github.com/llvm/llvm-project/pull/136645

[clang] [CUDA][HIP] capture possible ODR-used var (PR #136645)

2025-04-22 Thread Artem Belevich via cfe-commits

@@ -1100,3 +1101,49 @@ std::string SemaCUDA::getConfigureFuncName() const { // Legacy CUDA kernel configuration call return "cudaConfigureCall"; } + +// Record any local constexpr variables that are passed one way on the host +// and another on the device. +void SemaCUDA::r

[clang] [CUDA][HIP] capture possible ODR-used var (PR #136645)

2025-04-22 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/136645

[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-21 Thread Artem Belevich via cfe-commits

@@ -25,6 +25,7 @@ enum AddressSpace : unsigned { ADDRESS_SPACE_CONST = 4, ADDRESS_SPACE_LOCAL = 5, ADDRESS_SPACE_TENSOR = 6, + ADDRESS_SPACE_SHARED_CLUSTER = 7, Artem-B wrote: PTX docs say: ``` If no sub-qualifier is specified with the .shared state sp

[clang] [clang][ARM][AArch64] Define intrinsics guarded by __has_builtin on all platforms (PR #128222)

2025-04-21 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/128222

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-21 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,35 @@ +// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail() +// to trigger compilation errors when the __glibcxx_assert(cond) macro +// is used in a constexpr context. +// Compilation fails when using code from the libstdc++ (such as std::array) on

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-18 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/136133

[clang] [clang][ARM][AArch64] Define intrinsics guarded by __has_builtin on all platforms (PR #128222)

2025-04-17 Thread Artem Belevich via cfe-commits

@@ -36,6 +36,28 @@ typedef __SIZE_TYPE__ size_t; #include +#ifdef __ARM_ACLE +// arm_acle.h needs some stdint types, but -ffreestanding prevents us from Artem-B wrote: Shouldn't that be fixed in arm_acle.h itself so it includes the headers with the types i

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-17 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/136133

[clang] [CUDA][HIP] Add a device version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-17 Thread Artem Belevich via cfe-commits

@@ -0,0 +1,35 @@ +// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail() +// to trigger compilation errors when the __glibcxx_assert(cond) macro +// is used in a constexpr context. +// Compilation fails when using code from the libstdc++ (such as std::array) on

[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-16 Thread Artem Belevich via cfe-commits

@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) { case ADDRESS_SPACE_SHARED: Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared; break; -case ADDRESS_SPACE_DSHARED: - Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 :

[clang] [llvm] [NVPTX] Cleanup and document nvvm.fabs intrinsics, adding f16 support (PR #135644)

2025-04-16 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/135644

[clang] [llvm] [NVPTX] Cleanup and document nvvm.fabs intrinsics, adding f16 support (PR #135644)

2025-04-16 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/135644

[clang] [llvm] [mlir] [NVPTX] Add support for Shared Cluster Memory address space. (PR #135444)

2025-04-15 Thread Artem Belevich via cfe-commits

@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) { case ADDRESS_SPACE_SHARED: Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared; break; -case ADDRESS_SPACE_DSHARED: - Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 :

[clang] [llvm] [NVPTX] Cleanup and document nvvm.fabs intrinsics, adding f16 support (PR #135644)

2025-04-15 Thread Artem Belevich via cfe-commits

@@ -1034,6 +1034,10 @@ Value *CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID, case NVPTX::BI__nvvm_fmin_xorsign_abs_f16x2: return MakeHalfType(Intrinsic::nvvm_fmin_xorsign_abs_f16x2, BuiltinID, E, *this); + case NVPTX::BI__nvvm_abs_bf16

[clang] [llvm] [NVPTX] Cleanup and document nvvm.fabs intrinsics, adding f16 support (PR #135644)

2025-04-15 Thread Artem Belevich via cfe-commits

@@ -1034,6 +1034,10 @@ Value *CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned BuiltinID, case NVPTX::BI__nvvm_fmin_xorsign_abs_f16x2: return MakeHalfType(Intrinsic::nvvm_fmin_xorsign_abs_f16x2, BuiltinID, E, *this); + case NVPTX::BI__nvvm_abs_bf16

[clang] [llvm] [NVPTX] Cleanup and document nvvm.fabs intrinsics, adding f16 support (PR #135644)

2025-04-15 Thread Artem Belevich via cfe-commits

@@ -411,6 +412,13 @@ static Instruction *convertNvvmIntrinsicToLlvm(InstCombiner &IC, } return nullptr; } + case SPC_Fabs: { +if (!II->getType()->isDoubleTy()) + return nullptr; +auto *Fabs = Intrinsic::getOrInsertDeclaration( +II->getModule(),

[clang] [llvm] [mlir] [NVPTX] Add support for Distributed Shared Memory address space. (PR #135444)

2025-04-11 Thread Artem Belevich via cfe-commits

Artem-B wrote: I wish PTX would be a bit more consistent about naming things. Documentation calls it distributed shared memory (and it is distributed, and is shared), but the PTX instructions, compiler builtins and intrinsics use shared::cluster (as opposed to regular shared AKA shared::cta).

[clang] [llvm] [clang][NVPTX] Add builtins and intrinsics for conversions of new FP types (PR #134345)

2025-04-11 Thread Artem Belevich via cfe-commits

@@ -703,6 +703,41 @@ let hasSideEffects = false in { defm CVT_to_tf32_rz_satf : CVT_TO_TF32<"rz.satfinite", [hasPTX<86>, hasSM<100>]>; defm CVT_to_tf32_rn_relu_satf : CVT_TO_TF32<"rn.relu.satfinite", [hasPTX<86>, hasSM<100>]>; defm CVT_to_tf32_rz_relu_satf : CVT_TO_TF

[clang] [llvm] [clang][NVPTX] Add builtins and intrinsics for conversions of new FP types (PR #134345)

2025-04-11 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/134345

[clang] [llvm] [clang][NVPTX] Add builtins and intrinsics for conversions of new FP types (PR #134345)

2025-04-11 Thread Artem Belevich via cfe-commits

https://github.com/Artem-B approved this pull request. LGTM in general, with an intrinsic naming nit. https://github.com/llvm/llvm-project/pull/134345

[clang] [llvm] [NVPTX] Add builtins and intrinsics for conversions of new FP types (PR #134345)

2025-04-10 Thread Artem Belevich via cfe-commits

@@ -596,6 +605,28 @@ def __nvvm_e4m3x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(sh def __nvvm_e5m2x2_to_f16x2_rn : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(short)", SM_89, PTX81>; def __nvvm_e5m2x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>
