https://github.com/Artem-B approved this pull request.
LGTM syntax/style-wise. Looks reasonable on the functionality side, but we
could use a second opinion on that.
https://github.com/llvm/llvm-project/pull/145131
___
cfe-commits mailing list
cfe-com
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/145131
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -14,14 +14,14 @@
// RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR
// RUN: not %clang -### --target=x86_64-unknown-linux-gnu -nogpulib
--offload-new-driver --offload-arch=native
--amdgpu-arch-tool=%t/amdgpu_arch_fail -x hip %s 2>&1 \
// RUN: | FileCheck %s --chec
@@ -951,221 +931,262 @@ static bool addSYCLDefaultTriple(Compilation &C,
return true;
}
-void Driver::CreateOffloadingDeviceToolChains(Compilation &C,
- InputList &Inputs) {
-
- //
- // CUDA/HIP
- //
- // We need to generate a
@@ -3441,91 +3455,25 @@ class OffloadingActionBuilder final {
return true;
}
- ToolChains.push_back(
- AssociatedOffloadKind == Action::OFK_Cuda
- ? C.getSingleOffloadToolChain()
- : C.getSingleOffloadToolChain());
-
-
https://github.com/Artem-B commented:
Mostly a drive-by style/syntax review. LGTM overall, with a few nits.
https://github.com/llvm/llvm-project/pull/125556
@@ -4,7 +4,7 @@
// RUN: --rocm-path=%S/Inputs/rocm \
// RUN: %s 2>&1 | FileCheck -check-prefix=NOPLUS %s
-// NOPLUS: error: invalid target ID 'gfx908xnack'
+// NOPLUS: error: unsupported HIP gpu architecture: gfx908xnack
Artem-B wrote:
"HIP compilation c
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/125556
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/148918
From ea1949d13608ac948ab34d1eeb073decdd11e2a3 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Tue, 15 Jul 2025 11:10:40 -0700
Subject: [PATCH 1/2] [CUDA] add wrapper header for libc++'s
__utility/declval.
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/148918
Since #116709 more libc++ code relies on std::declval() and it broke some CUDA
compilations.
The new wrapper adds GPU-side overloads for the declval() helper functions
which allows it to continue working when
@@ -67,6 +67,12 @@
// DUP-NOT: "-target-feature" "{{.*}}wavefrontsize64"
// DUP: {{.*}}lld{{.*}} "-plugin-opt=-mattr=+cumode"
+// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \
+// RUN: -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \
+// RUN:
@@ -0,0 +1,257 @@
+//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation
---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Ap
@@ -0,0 +1,141 @@
+//===- llvm/Support/Jobserver.h - Jobserver Client --*- C++
-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Ap
https://github.com/Artem-B commented:
A few comments on syntax/style. I didn't look at the job management logic itself.
https://github.com/llvm/llvm-project/pull/145131
@@ -1420,12 +1420,18 @@ int main(int Argc, char **Argv) {
parallel::strategy = hardware_concurrency(1);
if (auto *Arg = Args.getLastArg(OPT_wrapper_jobs)) {
-unsigned Threads = 0;
-if (!llvm::to_integer(Arg->getValue(), Threads) || Threads == 0)
- reportError(
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/145131
Artem-B wrote:
> but we now assume that if the user specified --offload-arch= on the link job,
> they definitely want that architecture to be used if it exists.
That would be my assumption, too. Do we currently just ignore
`--offload-arch=` for the linking phase?
With the patch, what's expe
@@ -457,3 +457,25 @@ void NVPTXInstPrinter::printCTAGroup(const MCInst *MI, int
OpNum,
}
llvm_unreachable("Invalid cta_group in printCTAGroup");
}
+
+void NVPTXInstPrinter::printCallOperand(const MCInst *MI, int OpNum,
+raw_ostream &
Artem-B wrote:
It's a C++11 feature. Tests still include c++98. We do not intend to keep
everything working with c++98 (we already use c++11 in other headers), but we
should not break it either. In this case, you can just enable the new stuff for
c++11 or newer standards.
https://github.com/
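A minimal sketch of the kind of guard suggested in the comment above, assuming a header that must still compile as C++98. The names `EXAMPLE_HAS_CXX11` and `example_twice` are hypothetical and are not taken from the patch; only the `__cplusplus >= 201103L` check is standard.

```cpp
// Hypothetical feature macro and function; the names are assumptions used
// for illustration only and do not come from the reviewed header.
#if __cplusplus >= 201103L
#define EXAMPLE_HAS_CXX11 1
// C++11 or newer: the new code may use C++11-only constructs such as constexpr.
constexpr int example_twice(int x) { return 2 * x; }
#else
#define EXAMPLE_HAS_CXX11 0
// C++98: provide an equivalent without C++11 syntax so existing builds
// keep compiling.
inline int example_twice(int x) { return 2 * x; }
#endif
```

Either branch gives callers the same `example_twice` function, so code that includes the header does not need to know which language standard is in use.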
Artem-B wrote:
@jmmartinez It appears that CUDA tests are broken by this change:
https://lab.llvm.org/buildbot/#/builders/69/builds/22562/steps/8/logs/stdio
```
FAILED:
External/CUDA/CMakeFiles/algorithm-cuda-11.8-c++98-libstdc++-10.dir/algorithm.cu.o
/buildbot/cuda-t4-0/work/clang-cuda-t4/c
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/144755
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/144755
Reverts llvm/llvm-project#143664
as it breaks CUDA compilation.
From 2ed0932a540bb1a692fe442ab590d51674645f6c Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Wed, 18 Jun 2025 10:06:56 -0700
Subject: [PATCH
Artem-B wrote:
It appears to be breaking CUDA tests:
https://lab.llvm.org/buildbot/#/builders/69/builds/22559
I'll revert it for now and we'll try again later.
```
[29/988] Building CXX object
External/CUDA/CMakeFiles/math_h-cuda-11.8-c++98-libstdc++-10.dir/math_h.cu.o
FAILED:
External/CUDA/
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/140106
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/140106
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/143664
https://github.com/Artem-B approved this pull request.
LGTM with one last nit.
https://github.com/llvm/llvm-project/pull/143664
@@ -479,7 +479,291 @@ inline __device__ unsigned __funnelshift_rc(unsigned
low32, unsigned high32,
return ret;
}
-#endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 320
+#pragma push_macro("__INTRINSIC_LOAD")
+#define __INTRINSIC_LOAD(__FnName, __AsmOp, __DeclType, __Tmp
@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned
low32, unsigned high32,
return ret;
}
+#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \
Artem-B wrote:
We have to be careful with the names used in th
https://github.com/Artem-B requested changes to this pull request.
Nice. I like this approach better. There are a few more things to polish up, but
it looks good overall.
https://github.com/llvm/llvm-project/pull/143664
@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned
low32, unsigned high32,
return ret;
}
+#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \
Artem-B wrote:
Can we merge `INTRINSIC*` and `MINTRINSIC*` mac
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/143664
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/143664
Artem-B wrote:
@svenvh appears to be the current maintainer of OpenCL in LLVM.
https://github.com/llvm/llvm-project/pull/143331
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/142857
Artem-B wrote:
@yxsamliu Sam, do you have any thoughts on this?
https://github.com/llvm/llvm-project/pull/142857
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/142857
The variables have implicit host-side shadow instances, and an explicit address
space attribute breaks them on the host.
From e2e8da0271ae11711dbd54f6e8d9ff498f3226d4 Mon Sep 17 00:00:00 2001
From: Artem Belevich
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/141036
@@ -177,6 +177,7 @@ let Attributes = [NoReturn] in {
}
let Attributes = [NoThrow] in {
def __nvvm_nanosleep : NVPTXBuiltinSMAndPTX<"void(unsigned int)", SM_70,
PTX63>;
+ def __nvvm_pm_event_mask : NVPTXBuiltin<"void(unsigned short)">;
Artem-B wrote:
The ar
https://github.com/Artem-B approved this pull request.
Builtin signature needs a fix, but LGTM otherwise.
https://github.com/llvm/llvm-project/pull/141278
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/141278
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/141143
@@ -1349,6 +1349,10 @@ static bool upgradeIntrinsicFunction1(Function *F,
Function *&NewFn,
else if (Name == "clz.ll" || Name == "popc.ll" || Name == "h2f" ||
Name == "swap.lo.hi.b64")
Expand = true;
+ else if (Name == "barrier0" || Name == "b
@@ -170,6 +170,8 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public
TargetInfo {
Opts["cl_khr_global_int32_extended_atomics"] = true;
Opts["cl_khr_local_int32_base_atomics"] = true;
Opts["cl_khr_local_int32_extended_atomics"] = true;
+
+Opts["__opencl_c_
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/138706
@@ -2927,6 +2928,20 @@ void Verifier::visitFunction(const Function &F) {
"Calling convention does not support varargs or "
"perfect forwarding!",
&F);
+if (F.getCallingConv() == CallingConv::PTX_Kernel &&
+TT.getOS() == Triple::CUDA) {
@@ -5734,6 +5734,9 @@ def nobuiltininc : Flag<["-"], "nobuiltininc">,
def nogpuinc : Flag<["-"], "nogpuinc">, Group,
HelpText<"Do not add include paths for CUDA/HIP and"
" do not include the default CUDA/HIP wrapper headers">;
+def gpuinc : Flag<["-"], "gpuinc">, Group,
+
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/140106
https://github.com/Artem-B commented:
Being able to override a flag is a good thing to have, IMO. There are builds
where the owners of the leaf targets do not have much control over which options
are set by the "default" compilation, so they need to rely on being able to
override preceding opti
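
One way to picture the override behavior discussed in this comment is a last-one-wins scan over the command line. The sketch below is plain C++ and does not use Clang's option-parsing API; the function name `gpuIncludesEnabled` and the default of `true` are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not Clang code): decides whether GPU wrapper includes
// stay enabled, letting the last of -gpuinc / -nogpuinc on the command line
// win. The default value is an assumption for this sketch.
bool gpuIncludesEnabled(const std::vector<std::string> &Args,
                        bool Default = true) {
  bool Enabled = Default;
  for (const std::string &Arg : Args) {
    if (Arg == "-gpuinc")
      Enabled = true;   // a later positive flag re-enables the includes
    else if (Arg == "-nogpuinc")
      Enabled = false;  // a later negative flag disables them again
  }
  return Enabled;
}
```

With this shape, a leaf target can append `-gpuinc` after a `-nogpuinc` inherited from a default configuration and get the includes back without editing the earlier options.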
@@ -1399,19 +1399,27 @@ void NVPTXAsmPrinter::emitFunctionParamList(const
Function *F, raw_ostream &O) {
if (PTy) {
O << "\t.param .u" << PTySizeInBits << " .ptr";
+bool IsCUDA = static_cast(TM).getDrvInterface()
==
+ NVPTX::CUDA;
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/139164
Artem-B wrote:
No wrappers -- no problems. :-)
https://github.com/llvm/llvm-project/pull/139164
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/139164
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/139164
From a1d60feed11174b9d2106b57ee15ff6d9bc56fa4 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 8 May 2025 14:43:47 -0700
Subject: [PATCH] [CUDA] remove obsolete GPU-side __constexpr* wrappers
libc++ no
Artem-B wrote:
> Right now this checks for `libc++` less than 14. Is that still relevant
> following that change?
That's a very good point. Looks like those `__constexpr_fmin/fmax` are gone now
and we do not need them any more.
https://github.com/llvm/llvm-project/pull/139164
Artem-B wrote:
@jhuber6 @ldionne One concern I have for this change is that it will break
folks who will use older libc++ with the new Clang + wrapper headers.
Is older libc++ expected to work with non-matching clang version? If the
expectation is that libc++ and clang are from the same versio
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/139244
Artem-B wrote:
@cgmb
> I would suggest that we should either (a) change the default GPU target to
> native and make the failure to detect the user’s GPU into a hard compiler
> error, or (b) change the default GPU target to SPIR-V so that it works on
> every machine.
The thing is that the se
Artem-B wrote:
@jhuber6 do you think we can use `native` instead? I think it would be a
somewhat better option here.
If we have to choose a GPU variant by default, we may as well choose the actual
GPU, rather than a conditional choice between generic SPIR-V or an old GPU,
which has the disadva
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/138162
@@ -109,3 +109,48 @@ void func2(void) {
void func3(void) {
float a[16][1] = {{0.}};
}
+
+// CL12-LABEL: define dso_local void @wrong_store_type_private_pointer_alloca(
+// CL12-SAME: ) #[[ATTR0]] {
+// CL12-NEXT: [[ENTRY:.*:]]
+// CL12-NEXT:[[PLONG:%.*]] = alloca i64, al
@@ -2376,9 +2376,14 @@ NamedDecl *Sema::LazilyCreateBuiltin(IdentifierInfo *II,
unsigned ID,
return nullptr;
}
+ // Warn for implicit uses of header dependent libraries,
+ // except in system headers.
if (!ForRedeclaration &&
(Context.BuiltinInfo.isPredefine
Artem-B wrote:
OK. This makes sense.
> sorry this change is so drawn out :)
What matters is that you're making progress, and I appreciate your work on
getting this issue sorted out the right way.
https://github.com/llvm/llvm-project/pull/138205
Artem-B wrote:
Something does not add up here. AFAICT, using builtins w/o explicitly declaring
them is something that's done all the time. https://godbolt.org/z/ha47W53dh
In that sense, we should not be needing to filter out the diagnostics coming
from the system headers only. There should not
@@ -0,0 +1,23 @@
+// expected-no-diagnostics
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -aux-triple
amdgcn-amd-amdhsa -fsyntax-only -verify -xhip %s
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsyntax-only -fcuda-is-device
-verify -xhip %s
+
+#include "Inputs/cuda
https://github.com/Artem-B commented:
LGTM in principle.
Now the question is -- how do we test it? There are multiple libstdc++ library
versions in the wild and we must not break any of them. We do have some testing
on CUDA test bots (which I've just discovered to be silently broken for a whil
@@ -0,0 +1,35 @@
+// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail()
+// to trigger compilation errors when the __glibcxx_assert(cond) macro
+// is used in a constexpr context.
+// Compilation fails when using code from the libstdc++ (such as std::array) on
https://github.com/Artem-B approved this pull request.
LGTM w/ a nit.
https://github.com/llvm/llvm-project/pull/136645
@@ -1100,3 +1101,49 @@ std::string SemaCUDA::getConfigureFuncName() const {
// Legacy CUDA kernel configuration call
return "cudaConfigureCall";
}
+
+// Record any local constexpr variables that are passed one way on the host
+// and another on the device.
+void SemaCUDA::r
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/136645
@@ -25,6 +25,7 @@ enum AddressSpace : unsigned {
ADDRESS_SPACE_CONST = 4,
ADDRESS_SPACE_LOCAL = 5,
ADDRESS_SPACE_TENSOR = 6,
+ ADDRESS_SPACE_SHARED_CLUSTER = 7,
Artem-B wrote:
PTX docs say:
```
If no sub-qualifier is specified with the .shared state sp
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/128222
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/136133
@@ -36,6 +36,28 @@ typedef __SIZE_TYPE__ size_t;
#include
+#ifdef __ARM_ACLE
+// arm_acle.h needs some stdint types, but -ffreestanding prevents us from
Artem-B wrote:
Shouldn't that be fixed in arm_acle.h itself so it includes the headers with
the types i
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/136133
@@ -982,8 +982,9 @@ void NVPTXDAGToDAGISel::SelectAddrSpaceCast(SDNode *N) {
case ADDRESS_SPACE_SHARED:
Opc = TM.is64Bit() ? NVPTX::cvta_shared_64 : NVPTX::cvta_shared;
break;
-case ADDRESS_SPACE_DSHARED:
- Opc = TM.is64Bit() ? NVPTX::cvta_dshared_64 :
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/135644
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/135644
@@ -1034,6 +1034,10 @@ Value *CodeGenFunction::EmitNVPTXBuiltinExpr(unsigned
BuiltinID,
case NVPTX::BI__nvvm_fmin_xorsign_abs_f16x2:
return MakeHalfType(Intrinsic::nvvm_fmin_xorsign_abs_f16x2, BuiltinID, E,
*this);
+ case NVPTX::BI__nvvm_abs_bf16
@@ -411,6 +412,13 @@ static Instruction
*convertNvvmIntrinsicToLlvm(InstCombiner &IC,
}
return nullptr;
}
+ case SPC_Fabs: {
+if (!II->getType()->isDoubleTy())
+ return nullptr;
+auto *Fabs = Intrinsic::getOrInsertDeclaration(
+II->getModule(),
Artem-B wrote:
I wish PTX would be a bit more consistent about naming things. Documentation
calls it distributed shared memory (and it is distributed, and is shared), but
the PTX instructions, compiler builtins and intrinsics use shared::cluster (as
opposed to regular shared AKA shared::cta).
@@ -703,6 +703,41 @@ let hasSideEffects = false in {
defm CVT_to_tf32_rz_satf : CVT_TO_TF32<"rz.satfinite", [hasPTX<86>,
hasSM<100>]>;
defm CVT_to_tf32_rn_relu_satf : CVT_TO_TF32<"rn.relu.satfinite",
[hasPTX<86>, hasSM<100>]>;
defm CVT_to_tf32_rz_relu_satf : CVT_TO_TF
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/134345
https://github.com/Artem-B approved this pull request.
LGTM in general, with an intrinsic naming nit.
https://github.com/llvm/llvm-project/pull/134345
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listin
@@ -596,6 +605,28 @@ def __nvvm_e4m3x2_to_f16x2_rn_relu :
NVPTXBuiltinSMAndPTX<"_Vector<2, __fp16>(sh
def __nvvm_e5m2x2_to_f16x2_rn : NVPTXBuiltinSMAndPTX<"_Vector<2,
__fp16>(short)", SM_89, PTX81>;
def __nvvm_e5m2x2_to_f16x2_rn_relu : NVPTXBuiltinSMAndPTX<"_Vector<2,
__fp16>