[libcxx] [pstl] [lld] [llvm] [lldb] [clang] [libc] [compiler-rt] [mlir] [clang-tools-extra] [openmp] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,5 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{check} = %clang -### -c -mcmodel=medium Artem-B wrote: In this particular case, the changes we test (and the error messages) were originating in th

[libcxx] [pstl] [lld] [llvm] [lldb] [clang] [libc] [compiler-rt] [mlir] [clang-tools-extra] [openmp] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79222 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Artem Belevich via cfe-commits
Artem-B wrote: This option may not work as well as one would hope. Problem #1 is that it will drastically slow down compilation for some users. NVIDIA GPU drivers are loaded on demand, and the process takes a while (O(second), depending on the kind and number of GPUs). If you build on a headless m

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > It's not unspecified per-se, it just picks the one the CUDA driver assigned > to ID zero, so it will correspond to the layman using a default device if > loaded into CUDA. The default "fastest card first" is also somewhat flaky. First, the "default" enumeration order is affec

[clang-tools-extra] [libc] [llvm] [compiler-rt] [lld] [libcxx] [lldb] [flang] [clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This is what we already do for `--offload-arch=native` on CUDA, but this is > somewhat tangential. I've updated this patch to present the warning in the > case of multiple GPUs being detected, so I don't think there's a concern here > with the user being confused. If they have

[clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I think the semantics of native on other architectures are clear enough here. I don't think we have the same idea about that. Let's spell it out, so there's no confusion. [GCC manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) says: > Using -march=na

[libcxx] [flang] [lldb] [clang] [clang-tools-extra] [lld] [llvm] [compiler-rt] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This method of compilation is not like CUDA, so we can't target all the GPUs > at the same time. I think this is the key fact I was missing. If the patch is only for a standalone compilation which does not do multi-GPU compilation in principle, then your approach makes sense.

[lld] [lldb] [clang-tools-extra] [clang] [libcxx] [libc] [flang] [llvm] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM, as we can only handle a single GPU target during compilation. https://github.com/llvm/llvm-project/pull/79373

[llvm] [clang] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79765

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: 'activemask' is a rather peculiar instruction which may not be a good candidate for exposing it to LLVM. The problem is that it can 'observe' the past branch decisions and reflects the state of not-yet-reconverged conditional branches. LLVM does not take it into account. Opaque
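The remark about 'observing' past branch decisions can be illustrated with a toy host-side model in C++. This is only a sketch under stated assumptions: it is not GPU code, `lanesTakingBranch` and `isEvenLane` are hypothetical names, and a 32-lane warp is assumed. It shows that the set of active lanes queried inside a divergent branch differs from the set queried before it, so moving such a query across a branch changes its result.

```cpp
#include <cassert>
#include <cstdint>

// Toy host-side model, not real GPU code: bit i of a mask is set when
// lane i of an assumed 32-lane warp is executing at the point of the query.
// Given the lanes active before a branch, return the lanes that enter it.
std::uint32_t lanesTakingBranch(std::uint32_t activeBefore,
                                bool (*predicate)(unsigned lane)) {
  std::uint32_t mask = 0;
  for (unsigned lane = 0; lane < 32; ++lane) {
    const bool wasActive = ((activeBefore >> lane) & 1u) != 0;
    if (wasActive && predicate(lane))
      mask |= (std::uint32_t{1} << lane);
  }
  return mask;
}

// Hypothetical branch condition: only even-numbered lanes take the branch.
bool isEvenLane(unsigned lane) { return lane % 2 == 0; }
```

In this model a query placed before the branch sees all 32 lanes, while the same query inside the branch sees only the even ones. A transformation that treats the query as an ordinary side-effect-free value and hoists or sinks it across the branch would therefore silently change what the program computes.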

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: https://bugs.llvm.org/show_bug.cgi?id=35249 https://github.com/llvm/llvm-project/pull/79768

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync : [IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback], "llvm.nvvm.vote.ballot.sync">, ClangBuiltin<"__nvvm_vote_ballot_sync">; +// +// ACTIVEMASK +// +def int_nvvm_activemask : + Intrinsic<[llvm_i32_ty]

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; Artem-B wrote: Wh

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; Artem-B wrote: Wh

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; Artem-B wrote: I'

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync : [IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback], "llvm.nvvm.vote.ballot.sync">, ClangBuiltin<"__nvvm_vote_ballot_sync">; +// +// ACTIVEMASK +// +def int_nvvm_activemask : + Intrinsic<[llvm_i32_ty]

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79768

[llvm] [clang] [NVPTX] Add builtin support for 'nanosleep' PTX instruction (PR #79888)

2024-01-29 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79888

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: Relying on something *not* being defined is probably not the best way to handle the 'generic' target. For starters, it makes it hard or impossible to recreate the same compilation state by undoing an already-specified option. It also breaks the established assumption that there *is* a defau

[clang] [CUDA] Change '__activemask' to use '__nvvm_activemask()' (PR #79892)

2024-01-29 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79892

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I think there's some precedent from both vendors to treat missing attributes > as a more generic target. It sounds more like a bug than a feature to me. The major difference between "you get sm_xx by default" and this "you get generic by default" is that with specific sm_XX,

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Right now if you specify target-cpu you get target-cpu attributes, which is > what we don't want. I'm fine handling 'generic' in a special way under the hood and not specifying target-CPU. My concern is about user-facing interface. Command line options must be overridable.
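The overridability concern can be sketched in C++ under the usual last-option-wins convention. This is an illustration only: `resolveArch` is a hypothetical helper, not clang's driver code, and `generic` is used here as an assumed explicit spelling of the generic target. If a target can only be selected by omitting the option, a later argument has no way to undo an earlier `-march=sm_XX`; an explicit value makes that possible.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only; this is not clang's driver code.
// Returns the value of the last "-march=<value>" argument, or std::nullopt
// when no such argument is present.
std::optional<std::string> resolveArch(const std::vector<std::string> &args) {
  const std::string prefix = "-march=";
  std::optional<std::string> result;
  for (const std::string &arg : args) {
    // Later options override earlier ones.
    if (arg.compare(0, prefix.size(), prefix) == 0)
      result = arg.substr(prefix.size());
  }
  return result;
}
```

With this rule, appending `-march=generic` to a command line that already contains `-march=sm_80` yields `generic`, whereas a scheme that encodes 'generic' as "no option given" offers nothing to append.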

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-01-16 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,19 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only \ +// RUN: -isystem %S/Inputs -verify %s +// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -isystem %S/Inputs -fcuda-is-device -verify %s +// RUN: %clang_cc1 -triple x86_

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-01-16 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,19 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only \ +// RUN: -isystem %S/Inputs -verify %s +// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -isystem %S/Inputs -fcuda-is-device -verify %s +// RUN: %clang_cc1 -triple x86_

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-01-16 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,19 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only \ +// RUN: -isystem %S/Inputs -verify %s +// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only \ +// RUN: -isystem %S/Inputs -fcuda-is-device -verify %s +// RUN: %clang_cc1 -triple x86_

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-01-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/77359

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: It would be great to add some tests for local AS null pointers for NVPTX and AMDGPU back-ends. https://github.com/llvm/llvm-project/pull/78759

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-22 Thread Artem Belevich via cfe-commits
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV, bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const { return false; } + +llvm::Constant * +NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM, +

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-01-22 Thread Artem Belevich via cfe-commits
Artem-B wrote: > * Address space cast of nullptr in local_space into a generic_space for the > CUDA backend. I think you mean "NVPTX back-end". CUDA is a front-end entity (C++ w/ GPU extensions) https://github.com/llvm/llvm-project/pull/78759

[clang] [HIP] Document func ptr and virtual func (PR #68126)

2023-10-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/68126

[clang] [NVPTX] Fixed some wmma store builtins that had non-const src param. (PR #69354)

2023-10-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/69354

[clang] [CUDA][HIP] Fix std::is_invocable (PR #70369)

2023-10-26 Thread Artem Belevich via cfe-commits
@@ -283,12 +283,18 @@ set(cuda_wrapper_files cuda_wrappers/cmath cuda_wrappers/complex cuda_wrappers/new + cuda_wrappers/type_traits ) set(cuda_wrapper_bits_files cuda_wrappers/bits/shared_ptr_base.h cuda_wrappers/bits/basic_string.h cuda_wrappers/bits/basic

[clang] [CUDA][HIP] Fix std::is_invocable (PR #70369)

2023-10-27 Thread Artem Belevich via cfe-commits
@@ -283,12 +283,18 @@ set(cuda_wrapper_files cuda_wrappers/cmath cuda_wrappers/complex cuda_wrappers/new + cuda_wrappers/type_traits ) set(cuda_wrapper_bits_files cuda_wrappers/bits/shared_ptr_base.h cuda_wrappers/bits/basic_string.h cuda_wrappers/bits/basic

r369777 - Fixed a typo.

2019-08-23 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Aug 23 09:24:17 2019 New Revision: 369777 URL: http://llvm.org/viewvc/llvm-project?rev=369777&view=rev Log: Fixed a typo. Modified: cfe/trunk/lib/Sema/SemaDecl.cpp Modified: cfe/trunk/lib/Sema/SemaDecl.cpp URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/Sem

r370792 - [CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+

2019-09-03 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Sep 3 10:31:58 2019 New Revision: 370792 URL: http://llvm.org/viewvc/llvm-project?rev=370792&view=rev Log: [CUDA] Use activemask.b32 instruction to implement __activemask w/ CUDA-9.2+ vote.ballot instruction is gone in recent CUDA versions and vote.sync.ballot can not be us

Re: [clang] 23058f9 - [OPENMP]Do not use RTTI by default for NVPTX devices.

2020-01-15 Thread Artem Belevich via cfe-commits
Alexey, This breaks compilation of our cuda code which happens to transitively include protobuf headers. Can you, please, revert it for now until we figure out how RTTI should be handled? --Artem On Tue, Jan 14, 2020 at 3:15 PM Alexey Bataev via cfe-commits < cfe-commits@lists.llvm.org> wrote:

Re: [clang] 23058f9 - [OPENMP]Do not use RTTI by default for NVPTX devices.

2020-01-15 Thread Artem Belevich via cfe-commits
Thank you. In general, RTTI should probably be treated similarly to how we deal with inline assembly and ignore errors if they are in the code that we're not going to codegen during this side of compilation. E.g. during host-side compilation we don't complain about GPU-side registers in inline assem

Re: [clang] 23058f9 - [OPENMP]Do not use RTTI by default for NVPTX devices.

2020-01-15 Thread Artem Belevich via cfe-commits
On Wed, Jan 15, 2020 at 2:52 PM Alexey Bataev wrote: > 1. The problem is that it does not produce errors, > ATM, it does produce errors when it's disabled. > it leads to the emission of some declaration that cannot be resolved by > the linker. This is what I was trying to avoid. > I'm OK with disab

Re: [clang] 23058f9 - [OPENMP]Do not use RTTI by default for NVPTX devices.

2020-01-15 Thread Artem Belevich via cfe-commits
On Wed, Jan 15, 2020 at 3:09 PM Alexey Bataev wrote: > And I disabled it only for device side, which is NVPTX, no? Can host side > target class report that the target is NVPTX? If you look at the patch, it > disables RTTI only if current triple is NVPTX. Can it be true for the host? > You are corr

[clang] 30514f0 - [CUDA] Added conversion functions to builtin vars.

2020-09-24 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-09-24T14:33:04-07:00 New Revision: 30514f0afa3ee1e6da6bf9c41e83c28e884f0740 URL: https://github.com/llvm/llvm-project/commit/30514f0afa3ee1e6da6bf9c41e83c28e884f0740 DIFF: https://github.com/llvm/llvm-project/commit/30514f0afa3ee1e6da6bf9c41e83c28e884f0740.diff

[clang] 016e4eb - [DWARF] Allow toolchain to adjust specified DWARF version.

2020-12-09 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-12-09T16:34:34-08:00 New Revision: 016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd URL: https://github.com/llvm/llvm-project/commit/016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd DIFF: https://github.com/llvm/llvm-project/commit/016e4ebfde28d6bb1ab6399fc8abd8cfc6a1d9fd.diff

[clang] 0936655 - [CUDA] Do not diagnose host/device variable access in dependent types.

2020-12-14 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-12-14T11:53:18-08:00 New Revision: 0936655bac78f6e9cb84dc3feb30c32012100839 URL: https://github.com/llvm/llvm-project/commit/0936655bac78f6e9cb84dc3feb30c32012100839 DIFF: https://github.com/llvm/llvm-project/commit/0936655bac78f6e9cb84dc3feb30c32012100839.diff

[clang] cdbf6bf - [HIP] Use argv[0] as the default choice for the Executable name.

2020-11-03 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-11-03T10:31:39-08:00 New Revision: cdbf6bfdc7d15fc6a078c7773f142042a11d2c1b URL: https://github.com/llvm/llvm-project/commit/cdbf6bfdc7d15fc6a078c7773f142042a11d2c1b DIFF: https://github.com/llvm/llvm-project/commit/cdbf6bfdc7d15fc6a078c7773f142042a11d2c1b.diff

[clang] 9a46505 - [CUDA] Unbreak CUDA compilation with -std=c++20

2020-11-19 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-11-19T10:35:47-08:00 New Revision: 9a465057a64dba8a8614424d26136f5c0452bcc3 URL: https://github.com/llvm/llvm-project/commit/9a465057a64dba8a8614424d26136f5c0452bcc3 DIFF: https://github.com/llvm/llvm-project/commit/9a465057a64dba8a8614424d26136f5c0452bcc3.diff

[clang] 4326792 - [CUDA] Another attempt to fix early inclusion of <new> from libstdc++

2020-12-04 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-12-04T12:03:35-08:00 New Revision: 43267929423bf768bbbcc65e47a07e37af7f4e22 URL: https://github.com/llvm/llvm-project/commit/43267929423bf768bbbcc65e47a07e37af7f4e22 DIFF: https://github.com/llvm/llvm-project/commit/43267929423bf768bbbcc65e47a07e37af7f4e22.diff

[clang] 65d2064 - [CUDA] Improve clang's ability to detect recent CUDA versions.

2020-10-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-10-23T10:03:29-07:00 New Revision: 65d206484c54177641d4b11d42cab1f1acc8c0c7 URL: https://github.com/llvm/llvm-project/commit/65d206484c54177641d4b11d42cab1f1acc8c0c7 DIFF: https://github.com/llvm/llvm-project/commit/65d206484c54177641d4b11d42cab1f1acc8c0c7.diff

[clang] e7fe125 - [CUDA] Extract CUDA version from cuda.h if version.txt is not found

2020-10-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-10-23T10:03:30-07:00 New Revision: e7fe125b776bf08d95e60ff3354a5c836218a0e6 URL: https://github.com/llvm/llvm-project/commit/e7fe125b776bf08d95e60ff3354a5c836218a0e6 DIFF: https://github.com/llvm/llvm-project/commit/e7fe125b776bf08d95e60ff3354a5c836218a0e6.diff

[clang] f38a9e5 - [CUDA] Allow local static variables with target attributes.

2020-11-02 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-11-02T14:37:13-08:00 New Revision: f38a9e51178add132d2c8ae160787fb2175a48a4 URL: https://github.com/llvm/llvm-project/commit/f38a9e51178add132d2c8ae160787fb2175a48a4 DIFF: https://github.com/llvm/llvm-project/commit/f38a9e51178add132d2c8ae160787fb2175a48a4.diff

[clang] 0a3ebb4 - Revert "[CUDA] Allow local static variables with target attributes."

2020-11-02 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-11-02T15:09:07-08:00 New Revision: 0a3ebb4d8d988e063e395621d162fa224fa4fb08 URL: https://github.com/llvm/llvm-project/commit/0a3ebb4d8d988e063e395621d162fa224fa4fb08 DIFF: https://github.com/llvm/llvm-project/commit/0a3ebb4d8d988e063e395621d162fa224fa4fb08.diff

[clang] be86b67 - [CUDA] Allow local static variables with target attributes.

2020-11-03 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-11-03T10:30:38-08:00 New Revision: be86b6773b6ba4d101a848e109540548181d2ed5 URL: https://github.com/llvm/llvm-project/commit/be86b6773b6ba4d101a848e109540548181d2ed5 DIFF: https://github.com/llvm/llvm-project/commit/be86b6773b6ba4d101a848e109540548181d2ed5.diff

[clang] 02c2468 - [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

2021-05-17 Thread Artem Belevich via cfe-commits
Author: Stuart Adams Date: 2021-05-17T09:46:59-07:00 New Revision: 02c2468864bbb37f7b279aff84961815c1500b6c URL: https://github.com/llvm/llvm-project/commit/02c2468864bbb37f7b279aff84961815c1500b6c DIFF: https://github.com/llvm/llvm-project/commit/02c2468864bbb37f7b279aff84961815c1500b6c.diff

[clang] f226e28 - [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX redux.sync instructions

2021-05-17 Thread Artem Belevich via cfe-commits
Author: Steffen Larsen Date: 2021-05-17T09:46:59-07:00 New Revision: f226e28a880f8e40b1bfd4c77b9768a667372d22 URL: https://github.com/llvm/llvm-project/commit/f226e28a880f8e40b1bfd4c77b9768a667372d22 DIFF: https://github.com/llvm/llvm-project/commit/f226e28a880f8e40b1bfd4c77b9768a667372d22.diff

[clang] 9a75c06 - [CUDA] Work around compatibility issue with libstdc++ 11.1.0

2021-05-24 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-05-24T11:07:09-07:00 New Revision: 9a75c06cd9d94d3fd13c47a01044da97b98cf26b URL: https://github.com/llvm/llvm-project/commit/9a75c06cd9d94d3fd13c47a01044da97b98cf26b DIFF: https://github.com/llvm/llvm-project/commit/9a75c06cd9d94d3fd13c47a01044da97b98cf26b.diff

[clang] 6b20ea6 - [CUDA] Pass ExecConfig through BuildCallToMemberFunction

2021-09-16 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-09-16T11:18:12-07:00 New Revision: 6b20ea6963561f2c91490c0993390b7f2ff8f71c URL: https://github.com/llvm/llvm-project/commit/6b20ea6963561f2c91490c0993390b7f2ff8f71c DIFF: https://github.com/llvm/llvm-project/commit/6b20ea6963561f2c91490c0993390b7f2ff8f71c.diff

[clang] fd582ee - [CUDA] Move CUDA SDK include path further down the include search path.

2021-09-28 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-09-28T11:29:28-07:00 New Revision: fd582eeffe582665eacac522617a15e17e9872cd URL: https://github.com/llvm/llvm-project/commit/fd582eeffe582665eacac522617a15e17e9872cd DIFF: https://github.com/llvm/llvm-project/commit/fd582eeffe582665eacac522617a15e17e9872cd.diff

[clang] 2aa01cc - [CUDA, NVPTX] Allow targeting sm_86 GPUs.

2021-02-09 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-02-09T11:01:10-08:00 New Revision: 2aa01ccec30109fbcc65934c5d7c8907793e0660 URL: https://github.com/llvm/llvm-project/commit/2aa01ccec30109fbcc65934c5d7c8907793e0660 DIFF: https://github.com/llvm/llvm-project/commit/2aa01ccec30109fbcc65934c5d7c8907793e0660.diff

[clang] 6a9cf21 - [CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.

2021-08-06 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-06T11:13:52-07:00 New Revision: 6a9cf21f5a2dcd02f90075d6d3576a87f1abd8a9 URL: https://github.com/llvm/llvm-project/commit/6a9cf21f5a2dcd02f90075d6d3576a87f1abd8a9 DIFF: https://github.com/llvm/llvm-project/commit/6a9cf21f5a2dcd02f90075d6d3576a87f1abd8a9.diff

[clang] cab5f89 - [Clang] allow overriding -fbasic-block-sections

2021-06-30 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-06-30T14:32:08-07:00 New Revision: cab5f89cfd9efa9166e1362972e460323b8254ef URL: https://github.com/llvm/llvm-project/commit/cab5f89cfd9efa9166e1362972e460323b8254ef DIFF: https://github.com/llvm/llvm-project/commit/cab5f89cfd9efa9166e1362972e460323b8254ef.diff

[clang] 01d3a3d - [CUDA] Only allow NVIDIA offload-arch during CUDA compilation.

2021-07-13 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-07-13T11:09:14-07:00 New Revision: 01d3a3dcabaf862581b1d1aee604fcee6a18b240 URL: https://github.com/llvm/llvm-project/commit/01d3a3dcabaf862581b1d1aee604fcee6a18b240 DIFF: https://github.com/llvm/llvm-project/commit/01d3a3dcabaf862581b1d1aee604fcee6a18b240.diff

[clang] 25629bb - Fix cuda-bad-arch.cu test.

2021-07-13 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-07-13T11:57:25-07:00 New Revision: 25629bb45f0a4b8c8e99dbde4f4a7e3d980b9fd7 URL: https://github.com/llvm/llvm-project/commit/25629bb45f0a4b8c8e99dbde4f4a7e3d980b9fd7 DIFF: https://github.com/llvm/llvm-project/commit/25629bb45f0a4b8c8e99dbde4f4a7e3d980b9fd7.diff

[clang] 32e0645 - [CUDA] Remove `noreturn` attribute from __assertfail().

2021-03-01 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-03-01T13:59:22-08:00 New Revision: 32e0645276230bb5b736e378860df3b92b1f4ba8 URL: https://github.com/llvm/llvm-project/commit/32e0645276230bb5b736e378860df3b92b1f4ba8 DIFF: https://github.com/llvm/llvm-project/commit/32e0645276230bb5b736e378860df3b92b1f4ba8.diff

[clang] 0e8a414 - [CUDA, NVPTX] Added basic __bf16 support for NVPTX.

2022-10-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2022-10-25T11:08:06-07:00 New Revision: 0e8a414ab3d330ebb2996ec95d8141618ee0278b URL: https://github.com/llvm/llvm-project/commit/0e8a414ab3d330ebb2996ec95d8141618ee0278b DIFF: https://github.com/llvm/llvm-project/commit/0e8a414ab3d330ebb2996ec95d8141618ee0278b.diff

[clang] f3a2cbc - Refactored CUDA version housekeeping to use less boilerplate.

2022-10-07 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2022-10-07T13:59:23-07:00 New Revision: f3a2cbcf97f5c7a58f9d4c5588c2bea8028f8c58 URL: https://github.com/llvm/llvm-project/commit/f3a2cbcf97f5c7a58f9d4c5588c2bea8028f8c58 DIFF: https://github.com/llvm/llvm-project/commit/f3a2cbcf97f5c7a58f9d4c5588c2bea8028f8c58.diff

[clang] 9a01cca - Add support for CUDA-11.8 and sm_{87,89,90} GPUs.

2022-10-07 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2022-10-07T13:59:28-07:00 New Revision: 9a01cca66036087e4da37c221a4b911818910524 URL: https://github.com/llvm/llvm-project/commit/9a01cca66036087e4da37c221a4b911818910524 DIFF: https://github.com/llvm/llvm-project/commit/9a01cca66036087e4da37c221a4b911818910524.diff

[clang] a10eb07 - Do not append terminating NUL to the binary string with embedded fatbin.

2022-10-17 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2022-10-17T15:39:39-07:00 New Revision: a10eb07d1acc2f132b4d0cf522097814a8340b47 URL: https://github.com/llvm/llvm-project/commit/a10eb07d1acc2f132b4d0cf522097814a8340b47 DIFF: https://github.com/llvm/llvm-project/commit/a10eb07d1acc2f132b4d0cf522097814a8340b47.diff

[clang] 8173405 - [CUDA] make use of deprecated texture API conditional on CUDA version.

2022-11-17 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2022-11-17T11:38:16-08:00 New Revision: 817340569bf98b696329c53508a0d87cc0daec25 URL: https://github.com/llvm/llvm-project/commit/817340569bf98b696329c53508a0d87cc0daec25 DIFF: https://github.com/llvm/llvm-project/commit/817340569bf98b696329c53508a0d87cc0daec25.diff

[clang] 1ad5f6a - [CUDA] added cmath wrappers to unbreak CUDA compilation after D79555

2023-01-12 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-01-12T15:37:50-08:00 New Revision: 1ad5f6af816a439a84f7d8fe3dff87dd1f8a39ba URL: https://github.com/llvm/llvm-project/commit/1ad5f6af816a439a84f7d8fe3dff87dd1f8a39ba DIFF: https://github.com/llvm/llvm-project/commit/1ad5f6af816a439a84f7d8fe3dff87dd1f8a39ba.diff

[clang] [clang-repl][CUDA] Move CUDA module registration to beginning of global_ctors (PR #66658)

2023-09-18 Thread Artem Belevich via cfe-commits
@@ -794,7 +794,7 @@ void CodeGenModule::Release() { AddGlobalCtor(ObjCInitFunction); if (Context.getLangOpts().CUDA && CUDARuntime) { if (llvm::Function *CudaCtorFunction = CUDARuntime->finalizeModule()) - AddGlobalCtor(CudaCtorFunction); + AddGlobalCtor(C

[clang] [Driver][NVPTX] Add a warning that device debug info does not work with optimizations (PR #65327)

2023-09-18 Thread Artem Belevich via cfe-commits
@@ -28,6 +28,17 @@ // RUN: --offload-arch=sm_35 --cuda-path=%S/Inputs/CUDA/usr/local/cuda \ // RUN: | FileCheck -check-prefixes=CHECK,ARCH64,SM35,RDC %s +// Compiling -O{1,2,3,4,fast,s,z} with -g does not pass -g debug info to ptxas. +// NOTE: This is because ptxas does not

[clang] [Driver][NVPTX] Add a warning that device debug info does not work with optimizations (PR #65327)

2023-09-18 Thread Artem Belevich via cfe-commits
@@ -413,13 +413,25 @@ void NVPTX::Assembler::ConstructJob(Compilation &C, const JobAction &JA, // TODO: Perhaps we should map host -O2 to ptxas -O3. -O3 is ptxas's // default, so it may correspond more closely to the spirit of clang -O2. +bool noOptimization = A->

[clang] [clang-repl][CUDA] Move CUDA module registration to beginning of global_ctors (PR #66658)

2023-09-18 Thread Artem Belevich via cfe-commits
@@ -794,7 +794,7 @@ void CodeGenModule::Release() { AddGlobalCtor(ObjCInitFunction); if (Context.getLangOpts().CUDA && CUDARuntime) { if (llvm::Function *CudaCtorFunction = CUDARuntime->finalizeModule()) - AddGlobalCtor(CudaCtorFunction); + AddGlobalCtor(C

[clang] [clang-repl][CUDA] Move CUDA module registration to beginning of global_ctors (PR #66658)

2023-09-18 Thread Artem Belevich via cfe-commits
@@ -794,7 +794,7 @@ void CodeGenModule::Release() { AddGlobalCtor(ObjCInitFunction); if (Context.getLangOpts().CUDA && CUDARuntime) { if (llvm::Function *CudaCtorFunction = CUDARuntime->finalizeModule()) - AddGlobalCtor(CudaCtorFunction); + AddGlobalCtor(C

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -11836,6 +11836,10 @@ def err_sycl_special_type_num_init_method : Error< "types with 'sycl_special_class' attribute must have one and only one '__init' " "method defined">; +def warn_cuda_maxclusterrank_sm_90 : Warning< + "maxclusterrank requires sm_90 or higher, CUDA

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -5650,34 +5665,51 @@ static Expr *makeLaunchBoundsArgExpr(Sema &S, Expr *E, CUDALaunchBoundsAttr * Sema::CreateLaunchBoundsAttr(const AttributeCommonInfo &CI, Expr *MaxThreads, - Expr *MinBlocks) { - CUDALaunchBoundsAttr TmpAttr(Context, CI, Max

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,45 @@ +// RUN: %clang_cc1 -std=c++11 -fsyntax-only -triple nvptx-unknown-unknown -target-cpu sm_90 -verify %s + +#include "Inputs/cuda.h" + +__launch_bounds__(128, 7) void Test2Args(void); +__launch_bounds__(128) void Test1Arg(void); + +__launch_bounds__(0x) v

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -5607,6 +5607,21 @@ bool Sema::CheckRegparmAttr(const ParsedAttr &AL, unsigned &numParams) { return false; } +// Helper to get CudaArch. +static CudaArch getCudaArch(const TargetInfo &TI) { Artem-B wrote: Considering that we do have TargetInfo pointer h

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -12,7 +12,7 @@ __launch_bounds__(0x1) void TestWayTooBigArg(void); // expected- __launch_bounds__(-128, 7) void TestNegArg1(void); // expected-warning {{'launch_bounds' attribute parameter 0 is negative and will be ignored}} __launch_bounds__(128, -7) void T

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-21 Thread Artem Belevich via cfe-commits
@@ -537,59 +537,46 @@ void NVPTXAsmPrinter::emitKernelFunctionDirectives(const Function &F, raw_ostream &O) const { // If the NVVM IR has some of reqntid* specified, then output // the reqntid directive, and set the unspec

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B resolved https://github.com/llvm/llvm-project/pull/66496 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-22 Thread Artem Belevich via cfe-commits
@@ -5607,6 +5607,21 @@ bool Sema::CheckRegparmAttr(const ParsedAttr &AL, unsigned &numParams) { return false; } +// Helper to get CudaArch. +static CudaArch getCudaArch(const TargetInfo &TI) { Artem-B wrote: You may need to verify that `TI->getTriple()->is

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-22 Thread Artem Belevich via cfe-commits
@@ -537,59 +537,46 @@ void NVPTXAsmPrinter::emitKernelFunctionDirectives(const Function &F, raw_ostream &O) const { // If the NVVM IR has some of reqntid* specified, then output // the reqntid directive, and set the unspec

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-22 Thread Artem Belevich via cfe-commits
@@ -537,59 +537,46 @@ void NVPTXAsmPrinter::emitKernelFunctionDirectives(const Function &F, raw_ostream &O) const { // If the NVVM IR has some of reqntid* specified, then output // the reqntid directive, and set the unspec

[clang] [NVPTX] Add support for maxclusterrank in launch_bounds (PR #66496)

2023-09-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/66496

[clang] [CUDA][HIP] Fix host/device context in concept (PR #67721)

2023-10-04 Thread Artem Belevich via cfe-commits
@@ -176,3 +176,34 @@ Predefined Macros * - ``HIP_API_PER_THREAD_DEFAULT_STREAM`` - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated. +C++20 Concepts with HIP and CUDA + + +In Clang, when working with HIP or CUDA, it's impor

[clang] [HIP] Document func ptr and virtual func (PR #68126)

2023-10-04 Thread Artem Belevich via cfe-commits
@@ -176,3 +176,65 @@ Predefined Macros * - ``HIP_API_PER_THREAD_DEFAULT_STREAM`` - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated. +Function Pointers Support in Clang with HIP +=== + +Function pointers' support va

[clang] [HIP] Document func ptr and virtual func (PR #68126)

2023-10-04 Thread Artem Belevich via cfe-commits
@@ -176,3 +176,65 @@ Predefined Macros * - ``HIP_API_PER_THREAD_DEFAULT_STREAM`` - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated. +Function Pointers Support in Clang with HIP +=== + +Function pointers' support va

[clang] 7275734 - [CUDA/NVPTX] Improve handling of memcpy for -Os compilations.

2023-08-18 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-08-18T11:27:36-07:00 New Revision: 72757343fa866b7bfcbaa67edad895297c8cb2c5 URL: https://github.com/llvm/llvm-project/commit/72757343fa866b7bfcbaa67edad895297c8cb2c5 DIFF: https://github.com/llvm/llvm-project/commit/72757343fa866b7bfcbaa67edad895297c8cb2c5.diff

[clang] 8f8df78 - Added missing test constraints.

2023-08-18 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-08-18T11:39:11-07:00 New Revision: 8f8df788aefaf9c947f0b8768ebca45176c7e9ee URL: https://github.com/llvm/llvm-project/commit/8f8df788aefaf9c947f0b8768ebca45176c7e9ee DIFF: https://github.com/llvm/llvm-project/commit/8f8df788aefaf9c947f0b8768ebca45176c7e9ee.diff

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,1248 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3 +; ## Support i16x2 instructions +; RUN: llc < %s -mtriple=nvptx64-nvidia-cuda -mcpu=sm_90 -mattr=+ptx80 \ +; RUN: -O0 -disable-post-ra -frame-pointer=

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: I see one suspicious failure in tensorflow tests. I suspect I've messed something up in v4i8 comparison. https://github.com/llvm/llvm-project/pull/67866

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I see one suspicious failure in tensorflow tests. I suspect I've messed > something up in v4i8 comparison. Yup, there is a problem: ``` Successfully custom legalized node ... replacing: t10: v4i8 = BUILD_VECTOR Constant:i16<-128>, Constant:i16<-128>, Constant:i16<-128>, Const

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I see one suspicious failure in tensorflow tests. I suspect I've messed > something up in v4i8 comparison. Yup, there is a problem: ``` Successfully custom legalized node ... replacing: t10: v4i8 = BUILD_VECTOR Constant:i16<-128>, Constant:i16<-128>, Constant:i16<-128>, Const

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -2150,58 +2179,94 @@ NVPTXTargetLowering::LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const { return DAG.getBuildVector(Node->getValueType(0), dl, Ops); } -// We can init constant f16x2 with a single .b32 move. Normally it +// We can init constant f16x2/v2i16/v4i

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -2150,58 +2179,94 @@ NVPTXTargetLowering::LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const { return DAG.getBuildVector(Node->getValueType(0), dl, Ops); } -// We can init constant f16x2 with a single .b32 move. Normally it +// We can init constant f16x2/v2i16/v4i

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B deleted https://github.com/llvm/llvm-project/pull/67866

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B deleted https://github.com/llvm/llvm-project/pull/67866

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -2150,58 +2179,94 @@ NVPTXTargetLowering::LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const { return DAG.getBuildVector(Node->getValueType(0), dl, Ops); } -// We can init constant f16x2 with a single .b32 move. Normally it +// We can init constant f16x2/v2i16/v4i

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -2150,58 +2179,94 @@ NVPTXTargetLowering::LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const { return DAG.getBuildVector(Node->getValueType(0), dl, Ops); } -// We can init constant f16x2 with a single .b32 move. Normally it +// We can init constant f16x2/v2i16/v4i
