[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: > > I see one suspicious failure in tensorflow tests. I suspect I've messed > > something up in v4i8 comparison. > > Yup, there is a problem: > > ``` > Successfully custom legalized node > ... replacing: t10: v4i8 = BUILD_VECTOR Constant:i16<-128>, > Constant:i16<-128>, Consta
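The log above is cut off before it says what went wrong, so the following is only a guess at the kind of problem involved. It is a host-side C++ sketch, not the NVPTX backend code: when v4i8 lanes are held in wider containers (here `i16` constants of -128), each lane must be masked to 8 bits before being packed into a 32-bit word. The helper names `pack_v4i8` and `pack_v4i8_unmasked` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: pack four lanes, each held in a 16-bit container,
// into one 32-bit word with lane 0 in the low byte. Masking to 8 bits
// discards the sign-extension bits of negative lanes such as -128.
inline std::uint32_t pack_v4i8(const std::int16_t (&lanes)[4]) {
  std::uint32_t word = 0;
  for (int i = 0; i < 4; ++i) {
    std::uint32_t byte = static_cast<std::uint32_t>(lanes[i]) & 0xFFu;
    word |= byte << (8 * i);
  }
  return word;
}

// Same packing without the mask, for contrast only: the sign-extension
// bits of a negative lane leak into the neighbouring lanes.
inline std::uint32_t pack_v4i8_unmasked(const std::int16_t (&lanes)[4]) {
  std::uint32_t word = 0;
  for (int i = 0; i < 4; ++i)
    word |= static_cast<std::uint32_t>(lanes[i]) << (8 * i);
  return word;
}
```

With four lanes of -128 the masked version yields 0x80808080, while the unmasked one yields 0xFFFFFF80.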

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: > > I see one suspicious failure in tensorflow tests. I suspect I've messed > > something up in v4i8 comparison. > > Yup, there is a problem: > > ``` > Successfully custom legalized node > ... replacing: t10: v4i8 = BUILD_VECTOR Constant:i16<-128>, > Constant:i16<-128>, Consta

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,1248 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3 +; ## Support i16x2 instructions +; RUN: llc < %s -mtriple=nvptx64-nvidia-cuda -mcpu=sm_90 -mattr=+ptx80 \ +; RUN: -O0 -disable-post-ra -frame-pointer=

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,1248 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 3 +; ## Support i16x2 instructions +; RUN: llc < %s -mtriple=nvptx64-nvidia-cuda -mcpu=sm_90 -mattr=+ptx80 \ +; RUN: -O0 -disable-post-ra -frame-pointer=

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: Found another issue. We merge four independent byte loads with `align 1` into a 32-bit load, which fails at runtime on misaligned pointers. ``` %t0 = type { [17 x i8] } @shared_storage = linkonce_odr local_unnamed_addr addrspace(3) global %t0 undef, align 1 define <4 x i8> @i
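The alignment hazard described above can be illustrated in plain C++. This is a host-side sketch, not the backend fix: it shows two ways to read four adjacent bytes that make no alignment assumption, in contrast to a single 32-bit load through a cast pointer. The helper names `load4_bytewise` and `load4_memcpy` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: assemble four bytes into a 32-bit value one byte at
// a time (byte 0 in the low bits). Every access is a 1-byte load, so any
// address is acceptable.
inline std::uint32_t load4_bytewise(const unsigned char *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: std::memcpy makes no alignment assumption about p,
// so the compiler may emit a wide load only where the target allows it.
// The result uses the host's byte order.
inline std::uint32_t load4_memcpy(const unsigned char *p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}

// The unsafe form, shown only as a comment:
//   *reinterpret_cast<const std::uint32_t *>(p)
// It asserts 4-byte alignment that an `align 1` object does not provide.
```

Merging the four byte loads is only valid when the pointer is known to be 4-byte aligned. With `align 1` storage the merged load has to stay byte-wise or go through an alignment-agnostic copy.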

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: Found another issue. We merge four independent byte loads with `align 1` into a 32-bit load, which fails at runtime on misaligned pointers. ``` %t0 = type { [17 x i8] } @shared_storage = linkonce_odr local_unnamed_addr addrspace(3) global %t0 undef, align 1 define <4 x i8> @i

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: clang-format failure on GitHub is weird -- it just silently exits with an error. I ran the same command locally and fixed one place it was not happy about. The buildkite failure somewhere in RISC-V appears to be unrelated. https://github.com/llvm/llvm-project/pull/67866 ___

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: clang-format failure on GitHub is weird -- it just silently exits with an error. I ran the same command locally and fixed one place it was not happy about. The buildkite failure somewhere in RISC-V appears to be unrelated. https://github.com/llvm/llvm-project/pull/67866 ___

[clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/67866 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang-tools-extra] [NVPTX] Improve lowering of v4i8 (PR #67866)

2023-10-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/67866

[clang] Let clang-cl support CUDA/HIP (PR #68921)

2023-10-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: @rnk -- would that be acceptable for clang-cl on windows? https://github.com/llvm/llvm-project/pull/68921

[clang] [CUDA][HIP] Fix init var diag in temmplate (PR #69081)

2023-10-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/69081

[clang] Reland "[CUDA][HIP] Fix overloading resolution in global variable ini… (PR #65606)

2023-09-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. I'm still figuring out the github-based workflow. One thing that may be useful in the future would be to start the pull request branch with the original/reverted commit and put the updates into separate commits, so one could see inc

[clang] [CUDA][HIP] Do not mark extern shared var (PR #65990)

2023-09-11 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. Thank you! https://github.com/llvm/llvm-project/pull/65990

[clang] Work around two more instances of __noinline__ conflicts. (PR #66138)

2023-09-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B review_requested https://github.com/llvm/llvm-project/pull/66138

[clang] Work around two more instances of __noinline__ conflicts. (PR #66138)

2023-09-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/66138: https://github.com/llvm/llvm-project/issues/57544 >From 91c9d12e8f71cd55c877f80a0820615531cb62bd Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Tue, 12 Sep 2023 11:47:17 -0700 Subject: [PATCH] Work around

[clang] [HIP] Fix comdat of template kernel handle (PR #66283)

2023-09-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/66283

[clang] [HIP] Fix comdat of template kernel handle (PR #66283)

2023-09-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/66283

[clang] [HIP] Fix comdat of template kernel handle (PR #66283)

2023-09-13 Thread Artem Belevich via cfe-commits
@@ -43,6 +44,9 @@ __global__ void kernelfunc() {} __global__ void kernel_decl(); +template +__global__ void temp_kernel_decl(T x); Artem-B wrote: Nit: rename temp -> template? `temp` is strongly associated with 'temporary'. https://github.com/llvm/llvm-proj

[clang] f05b58a - [clang] Support '-fgpu-default-stream=per-thread' for NVIDIA CUDA

2023-07-13 Thread Artem Belevich via cfe-commits
Author: boxu.zhang Date: 2023-07-13T16:54:57-07:00 New Revision: f05b58a9468cc2990678e06bc51df56b30344807 URL: https://github.com/llvm/llvm-project/commit/f05b58a9468cc2990678e06bc51df56b30344807 DIFF: https://github.com/llvm/llvm-project/commit/f05b58a9468cc2990678e06bc51df56b30344807.diff LO

[clang] 2a702ec - Use unsigned types for __popc/__popcll to match their declarations in CUDA headers.

2023-09-05 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-09-05T16:02:42-07:00 New Revision: 2a702eca3efa066e3a470cd3b17082a05e118c91 URL: https://github.com/llvm/llvm-project/commit/2a702eca3efa066e3a470cd3b17082a05e118c91 DIFF: https://github.com/llvm/llvm-project/commit/2a702eca3efa066e3a470cd3b17082a05e118c91.diff

[clang] fe8063e - Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."

2020-03-27 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-03-27T10:01:38-07:00 New Revision: fe8063e1a0e983f1b4d38530f4fb157a26c0771c URL: https://github.com/llvm/llvm-project/commit/fe8063e1a0e983f1b4d38530f4fb157a26c0771c DIFF: https://github.com/llvm/llvm-project/commit/fe8063e1a0e983f1b4d38530f4fb157a26c0771c.diff

[clang] 7215b7e - [creduce] Fixed a typo in the error message we're looking for.

2019-11-07 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2019-11-07T17:16:51-08:00 New Revision: 7215b7ef530bff896a1da70c6b062e9259f5fde7 URL: https://github.com/llvm/llvm-project/commit/7215b7ef530bff896a1da70c6b062e9259f5fde7 DIFF: https://github.com/llvm/llvm-project/commit/7215b7ef530bff896a1da70c6b062e9259f5fde7.diff

[clang] 0c06a38 - [CUDA,clang-cl] Filter out unsupported arguments for device-side compilation.

2020-03-11 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-03-11T13:42:16-07:00 New Revision: 0c06a389e5937895579effd5e608c79bc6332e53 URL: https://github.com/llvm/llvm-project/commit/0c06a389e5937895579effd5e608c79bc6332e53 DIFF: https://github.com/llvm/llvm-project/commit/0c06a389e5937895579effd5e608c79bc6332e53.diff

[clang] 8527c1e - Added constraints on cl-options.cu test

2020-03-11 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-03-11T16:06:09-07:00 New Revision: 8527c1ed66c63db0590cd69320ba0bf8fad59b87 URL: https://github.com/llvm/llvm-project/commit/8527c1ed66c63db0590cd69320ba0bf8fad59b87 DIFF: https://github.com/llvm/llvm-project/commit/8527c1ed66c63db0590cd69320ba0bf8fad59b87.diff

[clang] eb2ba2e - [CUDA] Warn about unsupported CUDA SDK version only if it's used.

2020-03-12 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-03-12T10:04:10-07:00 New Revision: eb2ba2ea953b5ea73cdbb598f77470bde1c6a011 URL: https://github.com/llvm/llvm-project/commit/eb2ba2ea953b5ea73cdbb598f77470bde1c6a011 DIFF: https://github.com/llvm/llvm-project/commit/eb2ba2ea953b5ea73cdbb598f77470bde1c6a011.diff

[clang] cc14de8 - [CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>).

2020-01-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-01-23T13:17:52-08:00 New Revision: cc14de88da27a8178976972bdc8211c31f7ca9ae URL: https://github.com/llvm/llvm-project/commit/cc14de88da27a8178976972bdc8211c31f7ca9ae DIFF: https://github.com/llvm/llvm-project/commit/cc14de88da27a8178976972bdc8211c31f7ca9ae.diff
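The commit title above concerns the argument order of `memcpy` when a 64-bit value is shuffled as two 32-bit halves. The following is a host-side C++ sketch of that split-and-rejoin pattern, not the code from the CUDA wrapper headers. `shuffle32_identity` is a hypothetical stand-in for a real 32-bit shuffle, and `shuffle64_sketch` is likewise an illustrative name.

```cpp
#include <cstring>

// Hypothetical stand-in for a 32-bit shuffle primitive. A real shuffle
// would exchange the value with another lane; here it returns its input.
inline int shuffle32_identity(int v) { return v; }

// Sketch: move a 64-bit value through a 32-bit shuffle by splitting it
// into two halves and rejoining them afterwards.
inline long long shuffle64_sketch(long long val) {
  struct Bits { int a, b; };
  static_assert(sizeof(Bits) == sizeof(long long),
                "sketch assumes 32-bit int and 64-bit long long");
  Bits tmp;
  // memcpy takes the destination first, then the source. Swapping the two
  // would overwrite val with the uninitialized contents of tmp.
  std::memcpy(&tmp, &val, sizeof(val));
  tmp.a = shuffle32_identity(tmp.a);
  tmp.b = shuffle32_identity(tmp.b);
  long long ret;
  std::memcpy(&ret, &tmp, sizeof(tmp));
  return ret;
}
```

Because the stand-in shuffle is the identity, `shuffle64_sketch` should return its argument unchanged, which makes the argument order easy to check.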

[clang] 12fefee - [CUDA] Assume the latest known CUDA version if we've found an unknown one.

2020-01-28 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2020-01-28T10:11:42-08:00 New Revision: 12fefeef203ab4ef52d19bcdbd4180608a4deae1 URL: https://github.com/llvm/llvm-project/commit/12fefeef203ab4ef52d19bcdbd4180608a4deae1 DIFF: https://github.com/llvm/llvm-project/commit/12fefeef203ab4ef52d19bcdbd4180608a4deae1.diff

r328006 - [NVPTX] Make tensor load/store intrinsics overloaded.

2018-03-20 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Mar 20 10:18:59 2018 New Revision: 328006 URL: http://llvm.org/viewvc/llvm-project?rev=328006&view=rev Log: [NVPTX] Make tensor load/store intrinsics overloaded. This way we can support address-space specific variants without explicitly encoding the space in the name of the

Re: r328006 - [NVPTX] Make tensor load/store intrinsics overloaded.

2018-03-20 Thread Artem Belevich via cfe-commits
*, > ArrayRef, ArrayRef, const > llvm::Twine &): Assertion `(i >= FTy->getNumParams()|| > FTy->getParamType(i) == Args[i]->getType()) && "Calling a function with > a bad signature!"' failed. > > Cheers, > Rafael > > > Artem Be

r328158 - [NVPTX] Make tensor shape part of WMMA intrinsic's name.

2018-03-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 21 14:55:02 2018 New Revision: 328158 URL: http://llvm.org/viewvc/llvm-project?rev=328158&view=rev Log: [NVPTX] Make tensor shape part of WMMA intrinsic's name. This is needed for the upcoming implementation of the new 8x32x16 and 32x8x16 variants of WMMA instructions in

r328161 - [CUDA] Disable LTO for device-side compilations.

2018-03-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 21 15:22:59 2018 New Revision: 328161 URL: http://llvm.org/viewvc/llvm-project?rev=328161&view=rev Log: [CUDA] Disable LTO for device-side compilations. This fixes host-side LTO during CUDA compilation. Before, LTO pipeline construction was clashing with CUDA pipeline co

Re: [PATCH] D44691: [CUDA] Disable LTO for device-side compilations.

2018-03-22 Thread Artem Belevich via cfe-commits
On Thu, Mar 22, 2018 at 12:02 AM Yvan Roux wrote: > This patch broke ARM/AArch64 bots, see: > > http://lab.llvm.org:8011/builders/clang-cmake-armv8-full/builds/841/steps/ninja%20check%201/logs/FAIL%3A%20Clang%3A%3Athinlto.cu > > Sorry about that. I'll fix it ASAP. -- --Artem Belevich ___

r328213 - [CUDA] add REQUIRES fields for CUDA variants of LTO tests.

2018-03-22 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Mar 22 09:47:41 2018 New Revision: 328213 URL: http://llvm.org/viewvc/llvm-project?rev=328213&view=rev Log: [CUDA] add REQUIRES fields for CUDA variants of LTO tests. Also relax checking for nvptx triple. This should avoid test failure if the test is executed on 32-bit platf

r328362 - [CUDA] Fixed false error reporting in case of calling H->G->HD->D.

2018-03-23 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Mar 23 12:49:03 2018 New Revision: 328362 URL: http://llvm.org/viewvc/llvm-project?rev=328362&view=rev Log: [CUDA] Fixed false error reporting in case of calling H->G->HD->D. Launching a kernel from the host code does not generate code for the kernel itself. This fixes an is

r329099 - Revert "Set calling convention for CUDA kernel"

2018-04-03 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Apr 3 11:29:31 2018 New Revision: 329099 URL: http://llvm.org/viewvc/llvm-project?rev=329099&view=rev Log: Revert "Set calling convention for CUDA kernel" This reverts r328795 which introduced an issue with referencing __global__ function templates. More details in the orig

r329127 - [CUDA] Check initializers of instantiated template variables.

2018-04-03 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Apr 3 15:41:06 2018 New Revision: 329127 URL: http://llvm.org/viewvc/llvm-project?rev=329127&view=rev Log: [CUDA] Check initializers of instantiated template variables. We were already performing checks on non-template variables, but the checks on templated ones were missin

r329229 - Revert "[CUDA] Check initializers of instantiated template variables."

2018-04-04 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Apr 4 13:48:42 2018 New Revision: 329229 URL: http://llvm.org/viewvc/llvm-project?rev=329229&view=rev Log: Revert "[CUDA] Check initializers of instantiated template variables." This (temporarily) reverts commit r329127 due to the problems it exposed in TensorFlow. Modifie

r329737 - [CUDA] Added --[no-]cuda-include-ptx=sm_XX|all option.

2018-04-10 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Apr 10 11:38:22 2018 New Revision: 329737 URL: http://llvm.org/viewvc/llvm-project?rev=329737&view=rev Log: [CUDA] Added --[no-]cuda-include-ptx=sm_XX|all option. Currently we always include PTX into the fatbin along with the GPU code. It about doubles the size of the GPU bin

r329829 - [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins.

2018-04-11 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Apr 11 10:51:19 2018 New Revision: 329829 URL: http://llvm.org/viewvc/llvm-project?rev=329829&view=rev Log: [NVPTX, CUDA] Improved feature constraints on NVPTX target builtins. When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature, consider those features av

r329830 - [NVPTX] Removed 'satom' feature which is no longer used.

2018-04-11 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Apr 11 10:51:33 2018 New Revision: 329830 URL: http://llvm.org/viewvc/llvm-project?rev=329830&view=rev Log: [NVPTX] Removed 'satom' feature which is no longer used. Differential Revision: https://reviews.llvm.org/D45061 Modified: cfe/trunk/lib/Basic/Targets/NVPTX.cpp

Re: r318601 - [OpenMP] Show error if VLAs are not supported

2017-11-20 Thread Artem Belevich via cfe-commits
This change breaks CUDA as clang now reports an error during device-side compilation when VLA is used in the *host-side* code. http://lab.llvm.org:8011/builders/clang-cuda-build/builds/15591/steps/ninja%20build%20simple%20CUDA%20tests/logs/stdio E.g. I would expect this code to compile successfull

Re: r318601 - [OpenMP] Show error if VLAs are not supported

2017-11-20 Thread Artem Belevich via cfe-commits
Proposed fix: https://reviews.llvm.org/D40275 On Mon, Nov 20, 2017 at 4:13 PM, Artem Belevich wrote: > This change breaks CUDA as clang now reports an error during device-side > compilation when VLA is used in the *host-side* code. > http://lab.llvm.org:8011/builders/clang-cuda-build/ > builds/1

[clang] 2aa90da - [CUDA] Update cached kernel handle when the function instance changes.

2023-03-21 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-03-21T15:36:12-07:00 New Revision: 2aa90da012596712a4166e96d2a40fc90598c7fb URL: https://github.com/llvm/llvm-project/commit/2aa90da012596712a4166e96d2a40fc90598c7fb DIFF: https://github.com/llvm/llvm-project/commit/2aa90da012596712a4166e96d2a40fc90598c7fb.diff

[clang] 6963c61 - [NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*

2023-05-19 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-19T10:59:36-07:00 New Revision: 6963c61f0f6e4be2039cb45e824ea1e83a8f1526 URL: https://github.com/llvm/llvm-project/commit/6963c61f0f6e4be2039cb45e824ea1e83a8f1526 DIFF: https://github.com/llvm/llvm-project/commit/6963c61f0f6e4be2039cb45e824ea1e83a8f1526.diff

[clang] 4450285 - [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-19 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-19T11:48:08-07:00 New Revision: 4450285bd74079bf87ba7b824a8dec8dcfb586ef URL: https://github.com/llvm/llvm-project/commit/4450285bd74079bf87ba7b824a8dec8dcfb586ef DIFF: https://github.com/llvm/llvm-project/commit/4450285bd74079bf87ba7b824a8dec8dcfb586ef.diff

[clang] 29cb080 - [CUDA] Fix wrappers for sm_80 functions

2023-05-24 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-24T11:48:39-07:00 New Revision: 29cb080c363d655ab1179a5564f1a82460e49a06 URL: https://github.com/llvm/llvm-project/commit/29cb080c363d655ab1179a5564f1a82460e49a06 DIFF: https://github.com/llvm/llvm-project/commit/29cb080c363d655ab1179a5564f1a82460e49a06.diff

[clang] ffb635c - [CUDA] bump supported CUDA version to 12.1/11.8

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T11:57:55-07:00 New Revision: ffb635cb2d4e374e52b12066893458a8b70889fa URL: https://github.com/llvm/llvm-project/commit/ffb635cb2d4e374e52b12066893458a8b70889fa DIFF: https://github.com/llvm/llvm-project/commit/ffb635cb2d4e374e52b12066893458a8b70889fa.diff

[clang] 0ad5d40 - [CUDA] Relax restrictions on variadics in host-side compilation.

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T11:57:54-07:00 New Revision: 0ad5d40fa19f27db0e5f999d0e17b0c18b811019 URL: https://github.com/llvm/llvm-project/commit/0ad5d40fa19f27db0e5f999d0e17b0c18b811019 DIFF: https://github.com/llvm/llvm-project/commit/0ad5d40fa19f27db0e5f999d0e17b0c18b811019.diff

[clang] 0a0bae1 - [CUDA] plumb through new sm_90-specific builtins.

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T11:57:56-07:00 New Revision: 0a0bae1e9f94ec86ac17b0b4eb817741689f3739 URL: https://github.com/llvm/llvm-project/commit/0a0bae1e9f94ec86ac17b0b4eb817741689f3739 DIFF: https://github.com/llvm/llvm-project/commit/0a0bae1e9f94ec86ac17b0b4eb817741689f3739.diff

[clang] 25708b3 - [NVPTX, CUDA] barrier intrinsics and builtins for sm_90

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T11:57:57-07:00 New Revision: 25708b3df6e359123d5bce137652af812e168cfc URL: https://github.com/llvm/llvm-project/commit/25708b3df6e359123d5bce137652af812e168cfc DIFF: https://github.com/llvm/llvm-project/commit/25708b3df6e359123d5bce137652af812e168cfc.diff

[clang] 5c082e7 - [CUDA] Add CUDA wrappers over clang builtins for sm_90.

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T11:57:58-07:00 New Revision: 5c082e7e15e38a2eea1f506725efe636a5b1bf8a URL: https://github.com/llvm/llvm-project/commit/5c082e7e15e38a2eea1f506725efe636a5b1bf8a DIFF: https://github.com/llvm/llvm-project/commit/5c082e7e15e38a2eea1f506725efe636a5b1bf8a.diff

[clang] df1b2be - [CUDA] Explicitly construct dim3() return values.

2023-05-25 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-25T12:41:25-07:00 New Revision: df1b2bef0c7cad11681a02e9e2f816b27fb480a6 URL: https://github.com/llvm/llvm-project/commit/df1b2bef0c7cad11681a02e9e2f816b27fb480a6 DIFF: https://github.com/llvm/llvm-project/commit/df1b2bef0c7cad11681a02e9e2f816b27fb480a6.diff

[clang] 6cdc07a - [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-30 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-30T10:44:33-07:00 New Revision: 6cdc07a701eec08da450be58d6e1b67428a983dd URL: https://github.com/llvm/llvm-project/commit/6cdc07a701eec08da450be58d6e1b67428a983dd DIFF: https://github.com/llvm/llvm-project/commit/6cdc07a701eec08da450be58d6e1b67428a983dd.diff

[clang] 0f49116 - [CUDA] Update Kepler(sm_3*) support info.

2023-06-02 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-06-02T14:16:13-07:00 New Revision: 0f49116e261cf5a156221b006acb677e3565fd1a URL: https://github.com/llvm/llvm-project/commit/0f49116e261cf5a156221b006acb677e3565fd1a DIFF: https://github.com/llvm/llvm-project/commit/0f49116e261cf5a156221b006acb677e3565fd1a.diff

[clang] a825f37 - [CUDA] Relax restrictions on GPU-side variadic functions

2023-05-17 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-17T12:51:01-07:00 New Revision: a825f3754b3ca1591068cf99bc224af30a311e63 URL: https://github.com/llvm/llvm-project/commit/a825f3754b3ca1591068cf99bc224af30a311e63 DIFF: https://github.com/llvm/llvm-project/commit/a825f3754b3ca1591068cf99bc224af30a311e63.diff

[clang] e7b9c2f - [NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*

2023-05-18 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-18T11:05:44-07:00 New Revision: e7b9c2f00fa04ef8d9b69ee0e36d7775823dbe6b URL: https://github.com/llvm/llvm-project/commit/e7b9c2f00fa04ef8d9b69ee0e36d7775823dbe6b DIFF: https://github.com/llvm/llvm-project/commit/e7b9c2f00fa04ef8d9b69ee0e36d7775823dbe6b.diff

[clang] 0e43eb2 - Revert "[NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async*"

2023-05-18 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-18T11:45:06-07:00 New Revision: 0e43eb24bd791e8b857e7347ac3646a4454d97e3 URL: https://github.com/llvm/llvm-project/commit/0e43eb24bd791e8b857e7347ac3646a4454d97e3 DIFF: https://github.com/llvm/llvm-project/commit/0e43eb24bd791e8b857e7347ac3646a4454d97e3.diff

[clang] a50e54f - [CUDA] Temporarily undefine __noinline__ when including bits/shared_ptr_base.h

2023-05-01 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2023-05-01T16:27:13-07:00 New Revision: a50e54fbeb48fb8a218a2914d827e1087bae2f8d URL: https://github.com/llvm/llvm-project/commit/a50e54fbeb48fb8a218a2914d827e1087bae2f8d DIFF: https://github.com/llvm/llvm-project/commit/a50e54fbeb48fb8a218a2914d827e1087bae2f8d.diff

r290982 - [CUDA] Pre-include sm_60 and sm_61 headers.

2017-01-04 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Jan 4 12:39:29 2017 New Revision: 290982 URL: http://llvm.org/viewvc/llvm-project?rev=290982&view=rev Log: [CUDA] Pre-include sm_60 and sm_61 headers. CUDA-8.0 comes with new headers which nvcc pre-includes via cuda_runtime.h Clang now makes them available as well. Differe

[PATCH] D23526: [CUDA] Collapsed offload actions should not be top-level jobs.

2016-08-15 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, sfantao. tra added a subscriber: cfe-commits. If they are, we end up with the last intermediary output preserved in the current directory after compilation. Added a test case to verify that we're using appropriate filenames for outputs of di

Re: [PATCH] D23526: [CUDA] Collapsed offload actions should not be top-level jobs.

2016-08-15 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 68100. tra added a comment. Addressed comments. https://reviews.llvm.org/D23526 Files: lib/Driver/Driver.cpp test/Driver/cuda-bindings.cu Index: test/Driver/cuda-bindings.cu === --- /dev/null

Re: [PATCH] D23526: [CUDA] Collapsed offload actions should not be top-level jobs.

2016-08-15 Thread Artem Belevich via cfe-commits
tra marked 2 inline comments as done. tra added a comment. https://reviews.llvm.org/D23526

Re: [PATCH] D23627: [CUDA] Improve handling of math functions.

2016-08-17 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:125-133 @@ -122,8 +124,11 @@ __DEVICE__ float modf(float __x, float *__iptr) { return ::modff(__x, __iptr); } -__DEVICE__ float nexttoward(float __from, float __to) { +__DEVICE__ float nexttoward(float

Re: [PATCH] D23627: [CUDA] Improve handling of math functions.

2016-08-17 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM, but we may want someone familiar with math library to take a look. Comment at: clang/lib/Headers/__clang_cuda_cmath.h:125-133 @@ -122,8 +124,11 @@ __DEVICE__ float modf(float

r279455 - [CUDA] Collapsed offload actions should not be top-level jobs.

2016-08-22 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon Aug 22 13:50:34 2016 New Revision: 279455 URL: http://llvm.org/viewvc/llvm-project?rev=279455&view=rev Log: [CUDA] Collapsed offload actions should not be top-level jobs. If they are, we end up with the last intermediary output preserved in the current directory after compil

Re: [PATCH] D23526: [CUDA] Collapsed offload actions should not be top-level jobs.

2016-08-22 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL279455: [CUDA] Collapsed offload actions should not be top-level jobs. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D23526?vs=68100&id=68896#toc Repository: rL LLVM https://r

Re: [PATCH] D24407: [CUDA] Make __GCC_ATOMIC_XXX_LOCK_FREE macros the same on host/device.

2016-09-09 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM https://reviews.llvm.org/D24407

[PATCH] D24522: [CUDA] Do not merge CUDA target attributes.

2016-09-13 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. Herald added a subscriber: jlebar. CUDA target attributes are used for function overloading and must not be merged. This fixes a bug where attributes were inherited during function template specializati

Re: r281351 - Add a class ObjCProtocolQualifiers to wrap APIs for ObjC protocol list.

2016-09-13 Thread Artem Belevich via cfe-commits
Manman, FYI, It appears that some of your ObjC commits today trigger asan error. Sanitizer bots are broken by PR30341, so they don't report the issue yet. --Artem $ llvm/tools/clang/clang -cc1 -internal-isystem llvm/tools/clang/staging/include -nostdsysteminc -fblocks -fsyntax-only -Wnullable-to

Re: [PATCH] D24522: [CUDA] Do not merge CUDA target attributes.

2016-09-13 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 71244. tra marked an inline comment as done. tra added a comment. Removed REQUIRED lines. https://reviews.llvm.org/D24522 Files: lib/Sema/SemaDecl.cpp test/SemaCUDA/function-overload.cu test/SemaCUDA/target_attr_inheritance.cu Index: test/SemaCUDA/target

r281406 - [CUDA] Do not merge CUDA target attributes.

2016-09-13 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Sep 13 17:16:30 2016 New Revision: 281406 URL: http://llvm.org/viewvc/llvm-project?rev=281406&view=rev Log: [CUDA] Do not merge CUDA target attributes. CUDA target attributes are used for function overloading and must not be merged. This fixes a bug where attributes were in

Re: [PATCH] D24522: [CUDA] Do not merge CUDA target attributes.

2016-09-13 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL281406: [CUDA] Do not merge CUDA target attributes. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D24522?vs=71244&id=71249#toc Repository: rL LLVM https://reviews.llvm.org/D24

Re: r281351 - Add a class ObjCProtocolQualifiers to wrap APIs for ObjC protocol list.

2016-09-13 Thread Artem Belevich via cfe-commits
Thanks for the quick fix. ASAN is happy now. --Artem On Tue, Sep 13, 2016 at 3:09 PM, Manman wrote: > I checked in r281404. Hopefully it will fix the issue. > > Let me know if it does not. > > Thanks, > Manman > > On Sep 13, 2016, at 3:03 PM, Artem Belevich via cfe

Re: [PATCH] D24581: [CUDA] Add test checking our ability to take a function pointer to a __global__ function on the host side.

2016-09-14 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D24581

Re: [PATCH] D24588: [CUDA] Make __clang_cuda_cmath.h compatible with libc++.

2016-09-14 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D24588

Re: [PATCH] D24589: [test-suite] [CUDA] Add and tests.

2016-09-14 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D24589

Re: [PATCH] D24590: [test-suite] [CUDA] Update README.

2016-09-14 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM Comment at: External/CUDA/README:58 @@ +57,3 @@ + There's a cuda-tests-simple target that excludes tests that take a long time + to build (thrust). + It may

r281557 - Revert r281457 "Supports adding insertion around non-insertion replacements."

2016-09-14 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Sep 14 18:03:06 2016 New Revision: 281557 URL: http://llvm.org/viewvc/llvm-project?rev=281557&view=rev Log: Revert r281457 "Supports adding insertion around non-insertion replacements." Commit was breaking our internal tests. Modified: cfe/trunk/include/clang/Tooling/Co

r308675 - [NVPTX] Add lowering of i128 params.

2017-07-20 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Jul 20 14:16:03 2017 New Revision: 308675 URL: http://llvm.org/viewvc/llvm-project?rev=308675&view=rev Log: [NVPTX] Add lowering of i128 params. The patch adds support for lowering i128 params. The changes are quite trivial: i128 is supported as a "special case" of integer type.

r322742 - [DeclPrinter] Fix two cases that crash clang -ast-print.

2018-01-17 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Jan 17 11:29:39 2018 New Revision: 322742 URL: http://llvm.org/viewvc/llvm-project?rev=322742&view=rev Log: [DeclPrinter] Fix two cases that crash clang -ast-print. Both are related to handling anonymous structures. * clang didn't handle () around an anonymous struct variabl

r323239 - [CUDA] CUDA has no device-side library builtins.

2018-01-23 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Jan 23 11:08:18 2018 New Revision: 323239 URL: http://llvm.org/viewvc/llvm-project?rev=323239&view=rev Log: [CUDA] CUDA has no device-side library builtins. We should (almost) never consider a device-side declaration to match a library builtin function. Otherwise clang may i

r323345 - [CUDA] Disable PGO and coverage instrumentation in NVPTX.

2018-01-24 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Jan 24 09:41:02 2018 New Revision: 323345 URL: http://llvm.org/viewvc/llvm-project?rev=323345&view=rev Log: [CUDA] Disable PGO and coverage instrumentation in NVPTX. NVPTX does not have runtime support necessary for profiling to work and even call arc collection is prohibiti

r323713 - [CUDA] Added partial support for CUDA-9.1

2018-01-29 Thread Artem Belevich via cfe-commits
Author: tra Date: Mon Jan 29 16:00:12 2018 New Revision: 323713 URL: http://llvm.org/viewvc/llvm-project?rev=323713&view=rev Log: [CUDA] Added partial support for CUDA-9.1 Clang can use CUDA-9.1 now, though new APIs are not implemented yet. The major change is that headers in CUDA-9.1 went thro

[PATCH] D26774: [CUDA] Driver changes to support CUDA compilation on MacOS.

2016-11-17 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM, with a couple of minor nits. Comment at: clang/lib/Driver/Driver.cpp:3650-3654 + + // Intentionally omitted from the switch above: llvm::Triple::CUDA. CUDA + // compiles alw

r288406 - Send compiler output to /dev/null in defsym.s test.

2016-12-01 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Dec 1 13:34:35 2016 New Revision: 288406 URL: http://llvm.org/viewvc/llvm-project?rev=288406&view=rev Log: Send compiler output to /dev/null in defsym.s test. Fixes test failures if tests are run in a read-only source tree. Modified: cfe/trunk/test/Driver/defsym.s Mod

r288962 - [CUDA] Improve target attribute checking for function templates.

2016-12-07 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Dec 7 13:27:16 2016 New Revision: 288962 URL: http://llvm.org/viewvc/llvm-project?rev=288962&view=rev Log: [CUDA] Improve target attribute checking for function templates. * __host__ __device__ functions are no longer considered to be redeclarations of __host__ or __devic

r289091 - [CUDA] Ignore implicit target attributes during function template instantiation.

2016-12-08 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Dec 8 13:38:13 2016 New Revision: 289091 URL: http://llvm.org/viewvc/llvm-project?rev=289091&view=rev Log: [CUDA] Ignore implicit target attributes during function template instantiation. Some functions and templates are treated as __host__ __device__ even when they don't h

r289287 - [CUDA,Driver] Added --no-cuda-gpu-arch= option.

2016-12-09 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Dec 9 16:59:17 2016 New Revision: 289287 URL: http://llvm.org/viewvc/llvm-project?rev=289287&view=rev Log: [CUDA,Driver] Added --no-cuda-gpu-arch= option. This allows us to negate a preceding --cuda-gpu-arch=X. This comes in handy when the user needs to override default flags set fo

[PATCH] D25755: [CUDA] Rework tests now that we emit deferred diagnostics during sema. Test-only change.

2016-10-18 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. https://reviews.llvm.org/D25755

[PATCH] D25796: [CUDA] Create __host__ and __device__ variants of standard allocator declarations.

2016-10-19 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. Implicit functions are treated as if they were __host__ __device__ and clang does not allow overloading those with __host__ or __device__ variants. In order for users to provide their own standard allo

[PATCH] D25809: [CUDA] Improved target attribute-based overloading.

2016-10-19 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, rsmith. tra added a subscriber: cfe-commits. Current behavior: - `__host__` `__device__` (HD) functions are considered to be redeclarations of `__host__` (H) or `__device__` (D) functions with the same signature. - Target attributes are not taken i

[PATCH] D25839: Removed unused function argument. NFC.

2016-10-20 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. https://reviews.llvm.org/D25839 Files: include/clang/Sema/Sema.h lib/Sema/SemaCUDA.cpp lib/Sema/SemaDecl.cpp Index: lib/Sema/SemaDecl.cpp =

[PATCH] D25845: [CUDA] Ignore implicit target attributes during function template instantiation.

2016-10-20 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, rsmith. tra added a subscriber: cfe-commits. Some functions and templates are treated as `__host__` `__device__` even when they don't have explicitly specified target attributes. What's worse, this treatment may change depending on command l

r284843 - Removed unused function argument. NFC.

2016-10-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Oct 21 12:15:46 2016 New Revision: 284843 URL: http://llvm.org/viewvc/llvm-project?rev=284843&view=rev Log: Removed unused function argument. NFC. Differential Revision: https://reviews.llvm.org/D25839 Modified: cfe/trunk/include/clang/Sema/Sema.h cfe/trunk/lib/Sema

[PATCH] D25839: Removed unused function argument. NFC.

2016-10-21 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL284843: Removed unused function argument. NFC. (authored by tra). Changed prior to commit: https://reviews.llvm.org/D25839?vs=75339&id=75447#toc Repository: rL LLVM https://reviews.llvm.org/D25839

[PATCH] D25796: [CUDA] Create __host__ and __device__ variants of standard allocator declarations.

2016-10-21 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 75462. tra added a comment. Addressed jlebar's comments. https://reviews.llvm.org/D25796 Files: lib/Sema/SemaExprCXX.cpp test/SemaCUDA/overloaded-delete.cu Index: test/SemaCUDA/overloaded-delete.cu ==

r284879 - Declare H and H new/delete.

2016-10-21 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Oct 21 15:34:05 2016 New Revision: 284879 URL: http://llvm.org/viewvc/llvm-project?rev=284879&view=rev Log: Declare H and H new/delete. Modified: cfe/trunk/lib/Sema/SemaExprCXX.cpp cfe/trunk/test/SemaCUDA/overloaded-delete.cu Modified: cfe/trunk/lib/Sema/SemaExprCXX

[PATCH] D25845: [CUDA] Ignore implicit target attributes during function template instantiation.

2016-10-21 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 75482. tra added a comment. Added a comment explaining expected constexpr function template matching behavior. https://reviews.llvm.org/D25845 Files: include/clang/Sema/Sema.h lib/Sema/SemaCUDA.cpp lib/Sema/SemaDeclAttr.cpp lib/Sema/SemaTemplate.cpp
