[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a minor nit. https://github.com/llvm/llvm-project/pull/96015 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
@@ -203,8 +203,12 @@ ABIArgInfo NVPTXABIInfo::classifyArgumentType(QualType Ty) const { void NVPTXABIInfo::computeInfo(CGFunctionInfo &FI) const { if (!getCXXABI().classifyReturnType(FI)) FI.getReturnInfo() = classifyReturnType(FI.getReturnType()); + + unsigned Argument

[clang] [llvm] [CUDA] Mark CUDA-12.4 as supported and introduce ptx 8.4. (PR #91516)

2024-05-08 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/91516 None >From 6bb4800a5ed7c5f2ffeaded874d72f7624539122 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Wed, 8 May 2024 11:07:34 -0700 Subject: [PATCH] [CUDA] Mark CUDA-12.4 as supported and introduce ptx 8.4.

[clang] [llvm] [CUDA] Mark CUDA-12.4 as supported and introduce ptx 8.4. (PR #91516)

2024-05-08 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/91516

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-08-05 Thread Artem Belevich via cfe-commits
Artem-B wrote: Given that the prevalent compilation for CUDA has no standard library whatsoever, preserving libcalls may break some existing users that may be relying on library call lowering to an intrinsic that *is* implemented by the back-end. Perhaps this "no library call to intrinsic con

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a test nit. https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,37 @@ +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK Artem-B wrote: This test should be suitable for automatic check generation -- we probably do want to see the details of what we're doing when we do 32-bit CA

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,37 @@ +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK Artem-B wrote: https://llvm.org/docs/TestingGuide.html#generating-assertions-in-regression-tests https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 >From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/3] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
@@ -26,24 +27,38 @@ static cl::opt NoF16Math("nvptx-no-f16-math", cl::Hidden, cl::desc("NVPTX Specific: Disable generation of f16 math ops."), cl::init(false)); +static cl::opt +NextSM("nvptx-next-sm", cl::Hidden, + cl::desc("NVPTX

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
@@ -52,6 +53,42 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +enum class PTXVersion { + PTX_UNKNOWN = 0, + PTX_32 = 32, + PTX_40 = 40, + PTX_41, + PTX_42, + PTX_43, + PTX_50 = 50, +

[clang] 0ad19a8 - [CUDA, NVPTX] Corrected fragment size for tf32 LD B matrix.

2022-01-25 Thread Artem Belevich via cfe-commits
Author: JackAKirk Date: 2022-01-25T11:29:19-08:00 New Revision: 0ad19a833177861be55fefaff725ab89c8695d01 URL: https://github.com/llvm/llvm-project/commit/0ad19a833177861be55fefaff725ab89c8695d01 DIFF: https://github.com/llvm/llvm-project/commit/0ad19a833177861be55fefaff725ab89c8695d01.diff LOG

[clang] 7a6d692 - [NVPTX] Expose float tys min, max, abs, neg as builtins

2022-03-01 Thread Artem Belevich via cfe-commits
Author: Jakub Chlanda Date: 2022-03-01T11:07:11-08:00 New Revision: 7a6d692b3b11e80fd19e7c9b65e1e6f70035c676 URL: https://github.com/llvm/llvm-project/commit/7a6d692b3b11e80fd19e7c9b65e1e6f70035c676 DIFF: https://github.com/llvm/llvm-project/commit/7a6d692b3b11e80fd19e7c9b65e1e6f70035c676.diff

[clang] a895182 - [NVPTX] Add more FMA intriniscs/builtins

2022-03-01 Thread Artem Belevich via cfe-commits
Author: Jakub Chlanda Date: 2022-03-01T11:07:11-08:00 New Revision: a8951823024b38c455e839d40656ad533b4aa8ff URL: https://github.com/llvm/llvm-project/commit/a8951823024b38c455e839d40656ad533b4aa8ff DIFF: https://github.com/llvm/llvm-project/commit/a8951823024b38c455e839d40656ad533b4aa8ff.diff

[clang] 510fd28 - [NVPTX] Add ex2.approx.f16/f16x2 support

2022-03-01 Thread Artem Belevich via cfe-commits
Author: Nicolas Miller Date: 2022-03-01T11:07:11-08:00 New Revision: 510fd283fda2d7c5118ae1b451a1f2365cfc3f27 URL: https://github.com/llvm/llvm-project/commit/510fd283fda2d7c5118ae1b451a1f2365cfc3f27 DIFF: https://github.com/llvm/llvm-project/commit/510fd283fda2d7c5118ae1b451a1f2365cfc3f27.diff

[clang] c99b2c6 - CUDA/HIP: Allow __int128 on the host side

2022-01-04 Thread Artem Belevich via cfe-commits
Author: Henry Linjamäki Date: 2022-01-04T16:09:26-08:00 New Revision: c99b2c63169d5aa6499143078790cb3eb87dee45 URL: https://github.com/llvm/llvm-project/commit/c99b2c63169d5aa6499143078790cb3eb87dee45 DIFF: https://github.com/llvm/llvm-project/commit/c99b2c63169d5aa6499143078790cb3eb87dee45.dif

[clang] bef3eb8 - [Clang][NVPTX]Add NVPTX intrinsics and builtins for CUDA PTX cvt sm80 instructions

2022-01-13 Thread Artem Belevich via cfe-commits
Author: Jack Kirk Date: 2022-01-13T13:29:48-08:00 New Revision: bef3eb83442a2f30c761a03793bd56c961f49cdd URL: https://github.com/llvm/llvm-project/commit/bef3eb83442a2f30c761a03793bd56c961f49cdd DIFF: https://github.com/llvm/llvm-project/commit/bef3eb83442a2f30c761a03793bd56c961f49cdd.diff LOG

[clang] abbdc13 - [CUDA][SPIRV] Use OpenCLKernel CC for CUDA -> SPIRV

2021-12-06 Thread Artem Belevich via cfe-commits
Author: Daniele Castagna Date: 2021-12-06T15:06:57-08:00 New Revision: abbdc13e6803562bce4e42866a9ebf7f2a04a89b URL: https://github.com/llvm/llvm-project/commit/abbdc13e6803562bce4e42866a9ebf7f2a04a89b DIFF: https://github.com/llvm/llvm-project/commit/abbdc13e6803562bce4e42866a9ebf7f2a04a89b.di

[clang] 4e94cba - [HIPSPV][2/4] Add HIPSPV tool chain

2021-12-14 Thread Artem Belevich via cfe-commits
Author: Henry Linjamäki Date: 2021-12-14T10:22:38-08:00 New Revision: 4e94cba5b4e431794026085b89a34112b2d9ac0d URL: https://github.com/llvm/llvm-project/commit/4e94cba5b4e431794026085b89a34112b2d9ac0d DIFF: https://github.com/llvm/llvm-project/commit/4e94cba5b4e431794026085b89a34112b2d9ac0d.dif

[clang] 38cf112 - Allow applying attributes to subset of allowed subjects.

2021-04-12 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-04-12T09:33:33-07:00 New Revision: 38cf112a6bc8502ff8cce6ef524cf04c07f90f96 URL: https://github.com/llvm/llvm-project/commit/38cf112a6bc8502ff8cce6ef524cf04c07f90f96 DIFF: https://github.com/llvm/llvm-project/commit/38cf112a6bc8502ff8cce6ef524cf04c07f90f96.diff

[clang] eaa9ef0 - [CUDA, FDO] Filter out profiling options from GPU-side compilations.

2021-04-16 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-04-16T11:35:28-07:00 New Revision: eaa9ef075d9b4d49ce9dae723516e7e6e8b0c4b6 URL: https://github.com/llvm/llvm-project/commit/eaa9ef075d9b4d49ce9dae723516e7e6e8b0c4b6 DIFF: https://github.com/llvm/llvm-project/commit/eaa9ef075d9b4d49ce9dae723516e7e6e8b0c4b6.diff

[clang] 127091b - [CUDA] Normalize handling of defauled dtor.

2021-01-21 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-01-21T10:48:07-08:00 New Revision: 127091bfd5edf10495fee4724fd21c666e5d79c1 URL: https://github.com/llvm/llvm-project/commit/127091bfd5edf10495fee4724fd21c666e5d79c1 DIFF: https://github.com/llvm/llvm-project/commit/127091bfd5edf10495fee4724fd21c666e5d79c1.diff

[clang] ccfb055 - [CUDA] Implement experimental support for texture lookups.

2021-10-06 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-10-06T15:15:53-07:00 New Revision: ccfb0555f76b865cf50bd354558dd00bfa7b2762 URL: https://github.com/llvm/llvm-project/commit/ccfb0555f76b865cf50bd354558dd00bfa7b2762 DIFF: https://github.com/llvm/llvm-project/commit/ccfb0555f76b865cf50bd354558dd00bfa7b2762.diff

[clang] 6707a7d - [CUDA] remove unneeded includes from CUDA-related headers.

2021-10-06 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-10-06T17:20:21-07:00 New Revision: 6707a7d7e96ac23ba66f16bdb44927082d2fd4d3 URL: https://github.com/llvm/llvm-project/commit/6707a7d7e96ac23ba66f16bdb44927082d2fd4d3 DIFF: https://github.com/llvm/llvm-project/commit/6707a7d7e96ac23ba66f16bdb44927082d2fd4d3.diff

[clang] 29e00b2 - [CUDA] Make sure is included with original __THROW defined.

2021-10-07 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-10-07T11:43:56-07:00 New Revision: 29e00b29f76adb15a51c1ccd6c1fdb6fce5f4d7b URL: https://github.com/llvm/llvm-project/commit/29e00b29f76adb15a51c1ccd6c1fdb6fce5f4d7b DIFF: https://github.com/llvm/llvm-project/commit/29e00b29f76adb15a51c1ccd6c1fdb6fce5f4d7b.diff

[clang] f526ee5 - [CUDA] Provide address space conversion builtins.

2021-10-12 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-10-12T14:56:39-07:00 New Revision: f526ee5b8517b60620cd03bb3e5945ed69d6bfaa URL: https://github.com/llvm/llvm-project/commit/f526ee5b8517b60620cd03bb3e5945ed69d6bfaa DIFF: https://github.com/llvm/llvm-project/commit/f526ee5b8517b60620cd03bb3e5945ed69d6bfaa.diff

[clang] 0060fff - [CUDA] Bump default GPU architecture to sm_35.

2021-08-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-23T13:24:45-07:00 New Revision: 0060fffc822261ff7350e34371c4456f363f866d URL: https://github.com/llvm/llvm-project/commit/0060fffc822261ff7350e34371c4456f363f866d DIFF: https://github.com/llvm/llvm-project/commit/0060fffc822261ff7350e34371c4456f363f866d.diff

[clang] 49d982d - [CUDA] Add support for CUDA-11.4

2021-08-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-23T13:24:46-07:00 New Revision: 49d982d8c6e01b6f8e4f173ed6325beab08b URL: https://github.com/llvm/llvm-project/commit/49d982d8c6e01b6f8e4f173ed6325beab08b DIFF: https://github.com/llvm/llvm-project/commit/49d982d8c6e01b6f8e4f173ed6325beab08b.diff

[clang] 3db8e48 - [CUDA] Improve CUDA version detection and diagnostics.

2021-08-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-23T13:24:48-07:00 New Revision: 3db8e486e560183f064e31a228aada52fdeac5d6 URL: https://github.com/llvm/llvm-project/commit/3db8e486e560183f064e31a228aada52fdeac5d6 DIFF: https://github.com/llvm/llvm-project/commit/3db8e486e560183f064e31a228aada52fdeac5d6.diff

[clang] ce4545d - [CUDA] Bump the latest supported CUDA version to 11.4.

2021-08-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-23T13:24:49-07:00 New Revision: ce4545db1d31f447bb42987099d691d5658da4bf URL: https://github.com/llvm/llvm-project/commit/ce4545db1d31f447bb42987099d691d5658da4bf DIFF: https://github.com/llvm/llvm-project/commit/ce4545db1d31f447bb42987099d691d5658da4bf.diff

[clang] 4c40c03 - Fixed doc build.

2021-08-23 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-23T13:45:36-07:00 New Revision: 4c40c03b3933ce32a2b5f532810dc30f6f329fd4 URL: https://github.com/llvm/llvm-project/commit/4c40c03b3933ce32a2b5f532810dc30f6f329fd4 DIFF: https://github.com/llvm/llvm-project/commit/4c40c03b3933ce32a2b5f532810dc30f6f329fd4.diff

[clang] 5c24a1e - [CUDA] update constraints on NVPTX builtins to include PTX73 and 74.

2021-08-26 Thread Artem Belevich via cfe-commits
Author: Artem Belevich Date: 2021-08-26T16:01:57-07:00 New Revision: 5c24a1e1db63f1ac3a956458df5edf87fac7be49 URL: https://github.com/llvm/llvm-project/commit/5c24a1e1db63f1ac3a956458df5edf87fac7be49 DIFF: https://github.com/llvm/llvm-project/commit/5c24a1e1db63f1ac3a956458df5edf87fac7be49.diff

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: Looks like it was the clang-format check that GitHub was waiting on an approval for. I've just clicked that button; let's see what it brings. The patch is good to go otherwise, IMO. https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 >From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 >From 2a948e8803cd881937e9a121ca9fe9c4816e857e Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/104638

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM in general with a couple of nits. https://github.com/llvm/llvm-project/pull/104638

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols { return GPUBinHandleSymbols; } + // Collect symbols from static libraries specified by -l options. + void processStaticLibraries() { +llvm::SmallVector LibNames; +llvm::SmallVector LibPaths; +llvm::SmallVe

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/104460

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/102969

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Considering that we're adding another interesting quirk to how we interpret target attributes & function calls, it would be useful to run this by a language lawyer to make sure we're not missing something. @zygoloid - would you have time to take a look or

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/103031

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
@@ -115,20 +143,65 @@ static bool hasAttr(const Decl *D, bool IgnoreImplicitAttr) { }); } +SemaCUDA::CUDATargetContext::CUDATargetContext(SemaCUDA *S, Artem-B wrote: This could probably be moved into the header. https://github.com/llvm/llvm-projec

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
@@ -9017,6 +9017,10 @@ def err_global_call_not_config : Error< def err_ref_bad_target : Error< "reference to %select{__device__|__global__|__host__|__host__ __device__}0 " "%select{function|variable}1 %2 in %select{__device__|__global__|__host__|__host__ __device__}3 funct

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-20 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,216 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march nvptx64 --version 5 +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK +; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcp

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-22 Thread Artem Belevich via cfe-commits
Artem-B wrote: Buildkite failures are caused by lldb and are unrelated. We're good to go. https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: I can land the patch. The buildkite failures appear to be unrelated (something in lldb tests). Let's wait till clang format checks are done. https://github.com/llvm/llvm-project/pull/99646

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-08-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: This is very Windows-specific. @rnk -- would you have time to take a look? https://github.com/llvm/llvm-project/pull/101350

[clang] [llvm] [NVPTX] Remove nvvm.bitcast.* intrinsics (PR #107936)

2024-09-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/107936

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: The description of the flat address space in `TargetTransformInfo.h` is somewhat vague and both too specific and not precise enough, IMO: ``` The flat address space is a /// generic address space that can be used access multiple segments of memory /// with different addre

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I'm still concerned about the (no-)aliasing guarantees. It's useful to have > two non-flat address spaces that can alias, Another example for NVIDIA GPUs would be `.param` space. According to the [PTX spec](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#gen

[clang] [CUDA/HIP] fix propagate -cuid to a host-only compilation. (PR #111650)

2024-10-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This does not seem to be the right fix. I tend to think the test > https://github.com/ROCm/hip-tests/tree/amd-staging/samples/2_Cookbook/16_assembly_to_executable > needs a fix. Since it does not expect host-only compilation to use CUID, it > should add `-fuse-cuid=none` to the

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-10-09 Thread Artem Belevich via cfe-commits
@@ -16,15 +18,15 @@ // RUN: %clang -### -x hip --target=x86_64-unknown-linux-gnu -DX=1 --no-offload-new-driver \ // RUN: --offload-arch=gfx906 -c -nogpuinc -nogpulib -fuse-cuid=hash \ -// RUN: %S/Inputs/hip_multiple_inputs/a.cu >%t.out 2>&1 +// RUN: Inputs/hip_multiple_

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #112028)

2024-10-11 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/112028 >From 5dac14aab180fd965d996b47cf983b8c462fe703 Mon Sep 17 00:00:00 2001 From: Sergey Kozub Date: Tue, 2 Jul 2024 02:44:56 -0700 Subject: [PATCH] [CUDA] Add support for CUDA-12.6 and sm_100 --- clang/docs/Relea

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #112028)

2024-10-11 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/112028 This is a copy of #97402, which is now ready to land. >From 8f2122e6ff890320d66e6d9f3cc5327b897c25e9 Mon Sep 17 00:00:00 2001 From: Sergey Kozub Date: Tue, 2 Jul 2024 02:44:56 -0700 Subject: [PATCH] [CUDA] Add

[clang] [llvm] [NVPTX] Remove nvvm.ldg.global.* intrinsics (PR #112834)

2024-10-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. Please add a note about the intrinsic removal/deprecation to the release notes. https://github.com/llvm/llvm-project/pull/112834

[clang] [NvlinkWrapper] Use `-plugin-opt=mattr=` instead of a custom feature (PR #111712)

2024-10-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/111712

[clang] [llvm] [llvm][NVPTX] Strip unneeded '+0' in PTX load/store (PR #113017)

2024-10-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM overall, with a minor style nit. https://github.com/llvm/llvm-project/pull/113017

[clang] [llvm] [llvm][NVPTX] Strip unneeded '+0' in PTX load/store (PR #113017)

2024-10-19 Thread Artem Belevich via cfe-commits
@@ -363,6 +363,14 @@ void NVPTXInstPrinter::printMemOperand(const MCInst *MI, int OpNum, } } +void NVPTXInstPrinter::printOffseti32imm(const MCInst *MI, int OpNum, + raw_ostream &O, const char *Modifier) { + if (auto &Op = MI->getOp

[clang] [llvm] [llvm][NVPTX] Strip unneeded '+0' in PTX load/store (PR #113017)

2024-10-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/113017

[clang] [Cuda] Handle -fcuda-short-ptr even with -nocudalib (PR #111682)

2024-10-10 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I'm not sure why the option isn't enabled by default, personally While it does indeed help with generating better code, using this option while compiling CUDA code may be problematic. The front-end is not aware of address spaces and all pointers are generic, so `sizeof(any pointer

[clang] Reland "[HIP] Use original file path for CUID" (#108771) (PR #111885)

2024-10-10 Thread Artem Belevich via cfe-commits
@@ -1,13 +1,15 @@ // Check CUID generated by hash. // The same CUID is generated for the same file with the same options. +// RUN: cd %S + // RUN: %clang -### -x hip --target=x86_64-unknown-linux-gnu --no-offload-new-driver \ // RUN: --offload-arch=gfx906 -c -nogpuinc -no

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #112028)

2024-10-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/112028 >From 5dac14aab180fd965d996b47cf983b8c462fe703 Mon Sep 17 00:00:00 2001 From: Sergey Kozub Date: Tue, 2 Jul 2024 02:44:56 -0700 Subject: [PATCH 1/2] [CUDA] Add support for CUDA-12.6 and sm_100 --- clang/docs/R

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #112028)

2024-10-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/112028 >From 17989b287b28fa1234dc7aca726f973086b687f2 Mon Sep 17 00:00:00 2001 From: Sergey Kozub Date: Tue, 2 Jul 2024 02:44:56 -0700 Subject: [PATCH 1/2] [CUDA] Add support for CUDA-12.6 and sm_100 --- clang/docs/R

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #112028)

2024-10-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/112028

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-10-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/100247

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-10-24 Thread Artem Belevich via cfe-commits
Artem-B wrote: Closing the patch now, as we've figured out a way to move forward with simpler changes. https://github.com/llvm/llvm-project/pull/100247

[clang] Remove device override for operator new when the C++ standard >= 26 (PR #114056)

2024-10-29 Thread Artem Belevich via cfe-commits
@@ -91,12 +91,14 @@ __device__ inline void operator delete[](void *ptr, #endif // Device overrides for placement new and delete. +#if _LIBCPP_STD_VER < 26 Artem-B wrote: Yup. Reproducible on compiler explorer: https://godbolt.org/z/xr5nEhGnr https://github.c

[clang] Remove device override for operator new when the C++ standard >= 26 (PR #114056)

2024-10-29 Thread Artem Belevich via cfe-commits
@@ -91,12 +91,14 @@ __device__ inline void operator delete[](void *ptr, #endif // Device overrides for placement new and delete. +#if _LIBCPP_STD_VER < 26 Artem-B wrote: This helps with libc++. Do we run into this issue with libstdc++, too? If so, we may ch

[clang] [CUDA/HIP] fix propagate -cuid to a host-only compilation. (PR #111650)

2024-11-04 Thread Artem Belevich via cfe-commits
Artem-B wrote: What I'm saying is that whatever refers to the fatbin handle has to have the same idea about the name of that handle as the object file that provides that handle. For that, both have to be compiled with the same `cuid`. Normally, it's the clang driver that does all that under the hood. If

[clang] [LinkerWrapper] Remove special handling for archives (PR #114843)

2024-11-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/114843

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/114589 >From ac0790a431d94f78ee73e96fd97f9263192c3153 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Tue, 27 Aug 2024 16:16:14 -0700 Subject: [PATCH 1/2] [CUDA] Add support for __grid_constant__ attribute --- cl

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-04 Thread Artem Belevich via cfe-commits
@@ -1450,6 +1450,13 @@ def CUDAHost : InheritableAttr { } def : MutualExclusions<[CUDAGlobal, CUDAHost]>; +def CUDAGridConstant : InheritableAttr { + let Spellings = [GNU<"grid_constant">, Declspec<"__grid_constant__">]; + let Subjects = SubjectList<[ParmVar]>; + let LangOp

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-04 Thread Artem Belevich via cfe-commits
@@ -1093,6 +1097,169 @@ inline void FPOptions::applyChanges(FPOptionsOverride FPO) { *this = FPO.applyOverrides(*this); } +/// Atomic control options +class AtomicOptionsOverride; +class AtomicOptions { +public: + using storage_type = uint16_t; + + static constexpr unsign

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-04 Thread Artem Belevich via cfe-commits
@@ -1,8 +1,17 @@ bool Sema::CheckFunctionDeclaration(Scope *S, FunctionDecl *NewFD, << NewFD; } -if (!Redeclaration && LangOpts.CUDA) +if (!Redeclaration && LangOpts.CUDA) { Artem-B wrote: I deliberately decided *not* to do th

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-04 Thread Artem Belevich via cfe-commits
@@ -1,8 +1,17 @@ bool Sema::CheckFunctionDeclaration(Scope *S, FunctionDecl *NewFD, << NewFD; } -if (!Redeclaration && LangOpts.CUDA) +if (!Redeclaration && LangOpts.CUDA) { Artem-B wrote: My mental model of whether we should

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/114589

[clang] Remove device override for operator new when the C++ standard >= 26 (PR #114056)

2024-10-30 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/114056

[clang] [CUDA/HIP] fix propagate -cuid to a host-only compilation. (PR #111650)

2024-11-01 Thread Artem Belevich via cfe-commits
Artem-B wrote: > now it is complaining about __hip_fatbin earlier it was > __hip_gpubin_handle_2ba9067058fbe93a. In both cases there's some sort of inconsistency in your build. Find the compilation which creates the object file which refers to the missing symbol, and then we can try figuring

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/114589 LLVM support for the attribute has been implemented already, so this patch just plumbs it through to the CUDA front-end. One notable difference from NVCC is that the attribute can be used regardless of the targeted GP
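
For readers unfamiliar with the attribute, here is a minimal sketch of how `__grid_constant__` is typically applied to a kernel parameter. It is an illustration only and is not taken from the patch: the `Params` struct, `scale_kernel`, and the `scale_first_element` helper are hypothetical names, and the sketch assumes a CUDA toolkit whose headers define the `__grid_constant__` macro.

```cuda
#include <cuda_runtime.h>

// Hypothetical parameter block passed to the kernel by value.
struct Params {
  float scale;
  int n;
};

// The attribute applies to a const-qualified, by-value kernel parameter.
// It tells the compiler the parameter is read-only for the whole grid,
// so its address can be taken without forcing a per-thread local copy.
__global__ void scale_kernel(const __grid_constant__ Params p, float *data) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < p.n)
    data[i] *= p.scale;
}

// Hypothetical host helper: scales a single value on the GPU and returns it.
// Error checking is omitted to keep the sketch short.
float scale_first_element(float value, float factor) {
  float *d_data = nullptr;
  cudaMalloc(&d_data, sizeof(float));
  cudaMemcpy(d_data, &value, sizeof(float), cudaMemcpyHostToDevice);

  Params p{factor, 1};
  scale_kernel<<<1, 32>>>(p, d_data);

  float result = 0.0f;
  cudaMemcpy(&result, d_data, sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_data);
  return result;
}
```

The host code does not change when the attribute is added; only the kernel signature differs.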

[clang] [NvlinkWrapper] Add support for `--undefined` (PR #113934)

2024-10-28 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/113934

[clang] [Clang] Add `-fdefault-generic-addrspace` flag for targeting GPUs (PR #115777)

2024-11-11 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This has a lot of unfortable effects that limit using address spaces in C++ > as well > as making it more difficult to work with. Can you give some examples? It sounds like what you really want is for address space qualifiers to not be part of a type signature. OpenCL sort o

[clang] [Clang] Add `-fdefault-generic-addrspace` flag for targeting GPUs (PR #115777)

2024-11-11 Thread Artem Belevich via cfe-commits
Artem-B wrote: I think I generally agree with @AlexVlx's argument. While the patch may solve your immediate issue, I think it's not going to give you a usable compilation model for AS-qualified pointers. If you are defining your own C++ extension along the lines of CUDA/HIP/OpenCL, you would hav

[clang] [CUDA][HIP] Fix host/device context in concept (PR #67721)

2024-11-13 Thread Artem Belevich via cfe-commits
Artem-B wrote: While I'm not sure how concepts will work in CUDA in the end, I'm OK with fixing obviously wrong behaviors now, without potentially painting ourselves into a corner. The key issue in the example in https://godbolt.org/z/o7Wa68n9c seems to be that CUDA's context-aware overload res

[clang] [nvlink-wrapper] Use a symbolic link instead of copying the file (PR #110139)

2024-09-26 Thread Artem Belevich via cfe-commits
Artem-B wrote: @rnk Are symlinks OK to use on Windows? https://github.com/llvm/llvm-project/pull/110139

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-26 Thread Artem Belevich via cfe-commits
Artem-B wrote: Well, it's certainly used that way in existing CUDA code and it's been around forever. Here are a few random examples. One from 10 years ago: https://stackoverflow.com/questions/20535683/cuda-5-5-cudamemcpytosymbol-constant-and-out-of-scope-error and a fairly recent example: https

[clang] [nvlink-wrapper] Use a symbolic link instead of copying the file (PR #110139)

2024-09-26 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/110139

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-26 Thread Artem Belevich via cfe-commits
Artem-B wrote: `__constant__` may not necessarily be `const` for IR purposes. I.e. IR may not rely on the 'known' values, as seen in IR, as the data may actually be populated by the host via CUDA API calls `cudaMemcpyToSymbol` before the GPU kernel launch. https://github.com/llvm/llvm-project
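
The usage pattern described in this message can be sketched as follows. This is an illustration under stated assumptions and not code from the thread: the variable `scale`, the kernel `read_scale`, and the helper `read_scale_on_device` are hypothetical names. It shows a `__constant__` variable whose initializer is only a default that the host replaces with `cudaMemcpyToSymbol` before the launch.

```cuda
#include <cuda_runtime.h>

// Constant-memory variable. The initializer is only a default value;
// the host is allowed to overwrite it before a kernel launch.
__constant__ float scale = 1.0f;

// The kernel must load `scale` at run time. If the compiler treated the
// initializer as a known constant, it would write 1.0f unconditionally.
__global__ void read_scale(float *out) { *out = scale; }

// Hypothetical host helper: stores host_value into `scale`, launches the
// kernel, and returns what the GPU observed. Error checking is omitted.
float read_scale_on_device(float host_value) {
  float *d_out = nullptr;
  cudaMalloc(&d_out, sizeof(float));

  // Populate the constant from the host before the launch.
  cudaMemcpyToSymbol(scale, &host_value, sizeof(float));
  read_scale<<<1, 1>>>(d_out);

  float result = 0.0f;
  cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return result;
}
```

Code written this way relies on the kernel observing the host-provided value, which is why the IR cannot fold the initializer as if the variable were an ordinary `const` global.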

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-26 Thread Artem Belevich via cfe-commits
Artem-B wrote: It has nothing to do with writing to those arrays while the kernel is running. That would indeed be UB. > both would still work just the same even with this change, No, they will not. Here's the demonstration of the behavior change that `const` brings to the table: https://cuda

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-26 Thread Artem Belevich via cfe-commits
Artem-B wrote: I'm not 100% sure that `externally_initialized` is sufficient to deal with this use pattern. IR manual says: https://llvm.org/docs/LangRef.html#global-variables > By default, global initializers are optimized by assuming that global > variables defined within the module are not

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/110182

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-27 Thread Artem Belevich via cfe-commits
Artem-B wrote: > In this case, _ZL4cxxx does not have externally_initialized . If this patch > does not remove externally_initialized, probably this constant propagation > won't happen. Indeed, unoptimized code shows that `cxxx` has no `externally_initialized`, only `constant`. If we keep e

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-09-30 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,109 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang -xhip --offload-arch=gfx1030 --offload-host-only -pedantic -nogpuinc -nogpulib -nobuiltininc -fsyntax-only -Xclang -verify=onhost %s +// RUN: %clang -xhip --offload-arch=gfx1030 --offload-device-only -pedant

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-09-30 Thread Artem Belevich via cfe-commits
Artem-B wrote: I'm curious why those macros are even defined on the host. It looks like these macros should be handled in a way similar to `__HIP_ARCH__`. https://github.com/llvm/llvm-project/pull/109663

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-01 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This patch checks for numeric literals in clearly identifiable host code if > they are the result of expanding the wavefront-size macros and issues a > diagnostic if that's the case. What's the ultimate goal here? If we're OK to warn on some obvious misuses, then it may do.

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-02 Thread Artem Belevich via cfe-commits
Artem-B wrote: Unless HIP explicitly defines a wavefront size property for the host (I do not think it does), it would appear that it's a property of a GPU, and as such should not be treated as a constant on the host, because the host needs to deal with multiple GPU variants, with different ideas of t
