[clang] [HIP] Allow partial linking for `-fgpu-rdc` (PR #81700)

2024-02-14 Thread Artem Belevich via cfe-commits
@@ -36,6 +47,146 @@ static std::string normalizeForBundler(const llvm::Triple &T, : T.normalize(); } +// Collect undefined __hip_fatbin* and __hip_gpubin_handle* symbols from all +// input object or archive files. +class HIPUndefinedFatBinSymbols { +publi

[clang] [HIP] Allow partial linking for `-fgpu-rdc` (PR #81700)

2024-02-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. Overall LGTM. Please wait for @jhuber6 to double-check the details of the partial linking mechanics. https://github.com/llvm/llvm-project/pull/81700 ___ cfe-commits mailing list cfe-commits@lists.llvm.

[clang] [Clang][NVPTX] Allow passing arguments to the linker while standalone (PR #73030)

2024-02-20 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/73030 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [NVPTX] Enable the _Float16 type for NVPTX compilation (PR #82436)

2024-02-20 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/82436

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #97402)

2024-07-08 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This PR is redundant, closing. I think the patch was perfectly fine. Considering that other NVIDIA open-source projects already mention sm_100 (E.g. https://github.com/NVIDIA/cccl/blob/5efe53dbd71ea3e4bc4fdbb73edc001e0bf81547/libcudacxx/include/nv/detail/__target_macros#L241),

[clang] [CUDA/HIP] propagate -cuid to a host-only compilation. (PR #107483)

2024-09-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/107483 Right now we're bailing out too early, and `-cuid` does not get set for host-only compilations. From 52a27293d1c93a7ed4dcef845f705808afa3c273 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 5 Se

[clang] [CUDA/HIP] propagate -cuid to a host-only compilation. (PR #107483)

2024-09-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/107483

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-09-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/107734

[clang] [llvm] [NVPTX] Remove nvvm.bitcast.* intrinsics (PR #107936)

2024-09-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: It may be worth adding a note about this in the release notes. https://github.com/llvm/llvm-project/pull/107936

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/102661 Fixes https://github.com/llvm/llvm-project/issues/101560 From 6ee0add21bd2a9b25d28640c91de2fc6dab7fa72 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH] [CUDA]

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/102661 From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH] [CUDA] Emit used function list in deterministic order. Fixes ht

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
@@ -950,6 +950,9 @@ void CodeGenModule::Release() { UsedArray.push_back(llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( GetAddrOfGlobal(GD), Int8PtrTy)); } +// Sort decls by name to always emit them in deterministic order. Artem-B
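
The comment added in this hunk describes sorting declarations by name so that they are always emitted in the same order. A minimal C++ sketch of that idea follows. It assumes the names start out in a container with unspecified iteration order; the helper `emitInNameOrder` and the use of `std::unordered_set` are hypothetical illustrations, not code from the patch.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the patch): returns the names in a stable,
// sorted order regardless of how the input container happens to iterate.
std::vector<std::string>
emitInNameOrder(const std::unordered_set<std::string> &Names) {
  // Copy the names out first; unordered containers give no ordering guarantee.
  std::vector<std::string> Sorted(Names.begin(), Names.end());
  // Sorting by name makes the emitted list independent of hashing or
  // pointer values, so repeated builds produce identical output.
  std::sort(Sorted.begin(), Sorted.end());
  return Sorted;
}
```

Sorting costs O(n log n) per emission, which is usually negligible next to the benefit of reproducible output.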

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/102661

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/102661 From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH 1/2] [CUDA] Emit used function list in deterministic order. Fixe

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/102661

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: OK, I've reworked the patch, and it appears to correctly propagate arbitrary SM/PTX versions from clang, down to the LLVM and generated PTX, and to ptxas and fatbinary command line options. PTAL. https://github.com/llvm/llvm-project/pull/100247

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/4] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH] [CUDA] Add a pseudo GPU sm_next which allows overrides for SM/

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a couple of nits. https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -968,6 +971,39 @@ __device__ void nvvm_cvt_sm80() { // CHECK: ret void } +// CHECK-LABEL: nvvm_cvt_sm89 +__device__ void nvvm_cvt_sm89() { +#if __CUDA_ARCH__ >= 890 + // CHECK_PTX81_SM89: call i16 @llvm.nvvm.ff.to.e4m3x2.rn(float 1.00e+00, float 1.00e+00) + __n

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-14 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/102969

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [Offload] Do not pass `-fcf-protection=` for offloading (PR #88402)

2024-04-12 Thread Artem Belevich via cfe-commits
@@ -6867,8 +6867,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back("-nogpulib"); if (Arg *A = Args.getLastArg(options::OPT_fcf_protection_EQ)) { -CmdArgs.push_back( -Args.MakeArgString(Twine("-fcf-protection=") + A->getVal

[clang] [Offload] Do not pass `-fcf-protection=` for offloading (PR #88402)

2024-04-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/88402

[clang] [Offload] Do not pass `-fcf-protection=` for offloading (PR #88402)

2024-04-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: LGTM in principle. https://github.com/llvm/llvm-project/pull/88402

[clang] [Offload] Do not pass `-fcf-protection=` for offloading (PR #88402)

2024-04-12 Thread Artem Belevich via cfe-commits
@@ -6867,8 +6867,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back("-nogpulib"); if (Arg *A = Args.getLastArg(options::OPT_fcf_protection_EQ)) { -CmdArgs.push_back( -Args.MakeArgString(Twine("-fcf-protection=") + A->getVal

[clang] [clang] Introduce `SemaCUDA` (PR #88559)

2024-04-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. The changes appear to be mechanical in nature, so `check clang` tests should be sufficient to verify we've re-connected things correctly. https://github.com/llvm/llvm-project/pull/88559

[clang] [llvm] [CUDA] Mark CUDA-12.5 as supported and introduce ptx 8.5. (PR #94113)

2024-06-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/94113

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I've also re-enabled this pass in the TM, it was disabled years ago due to > "numerical discrepancies" https://reviews.llvm.org/D96166. In our testing we > haven't seen any issues with adding ranges to intrinsics, and I cannot find > any further info about what issues were enc

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -128,6 +128,15 @@ bool findOneNVVMAnnotation(const GlobalValue *gv, const std::string &prop, return true; } +static std::optional +findOneNVVMAnnotation(const GlobalValue &GV, const std::string &PropName) { + unsigned RetVal; + bool Found = findOneNVVMAnnotation(&GV, P
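
The overload added in this hunk appears to wrap the existing bool-plus-out-parameter lookup in a `std::optional` return value. A minimal C++17 sketch of that wrapping pattern follows. The `lookupLegacy` and `lookup` functions, the `maxntidx` key, and the value `128` are all hypothetical stand-ins, since the body of the real `findOneNVVMAnnotation` is not shown here; the `unsigned` payload type is inferred from the `unsigned RetVal` local in the snippet.

```cpp
#include <optional>
#include <string>

// Hypothetical legacy-style lookup: reports success through the return value
// and writes the result through an out-parameter.
bool lookupLegacy(const std::string &Prop, unsigned &RetVal) {
  if (Prop == "maxntidx") {
    RetVal = 128; // Arbitrary value chosen for illustration only.
    return true;
  }
  return false;
}

// Wrapper that folds the "found" flag and the value into one std::optional,
// so callers cannot read the value without first checking that it exists.
std::optional<unsigned> lookup(const std::string &Prop) {
  unsigned RetVal;
  if (!lookupLegacy(Prop, RetVal))
    return std::nullopt;
  return RetVal;
}
```

A caller can then write `if (auto V = lookup("maxntidx")) use(*V);` instead of declaring a temporary and checking a separate flag.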

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/94422

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Nice. https://github.com/llvm/llvm-project/pull/94422

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -1,50 +1,51 @@ -//===- NVVMIntrRange.cpp - Set !range metadata for NVVM intrinsics ===// +//===- NVVMIntrRange.cpp - Set range attributes for NVVM intrinsics ---===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See ht

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,60 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5 +; RUN: opt < %s -S -mtriple=nvptx-nvidia-cuda -mcpu=sm_20 -passes=nvvm-intr-range | FileCheck %s + +define i32 @test_maxntid() { +; CHECK-LABEL:

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/94422

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/94422

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -1,15 +1,15 @@ ; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck -allow-deprecated-dag-overlap %s ; RUN: llc < %s -march=nvptx64 -mcpu=sm_20 | FileCheck -allow-deprecated-dag-overlap %s ; RUN: opt < %s -S -mtriple=nvptx-nvidia-cuda -passes=nvvm-intr-range \ -; RUN: |

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -139,24 +138,23 @@ define ptx_device i32 @test_ctaid_w() { define ptx_device i32 @test_nctaid_y() { ; CHECK: mov.u32 %r{{[0-9]+}}, %nctaid.y; -; RANGE: call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y(), !range ![[GRID_SIZE_YZ:[0-9]+]] +; RANGE: call range(i32 1, 65536) i32 @llv

[clang] [llvm] [NVPTX] Revamp NVVMIntrRange pass (PR #94422)

2024-06-05 Thread Artem Belevich via cfe-commits
@@ -139,24 +138,23 @@ define ptx_device i32 @test_ctaid_w() { define ptx_device i32 @test_nctaid_y() { ; CHECK: mov.u32 %r{{[0-9]+}}, %nctaid.y; -; RANGE: call i32 @llvm.nvvm.read.ptx.sreg.nctaid.y(), !range ![[GRID_SIZE_YZ:[0-9]+]] +; RANGE: call range(i32 1, 65536) i32 @llv

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/77359

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with some wording/naming nits. https://github.com/llvm/llvm-project/pull/77359

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
@@ -9013,6 +9013,12 @@ def err_cuda_ovl_target : Error< "cannot overload %select{__device__|__global__|__host__|__host__ __device__}2 function %3">; def note_cuda_ovl_candidate_target_mismatch : Note< "candidate template ignored: target attributes do not match">; +def wa

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
@@ -9013,6 +9013,12 @@ def err_cuda_ovl_target : Error< "cannot overload %select{__device__|__global__|__host__|__host__ __device__}2 function %3">; def note_cuda_ovl_candidate_target_mismatch : Note< "candidate template ignored: target attributes do not match">; +def wa

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: Ooh... I think I know exactly what may be causing this. On machines where NVIDIA GPUs are used for compute only (e.g. a headless server machine), NVIDIA drivers are not always loaded by default and may not have driver persistence enabled. The drivers get loaded when GPU is acces

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: > What's the config to set this by default without any graphics? https://docs.nvidia.com/deploy/driver-persistence/index.html I usually use "nvidia-smi -i -pm ENABLED" to force the driver to be loaded permanently. As for `__nvcc_device_query`, my guess is that it just uses a

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/94549

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -,17 +6684,26 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, break; } } else { +if (Args.hasFlag(options::OPT_foffload_via_llvm, + options::OPT_fno_offload_via_llvm, false)) + Args.AddLastArg(CmdArgs, options::O

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA, CmdArgs.push_back("__clang_openmp_device_functions.h"); } + if (Args.hasArg(options::OPT_foffload_via_llvm)) { +// Add llvm_wrappers/* to our system include path. This

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM in principle. Will kernels in TUs compiled with `-foffload-via-llvm` be interoperable with code that wants to launch them from another TU compiled w/o `-foffload-via-llvm` ? E.g.: - a.cu: `__global__ void kernel() { ... }` - b.cu: `e

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,31 @@ +/*===-- LLVM/Offload helpers for kernel languages (CUDA/HIP) -*- c++ -*-=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Apach

[clang] [Clang][HIP] Suppress availability diagnostics for mismatched host/device overloads (PR #93546)

2024-05-28 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Therefore, the following code would cause a deprecation warning during host > compilation, even though val is only used as part of a device function: This is where we may need help from @zygoloid. > __attribute__((device)) std::enable_if<(val() > 0), int>::type fun(void) Here

[clang] [CUDA][HIP] Fix std::min in wrapper header (PR #93976)

2024-05-31 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM for following the standard c++ library. Floating point is an endless stream of surprises. E.g. https://pixorblog.wordpress.com/2016/06/27/some-remarks-about-minmax-functions/ > For instance, min() is not commutative and is not equiv
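
The remark that `min()` is not commutative for floating point can be seen directly with NaN operands. A minimal C++ sketch of the standard library behavior follows; the function names `minFirstNaN` and `minSecondNaN` are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// std::min(a, b) returns b only when b < a. Every ordered comparison that
// involves NaN is false, so the first argument is returned whenever either
// operand is NaN.

// NaN in the first position: the result is NaN.
double minFirstNaN() {
  return std::min(std::numeric_limits<double>::quiet_NaN(), 1.0);
}

// NaN in the second position: the result is 1.0.
double minSecondNaN() {
  return std::min(1.0, std::numeric_limits<double>::quiet_NaN());
}
```

Swapping the arguments therefore changes the result, which is the kind of surprise the comment refers to. This sketch assumes IEEE 754 semantics and no fast-math style compiler options.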

[clang] [llvm] [CUDA] Mark CUDA-12.5 as supported and introduce ptx 8.5. (PR #94113)

2024-06-03 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/94113

[clang] [clang] Fix name conflict with `sys/mac.h` on AIX (PR #88644)

2024-04-15 Thread Artem Belevich via cfe-commits
@@ -50,6 +50,10 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +// We have a name conflict with sys/mac.h on AIX +#ifdef SM_32 +#undef SM_32 +#endif Artem-B wrote: Ugh. What

[clang] [clang] Fix name conflict with `sys/mac.h` on AIX (PR #88644)

2024-04-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/88644

[clang] [clang] Fix name conflict with `sys/mac.h` on AIX (PR #88644)

2024-04-15 Thread Artem Belevich via cfe-commits
@@ -50,6 +50,10 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +// We have a name conflict with sys/mac.h on AIX +#ifdef SM_32 +#undef SM_32 +#endif Artem-B wrote: > We could

[clang] [clang] Fix name conflict with `sys/mac.h` on AIX (PR #88644)

2024-04-15 Thread Artem Belevich via cfe-commits
@@ -50,6 +50,10 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +// We have a name conflict with sys/mac.h on AIX +#ifdef SM_32 +#undef SM_32 +#endif Artem-B wrote: Deprecatin

[clang] [clang] Fix name conflict with `sys/mac.h` on AIX (PR #88644)

2024-04-15 Thread Artem Belevich via cfe-commits
@@ -50,6 +50,10 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +// We have a name conflict with sys/mac.h on AIX +#ifdef SM_32 +#undef SM_32 +#endif Artem-B wrote: SGTM. Than

[clang] [CUDA] Rename SM_32 to SM_32_ to work around AIX headers (PR #88779)

2024-04-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/88779

[clang] [CUDA] Rename SM_32 to SM_32_ to work around AIX headers (PR #88779)

2024-04-15 Thread Artem Belevich via cfe-commits
@@ -86,7 +88,7 @@ static const CudaArchToStringMap arch_names[] = { // clang-format off {CudaArch::UNUSED, "", ""}, SM2(20, "compute_20"), SM2(21, "compute_20"), // Fermi -SM(30), SM(32), SM(35), SM(37), // Kepler +SM(30), SM3(32, "compute_32"), SM(35), SM(
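
The rename from `SM_32` to `SM_32_` sidesteps a macro of the same name coming from a system header. A minimal C++ sketch of why the trailing underscore helps follows. The `#define` below only simulates such a header, its value is arbitrary, and the `Arch` enum and `archIndex` helper are hypothetical, not clang's actual definitions.

```cpp
// Simulates a system header that defines SM_32 as an object-like macro.
// The value is arbitrary; only the name matters for the collision.
#define SM_32 0x20

// With the macro visible, an enumerator spelled SM_32 would be rewritten by
// the preprocessor into `0x20` and fail to compile. The trailing underscore
// gives the enumerator a name the macro does not match.
enum class Arch { SM_30, SM_32_, SM_35 };

// Hypothetical helper showing that the renamed enumerator is usable as normal.
int archIndex(Arch A) { return static_cast<int>(A); }
```

Compared with an `#undef` in the project header, renaming leaves the system macro intact for any other code that depends on it.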

[clang] [CUDA][HIP] Fix record layout on Windows (PR #87651)

2024-04-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/87651

[clang] [Clang] Correctly forward `--cuda-path` to the nvlink wrapper (PR #100170)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100170

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: @nico: > Why do we need a new binary for this, instead of having something like `clang > -cc1_nvlink` that calls a custom mode within clang? Do we have existing precedents for such built-in tools, other than cc1 itself? If the linker wrapper can be part of clang itself, it woul

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/100247 Sometimes users may need to use older clang with newer SM/PTX versions which clang does not know anything about, yet. --offload-arch=sm_next, combined with --cuda-next-sm=X and --cuda-next-ptx=Y allows passing

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #97402)

2024-07-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: @sergey-kozub FYI, https://github.com/llvm/llvm-project/pull/100247 should allow forward-testing CUDA w/o relying on specific GPU/PTX variant being hardcoded in clang. https://github.com/llvm/llvm-project/pull/97402

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
@@ -96,6 +96,7 @@ static const OffloadArchToStringMap arch_names[] = { SM(89), // Ada Lovelace SM(90), // Hopper SM(90a), // Hopper +SM(next),// Placeholder for a n

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
@@ -648,6 +658,13 @@ void NVPTX::getNVPTXTargetFeatures(const Driver &D, const llvm::Triple &Triple, Features.push_back(Args.MakeArgString(PtxFeature)); return; } + // Add --cuda-next-ptx to the list of features, but carry on to add the + // default PTX feature for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/100247

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-24 Thread Artem Belevich via cfe-commits
@@ -26,24 +27,38 @@ static cl::opt NoF16Math("nvptx-no-f16-math", cl::Hidden, cl::desc("NVPTX Specific: Disable generation of f16 math ops."), cl::init(false)); +static cl::opt +NextSM("nvptx-next-sm", cl::Hidden, + cl::desc("NVPTX

[clang] [NVPTX] Restore old va_list builtin type (PR #100438)

2024-07-24 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100438

[clang] [NVPTX] Correctly forward the PTX feature to the nvlink wrapper (PR #100607)

2024-07-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: The patch seems to change only the test file. Should there be more changes in the patch? https://github.com/llvm/llvm-project/pull/100607

[clang] [NVPTX] Correctly forward the PTX feature to the nvlink wrapper (PR #100607)

2024-07-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100607

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tools works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: First batch of comments on the patch -- I only got till about the middle of ClangNVLinkWrapper.cpp. Will continue reviewing tomorrow. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tool works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM for the patch in general, though I can't vouch for the details of the linking process. I'll defer to @MaskRay on that. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tool works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -504,18 +511,23 @@ Expected clang(ArrayRef InputFiles, const ArgList &Args) { llvm::copy(LinkerArgs, std::back_inserter(CmdArgs)); } - // Pass on -mllvm options to the clang invocation. - for (const opt::Arg *Arg : Args.filtered(OPT_mllvm)) { -CmdArgs.push_back

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -504,18 +511,23 @@ Expected clang(ArrayRef InputFiles, const ArgList &Args) { llvm::copy(LinkerArgs, std::back_inserter(CmdArgs)); } - // Pass on -mllvm options to the clang invocation. - for (const opt::Arg *Arg : Args.filtered(OPT_mllvm)) { -CmdArgs.push_back

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-28 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/99646

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96015
