from:"Yaxun Liu via cfe-commits"

[clang] [CUDA][HIP] improve error message for missing cmath (PR #122155)

2025-01-09 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/122155 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [CUDA][HIP] improve error message for missing cmath (PR #122155)

2025-01-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/122155 One common error seen in CUDA/HIP compilation is: fatal error: 'cmath' file not found, which is due to improper installation of the standard C++ libraries. Since it happens with #include_next, users may feel confu

[clang] [Clang] __has_builtin should return false for aux triple builtins (PR #121839)

2025-01-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > > I am afraid this will break all existing CUDA/HIP programs since they > > expect to be able to parse the builtins for both host and device targets. > > In the spirit of single source, the compiler sees the entire code for all > > targets, including host target and all device

[clang] [Clang] __has_builtin should return false for aux triple builtins (PR #121839)

2025-01-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: I am afraid this will break all existing CUDA/HIP programs since they expect to be able to parse the builtins for both host and device targets. In the spirit of single source, the compiler sees the entire code for all targets, including host target and all device targets. It is

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > This needs a release note added https://github.com/llvm/llvm-project/pull/121986

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/121986 >From ae55b59e9e7d944b02ce0059f879718fd733c301 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Tue, 7 Jan 2025 13:52:09 -0500 Subject: [PATCH] [CUDA][HIP] Fix overriding of constexpr virtual function In

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/121986 >From fa0df07b80b0f704f4e10fa1ec468fa6ed02291a Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Tue, 7 Jan 2025 13:52:09 -0500 Subject: [PATCH] [CUDA][HIP] Fix overriding of constexpr virtual function In

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

@@ -1595,8 +1606,21 @@ static bool IsOverloadOrOverrideImpl(Sema &SemaRef, FunctionDecl *New, // Allow overloading of functions with same signature and different CUDA // target attributes. -if (NewTarget != OldTarget) +if (NewTarget != OldTarg

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

@@ -1309,6 +1309,16 @@ Sema::CheckOverload(Scope *S, FunctionDecl *New, const LookupResult &Old, return Ovl_Overload; } +template static bool hasExplicitAttr(const FunctionDecl *D) { + if (!D) +return false; + if (auto *A = D->getAttr()) +return !A->isImplicit();

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-08 Thread Yaxun Liu via cfe-commits

@@ -1309,6 +1309,16 @@ Sema::CheckOverload(Scope *S, FunctionDecl *New, const LookupResult &Old, return Ovl_Overload; } +template static bool hasExplicitAttr(const FunctionDecl *D) { + if (!D) +return false; yxsamliu wrote: will do https://github.co

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-07 Thread Yaxun Liu via cfe-commits

@@ -1595,8 +1606,21 @@ static bool IsOverloadOrOverrideImpl(Sema &SemaRef, FunctionDecl *New, // Allow overloading of functions with same signature and different CUDA // target attributes. -if (NewTarget != OldTarget) +if (NewTarget != OldTarg

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-07 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/121986

[clang] [CUDA][HIP] Fix overriding of constexpr virtual function (PR #121986)

2025-01-07 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/121986 In C++20, constexpr virtual functions are allowed. In C++17, although a non-pure virtual function is not allowed to be constexpr, a pure virtual function is allowed to be constexpr and is allowed to be overridden by no

[clang] [clang][CodeGen][SPIRV] Translate `amdgpu_flat_work_group_size` into `reqd_work_group_size`. (PR #116820)

2025-01-06 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/116820

[clang] [HIP] Fix tests broken by #117074 / 689c532 (PR #117361)

2024-11-22 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/117361

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-22 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM. Thanks! https://github.com/llvm/llvm-project/pull/117074

[clang] [clang][CodeGen][SPIRV] Translate `amdgpu_flat_work_group_size` into `reqd_work_group_size`. (PR #116820)

2024-11-21 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: reqd_work_group_size is for OpenCL reqd_work_group_size attribute and it sets exact block size. amdgpu-flat-work-group-size sets a (min, max) range for block size. HIP launch bounds sets a block size range (1, bound). It cannot be represented by reqd_work_group_size. https:/
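The distinction drawn in the comment above can be illustrated with a minimal C++ sketch. The type `BlockSizeRange` and all helper names are hypothetical, made up for this illustration; they are not Clang, OpenCL, or HIP APIs. The sketch only models the three constraints as (min, max) ranges on the flat block size, under the assumption that an exact size is a range whose two ends coincide.

```cpp
// Hypothetical model: every constraint becomes a (min, max) range on the
// flat block size (the product of the three dimensions).
struct BlockSizeRange {
  unsigned min;
  unsigned max;
};

// OpenCL-style reqd_work_group_size(x, y, z): an exact size.
constexpr BlockSizeRange fromReqdWorkGroupSize(unsigned x, unsigned y,
                                               unsigned z) {
  return BlockSizeRange{x * y * z, x * y * z};
}

// amdgpu-flat-work-group-size: an explicit (min, max) range.
constexpr BlockSizeRange fromFlatWorkGroupSize(unsigned min, unsigned max) {
  return BlockSizeRange{min, max};
}

// HIP launch bounds: the range (1, bound).
constexpr BlockSizeRange fromLaunchBounds(unsigned bound) {
  return BlockSizeRange{1, bound};
}

// True when the range pins the block size to a single value.
constexpr bool isExact(BlockSizeRange r) { return r.min == r.max; }

// True when a launch with `size` threads per block satisfies the range.
constexpr bool allows(BlockSizeRange r, unsigned size) {
  return size >= r.min && size <= r.max;
}
```

In this model a launch bound of 256 admits a block of 64 threads, while an exact size of 16x16x1 does not, which is why a range cannot in general be expressed as an exact size.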

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-11-20 Thread Yaxun Liu via cfe-commits

@@ -6405,7 +6424,12 @@ const ToolChain &Driver::getToolChain(const ArgList &Args, TC = std::make_unique(*this, Target, Args); break; case llvm::Triple::AMDHSA: - TC = std::make_unique(*this, Target, Args); + TC = + llvm::any_of(Inputs, +

[clang] [CUDA] pass -fno-threadsafe-statics to GPU sub-compilations. (PR #117074)

2024-11-20 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > @yxsamliu -- should I add it for HIP, too? Yes please. I would appreciate that. Thanks. https://github.com/llvm/llvm-project/pull/117074

[clang] [clang][CodeGen][SPIRV] Translate `amdgpu_flat_work_group_size` into `reqd_work_group_size`. (PR #116820)

2024-11-19 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > > > > reqd_work_group_size is for OpenCL reqd_work_group_size attribute and > > > > it sets exact block size. amdgpu-flat-work-group-size sets a (min, max) > > > > range for block size. > > > > HIP launch bounds sets a block size range (1, bound). It cannot be > > > > represe

[clang] [clang][CodeGen][SPIRV] Translate `amdgpu_flat_work_group_size` into `reqd_work_group_size`. (PR #116820)

2024-11-19 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > > reqd_work_group_size is for OpenCL reqd_work_group_size attribute and it > > sets exact block size. amdgpu-flat-work-group-size sets a (min, max) range > > for block size. > > HIP launch bounds sets a block size range (1, bound). It cannot be > > represented by reqd_work_gr

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > I feel like this is a workaround. Can we not fix the "limitation of the > driver" and if there is an issue with HSA overhead, shouldn't we file a > ticket? I think there was already a ticket but it cannot be fixed easily. https://github.com/llvm/llvm-project/pull/116651

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-18 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/115545

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: I think we could use this approach for Linux. amdgpu-arch can still use HIP runtime to detect GPU for both Linux and Windows, so we still have a fallback even if we remove the HSA approach. https://github.com/llvm/llvm-project/pull/116651

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > > > @jhuber6 can you comment on "lot of overhead" and if that matters? Also, > > > not sure why the HSA library dependence is a problem. This seems to be > > > exposing amdgpu-arch to more maintenance overhead. > > > > > > Sometimes the driver will hang and since this is use

[clang] [clang] [NFC] Merge two ifs to a single one (PR #116226)

2024-11-15 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/116226

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-11-14 Thread Yaxun Liu via cfe-commits

@@ -6405,7 +6424,12 @@ const ToolChain &Driver::getToolChain(const ArgList &Args, TC = std::make_unique(*this, Target, Args); break; case llvm::Triple::AMDHSA: - TC = std::make_unique(*this, Target, Args); + TC = + llvm::any_of(Inputs, +

[clang] [CUDA][HIP] Fix host/device context in concept (PR #67721)

2024-11-13 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > Is this still relevant? Yes. The issue still exists and my arguments still hold. https://github.com/llvm/llvm-project/pull/67721

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-11 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: LGTM https://github.com/llvm/llvm-project/pull/115545

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-07 Thread Yaxun Liu via cfe-commits

@@ -0,0 +1,30 @@ +// RUN: %clang_cc1 -fsyntax-only -verify %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s \ +// RUN: -fatomic=no_fine_grained_memory:off,no_remote_memory:on,ignore_denormal_mode:on +

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-07 Thread Yaxun Liu via cfe-commits

@@ -0,0 +1,30 @@ +// RUN: %clang_cc1 -fsyntax-only -verify %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s \ +// RUN: -fatomic=no_fine_grained_memory:off,no_remote_memory:on,ignore_denormal_mode:on +

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-07 Thread Yaxun Liu via cfe-commits

@@ -569,19 +569,21 @@ void AMDGPUTargetCodeGenInfo::setTargetAtomicMetadata( AtomicInst.setMetadata(llvm::LLVMContext::MD_noalias_addrspace, ASRange); } - if (!RMW || !CGF.getTarget().allowAMDGPUUnsafeFPAtomics()) + if (!RMW) return; - // TODO: Introduce new, m

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-07 Thread Yaxun Liu via cfe-commits

@@ -0,0 +1,19 @@ +//===--- AtomicOptions.def - Atomic Options database -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-07 Thread Yaxun Liu via cfe-commits

@@ -1093,6 +1097,169 @@ inline void FPOptions::applyChanges(FPOptionsOverride FPO) { *this = FPO.applyOverrides(*this); } +/// Atomic control options +class AtomicOptionsOverride; +class AtomicOptions { +public: + using storage_type = uint16_t; + + static constexpr unsign

[clang] [CUDA] Add support for __grid_constant__ attribute (PR #114589)

2024-11-04 Thread Yaxun Liu via cfe-commits

@@ -1450,6 +1450,13 @@ def CUDAHost : InheritableAttr { } def : MutualExclusions<[CUDAGlobal, CUDAHost]>; +def CUDAGridConstant : InheritableAttr { + let Spellings = [GNU<"grid_constant">, Declspec<"__grid_constant__">]; + let Subjects = SubjectList<[ParmVar]>; + let LangOp

[clang] [clang][Driver][HIP] Add support for mixing AMDGCNSPIRV & concrete `offload-arch`s. (PR #113509)

2024-11-04 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/113509

[clang] [RFC] Add clang atomic control options and pragmas (PR #102569)

2024-11-04 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: As discussed in the RFC at LLVM discourse, close this PR and open another PR for implementing it as a compound statement attribute https://github.com/llvm/llvm-project/pull/114841 https://github.com/llvm/llvm-project/pull/102569

[clang] [RFC] Add clang atomic control options and pragmas (PR #102569)

2024-11-04 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/102569

[clang] [llvm] [Clang] Put offloading globals in the `.llvm.rodata.offloading` section (PR #111890)

2024-10-25 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/111890

[clang] [llvm] [AMDGPU] Add a type for the named barrier (PR #113614)

2024-10-25 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/113614

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-10-24 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: ping https://github.com/llvm/llvm-project/pull/101350

[clang] [HIP] Always add -fnative-half-arguments-and-returns cmdline option. (PR #113335)

2024-10-22 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/113335

[clang] [clang][HIP] Don't use the OpenCLKernel CC when targeting AMDGCNSPIRV (PR #110447)

2024-10-22 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/110447

[clang] clang/HIP: Remove REQUIRES libgcc from a test (PR #112412)

2024-10-16 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/112412

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov/update_dpp (PR #112447)

2024-10-16 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > This needs some sema type restrictions to make sure it's something sensible +1. It also needs a Sema lit test. https://github.com/llvm/llvm-project/pull/112447

[clang] [HIP] fix host min/max in header (PR #82956)

2024-10-11 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: yet another usage of mixed signed/unsigned min https://github.com/ROCm/hipBLASLt/issues/1227 https://github.com/llvm/llvm-project/pull/82956

[clang] [CUDA/HIP] fix propagate -cuid to a host-only compilation. (PR #111650)

2024-10-10 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > > This does not seem to be the right fix. I tend to think the test > > https://github.com/ROCm/hip-tests/tree/amd-staging/samples/2_Cookbook/16_assembly_to_executable > > needs a fix. Since it does not expect host-only compilation to use CUID, it > > should add `-fuse-cuid=non

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-10-09 Thread Yaxun Liu via cfe-commits

@@ -16,15 +18,15 @@ // RUN: %clang -### -x hip --target=x86_64-unknown-linux-gnu -DX=1 --no-offload-new-driver \ // RUN: --offload-arch=gfx906 -c -nogpuinc -nogpulib -fuse-cuid=hash \ -// RUN: %S/Inputs/hip_multiple_inputs/a.cu >%t.out 2>&1 +// RUN: Inputs/hip_multiple_

[clang] [CUDA/HIP] fix propagate -cuid to a host-only compilation. (PR #111650)

2024-10-09 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: This does not seem to be the right fix. I tend to think the test https://github.com/ROCm/hip-tests/tree/amd-staging/samples/2_Cookbook/16_assembly_to_executable needs a fix. Since it does not expect host-only compilation to use CUID, it should add `-fuse-cuid=none` to the host-o

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-02 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: One drawback of not defining `__AMDGCN_WAVEFRONT_SIZE__` in host compilation is the impairment of uniformity of source code across host and device sides. Users have to put `#if __HIP_DEVICE_COMPILE__` anywhere they use `__AMDGCN_WAVEFRONT_SIZE__`. Previous experience tells us t
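The guard pattern described in the comment above might look like the following minimal C++ sketch. The names `kWavefrontSizeOrZero` and `wavefrontSizeKnown` are hypothetical, and using 0 to mean "not known at compile time" outside device compilation is an assumption made for this illustration, not something the PR prescribes.

```cpp
// Hypothetical constant: the wavefront size when compiling device code,
// or 0 ("not known at compile time") in any other compilation.
#if defined(__HIP_DEVICE_COMPILE__) && defined(__AMDGCN_WAVEFRONT_SIZE__)
constexpr int kWavefrontSizeOrZero = __AMDGCN_WAVEFRONT_SIZE__;
#else
constexpr int kWavefrontSizeOrZero = 0;
#endif

// Callers check this before relying on the compile-time value.
constexpr bool wavefrontSizeKnown() { return kWavefrontSizeOrZero != 0; }
```

Every use of the macro has to be routed through a guard like this one, which is the loss of source uniformity the comment refers to.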

[clang] [Clang][HIP] Warn when __AMDGCN_WAVEFRONT_SIZE is used in host code without relying on target-dependent overload resolution (PR #109663)

2024-10-01 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: `__AMDGCN_WAVEFRONT_SIZE__` could also be used in other offloading languages e.g. OpenMP. The check for non-constantness is not specific to HIP. It is a constant when the triple is amdgcn and -target-cpu is specified. Otherwise it should not be treated as constant. I think you

[clang] [cuda][[HIP] `constant` should imply constant (PR #110182)

2024-09-27 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > It has nothing to do with writing to those arrays while the kernel is > running. That would indeed be UB. > > > both would still work just the same even with this change, > > No, they will not. Here's the demonstration of the behavior change that > `const` brings to the tabl

[clang] [llvm] [SPIRV][RFC] Rework / extend support for memory scopes (PR #106429)

2024-09-24 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > Thank you ever so much for the review @VyacheslavLevytskyy! I will create a > PR for the Translator as well, since there's some handling missing there; I > will refer to it here for future readers. Final check: are you OK with the > OpenCL changes @yxsamliu? LGTM https://gi

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-09-14 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/107734

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-11 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/107458

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-10 Thread Yaxun Liu via cfe-commits

@@ -905,10 +907,10 @@ llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() { GpuBinaryHandle = new llvm::GlobalVariable( TheModule, PtrTy, /*isConstant=*/false, Linkage, /*Initializer=*/ -CudaGpuBinary ? llvm::ConstantPointerNull::get(PtrTy) :

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-10 Thread Yaxun Liu via cfe-commits

@@ -175,7 +175,6 @@ __device__ void device_use() { // HIP-SAME: section ".hipFatBinSegment" // * variable to save GPU binary handle after initialization // CUDANORDC: @__[[PREFIX]]_gpubin_handle = internal global ptr null -// HIPNEF: @__[[PREFIX]]_gpubin_handle_{{[0-9a-f]+}} =

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-10 Thread Yaxun Liu via cfe-commits

@@ -30,8 +28,6 @@ // RUN: 2>&1 | FileCheck -check-prefix=LD-R %s // LD-R: Found undefined HIP fatbin symbol: __hip_fatbin_[[ID1:[0-9a-f]+]] // LD-R: Found undefined HIP fatbin symbol: __hip_fatbin_[[ID2:[0-9a-f]+]] -// LD-R: Found undefined HIP gpubin handle symbol: __hip_gpu

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-10 Thread Yaxun Liu via cfe-commits

@@ -905,10 +907,10 @@ llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() { GpuBinaryHandle = new llvm::GlobalVariable( TheModule, PtrTy, /*isConstant=*/false, Linkage, /*Initializer=*/ -CudaGpuBinary ? llvm::ConstantPointerNull::get(PtrTy) :

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-09-10 Thread Yaxun Liu via cfe-commits

@@ -0,0 +1,703 @@ +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify=expected,onhost %s +// RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify=expected,ondevice %s + + +// Tests to ensure that functions with host and device

[clang] [NFC][AMDGPU][Driver] Move 'shouldSkipSanitizeOption' utility to AMDGPU. (PR #107997)

2024-09-10 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/107997

[clang] [CUDA/HIP] propagate -cuid to a host-only compilation. (PR #107483)

2024-09-07 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/107483

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-09-07 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/107734 to avoid being nondeterministic due to random path in distributed build. >From 725953ccbdb1f57eaac234cf5729f64a9fdbce13 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sat, 7 Sep 2024 20:32:48 -0400 Sub

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-06 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: you need to update lit test clang/test/CodeGenCUDA/device-stub.cu as it fails now [clang/test/CodeGenCUDA/device-stub.cu](https://buildkite.com/llvm-project/github-pull-requests/builds/98392#0191c73e-12f8-4cf3-9618-d3fd752f9149) https://github.com/llvm/llvm-project/pull/107458

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-05 Thread Yaxun Liu via cfe-commits

@@ -840,8 +840,10 @@ llvm::Function *CGNVCUDARuntime::makeModuleCtorFunction() { FatBinStr = new llvm::GlobalVariable( CGM.getModule(), CGM.Int8Ty, /*isConstant=*/true, llvm::GlobalValue::ExternalLinkage, nullptr, - "__hip_fatbin_" + CGM.getCo

[clang] [HIP][Clang][CodeGen] Handle hip bin symbols properly. (PR #107458)

2024-09-05 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: The current behavior of clang is expected. When the GPU binary is not specified, it is expected to be used for -fgpu-rdc, and the __hip_gpubin_handle_ symbols need to be external and unique since they may need to be merged for partial linking. Making them internal will break partial

[clang] [llvm] [Offload] Move HIP and CUDA to new driver by default (PR #84420)

2024-08-29 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > @yxsamliu Do you know what the next steps for merging this would be? I'd like > to get it into the Clang 20 release if possible. The only thing this loses > currently is managed variables being registered in RDC mode, but I'm going to > assume that's hardly seen in practice s

[clang] [clang] Fixing Clang HIP inconsistent order for template functions (PR #101627)

2024-08-28 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: Is it already fixed by https://github.com/llvm/llvm-project/pull/102661 ? https://github.com/llvm/llvm-project/pull/101627

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-08-23 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: ping https://github.com/llvm/llvm-project/pull/101350

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-20 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/102776

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-18 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/104638

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-18 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/104638 >From 6e6bb355f2cf79f30d01c97b580d4354cbb7e727 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 16 Aug 2024 14:24:08 -0400 Subject: [PATCH] [HIP] search fatbin symbols for libs passed by -l For -fgp

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-18 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/104638 >From 3c281d8cfc99674f2a4de0dfe5e1f02e35e68d6d Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 16 Aug 2024 14:24:08 -0400 Subject: [PATCH] [HIP] search fatbin symbols for libs passed by -l For -fgp

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-18 Thread Yaxun Liu via cfe-commits

@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols { return GPUBinHandleSymbols; } + // Collect symbols from static libraries specified by -l options. + void processStaticLibraries() { +llvm::SmallVector LibNames; +llvm::SmallVector LibPaths; +llvm::SmallVe

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-18 Thread Yaxun Liu via cfe-commits

@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols { return GPUBinHandleSymbols; } + // Collect symbols from static libraries specified by -l options. + void processStaticLibraries() { +llvm::SmallVector LibNames; +llvm::SmallVector LibPaths; +llvm::SmallVe

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/104638 For -fgpu-rdc linking, clang needs to collect undefined fatbin symbols and resolve them to the embedded fatbin. This has been done for object files and archive files passed as input files to clang. However,
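As a rough illustration of what resolving `-l` options to static libraries involves, here is a minimal C++ sketch. The function `staticLibCandidates` is hypothetical and is not the code from the PR; it assumes the usual Unix linker convention that `-lfoo` names a file `libfoo.a` searched for in each `-L` directory, in order.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: for each library name given by -l, list the archive
// paths to probe, one per -L search directory, in command-line order.
// Assumes the Unix naming convention lib<name>.a.
std::vector<std::string>
staticLibCandidates(const std::vector<std::string> &libPaths,
                    const std::vector<std::string> &libNames) {
  std::vector<std::string> candidates;
  for (const std::string &name : libNames)
    for (const std::string &dir : libPaths)
      candidates.push_back(dir + "/lib" + name + ".a");
  return candidates;
}
```

A caller would take the first candidate that exists on disk for each library and then scan that archive for undefined fatbin symbols, in the same way as for archives passed directly as inputs.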

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Yaxun Liu via cfe-commits

@@ -7163,7 +7165,8 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { +} else if (!D->hasAttr() && --

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Yaxun Liu via cfe-commits

@@ -7147,7 +7147,9 @@ void Sema::ProcessDeclAttributeList( // good to have a way to specify "these attributes must appear as a group", // for these. Additionally, it would be good to have a way to specify "these // attribute must never appear as a group" for attributes li

[clang] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (PR #96872)

2024-08-15 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/96872

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-12 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/102661

[clang] [RFC] Add clang atomic control options and pragmas (PR #102569)

2024-08-09 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/102569

[clang] [RFC] Add clang atomic control options and pragmas (PR #102569)

2024-08-09 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > Thank you for the patch, but RFCs for Clang should be published in > https://discourse.llvm.org/c/clang/6. PRs don't have the visibility we want > RFCs to have. Discourse topic created: https://discourse.llvm.org/t/rfc-add-clang-atomic-control-options-and-pragmas/80641. T

[clang] [RFC] Add clang atomic control options and pragmas (PR #102569)

2024-08-08 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/102569

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-08-07 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: I feel choosing toolchain based on input files does not solve all the use cases. You may want to handle the object files, bitcodes, or assembly files differently by using different toolchains, e.g. you may want to choose rocm toolchain or amdgpu toolchain or HIPAMD toolchain to

[clang] [clang][NFC] Make OffloadLTOMode getter a separate method (PR #101200)

2024-08-06 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > @yxsamliu Would you mind reviewing this change? Sorry for the delay. LGTM https://github.com/llvm/llvm-project/pull/101200

[clang] [HIP] Fix __clang_hip_cmath.hip for ambiguity (PR #101341)

2024-08-02 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/101341

[clang] [libc] [Clang] Suppress missing architecture error when doing LTO (PR #100652)

2024-07-31 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/100652

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-07-31 Thread Yaxun Liu via cfe-commits

@@ -31,16 +43,108 @@
 typedef hipError_t (*hipGetDeviceCount_t)(int *);
 typedef hipError_t (*hipDeviceGet_t)(int *, int);
 typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int);
-int printGPUsByHIP() {
+extern cl::opt<bool> Verbose;
+
 #ifdef _WIN32
-  constexpr const

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-07-31 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/101350

From e7c39dbcb05d8fa9232a68c90b0ec4fc4d2a126b Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Wed, 31 Jul 2024 09:23:05 -0400
Subject: [PATCH] Fix amdgpu-arch for dll name on Windows

Recently HIP runti

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-07-31 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/101350 Recently the HIP runtime changed its DLL name to amdhip64_n.dll on Windows, where n is the ROCm major version number. Fix amdgpu-arch to search for amdhip64_n.dll on Windows. From 8819e99b64f3293a758f8a81258a25c91fab6ef
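The versioned naming scheme described above can be sketched as a list of candidate file names to probe. This is only an illustration and not the code from the PR: the helper name `hipDllCandidates`, the version range passed in, and the fallback to an unversioned `amdhip64.dll` are all assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): build the DLL names to try,
// newest ROCm major version first, following the amdhip64_n.dll scheme.
std::vector<std::string> hipDllCandidates(int maxMajor, int minMajor) {
  std::vector<std::string> names;
  for (int n = maxMajor; n >= minMajor; --n)
    names.push_back("amdhip64_" + std::to_string(n) + ".dll");
  // Assumption: older runtimes shipped an unversioned DLL, so try it last.
  names.push_back("amdhip64.dll");
  return names;
}
```

A caller would walk the returned list and stop at the first name that loads successfully. Probing newest-first is a design choice for the sketch, not something the thread specifies.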

[clang] [HIP] Fix __clang_hip_cmath.hip for ambiguity (PR #101341)

2024-07-31 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/101341 If there is a type T that can be converted to both float and double etc. but is not itself specialized for __numeric_type, and it is passed to math functions, e.g. fma, it will cause ambiguity with test functio
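The kind of ambiguity described above can be reproduced in a simplified form with plain overloads. The sketch below is not the `__clang_hip_cmath` header code: `my_fma`, `Both`, and `can_call_fma` are hypothetical names, and the overload set stands in for the header's templated machinery. It assumes C++17 for `std::void_t`.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for overloaded math functions (declarations only,
// since they are used only in unevaluated contexts).
float my_fma(float, float, float);
double my_fma(double, double, double);

// A type convertible to both float and double.
struct Both {
  operator float() const { return 1.0f; }
  operator double() const { return 1.0; }
};

// Detects whether my_fma(T, T, T) resolves to exactly one overload.
template <class T, class = void>
struct can_call_fma : std::false_type {};

template <class T>
struct can_call_fma<
    T, std::void_t<decltype(my_fma(std::declval<T>(), std::declval<T>(),
                                   std::declval<T>()))>> : std::true_type {};

// float matches the float overload exactly.
static_assert(can_call_fma<float>::value, "float call should resolve");
// Both converts equally well to float and to double, so the call is ambiguous.
static_assert(!can_call_fma<Both>::value, "Both call should be ambiguous");
```

Neither conversion from `Both` is better than the other, so overload resolution cannot pick a winner, which is the general shape of the problem the PR description refers to.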

[clang] [HIP] fix host min/max in header (PR #82956)

2024-07-18 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: Found another library using mixed min: https://github.com/ROCm/Tensile/issues/1977 https://github.com/llvm/llvm-project/pull/82956

[clang] [CUDA][HIP] Fix template static member (PR #98580)

2024-07-12 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/98580

[clang] [CUDA][HIP] Fix template static member (PR #98580)

2024-07-11 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: Sorry for the trouble. It is the same change, rebased onto the main branch. https://github.com/llvm/llvm-project/pull/98580

[clang] [CUDA][HIP] Fix template static member (PR #98580)

2024-07-11 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/98580 Host/device attributes should be checked before emitting a static member of a template instantiation. Fixes: https://github.com/llvm/llvm-project/issues/98151 From ba7ab88308c5af2e1c5e6c841524a932c42afeb2 Mon Sep 17 0

[clang] [CUDA][HIP][NFC] add CodeGenModule::shouldEmitCUDAGlobalVar (PR #98543)

2024-07-11 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/98543

[clang] [compiler-rt] [nsan] Add shared runtime (PR #98415)

2024-07-10 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: > It seems to cause a build failure: > > https://lab.llvm.org/buildbot/#/builders/123/builds/1580 It seems the issue was due to an old system linker. If I use lld to build compiler-rt, the build passes. I will fix the buildbot https://github.com/llvm/llvm-zorg/pull/225 https://git

[clang] [compiler-rt] [nsan] Add shared runtime (PR #98415)

2024-07-10 Thread Yaxun Liu via cfe-commits

yxsamliu wrote: It seems to cause a build failure: https://lab.llvm.org/buildbot/#/builders/123/builds/1580 https://github.com/llvm/llvm-project/pull/98415

[clang] [Clang] Add `__CLANG_GPU_DISABLE_MATH_WRAPPERS` macro for offloading math (PR #98234)

2024-07-10 Thread Yaxun Liu via cfe-commits

@@ -345,4 +349,5 @@
 __DEVICE__ float ynf(int __a, float __b) { return __nv_ynf(__a, __b); }
 
 #pragma pop_macro("__DEVICE_VOID__")
 #pragma pop_macro("__FAST_OR_SLOW")
+#endif // __CLANG_GPU_DISABLE_MATH_WRAPPERS

yxsamliu wrote: some non-libm functions e.g. `__

[clang] [Clang] Make the GPU toolchains implicitly link `-lm` and `-lc` (PR #98170)

2024-07-09 Thread Yaxun Liu via cfe-commits

https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/98170
