[clang] afc9d67 - [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-10T14:32:27-04:00 New Revision: afc9d674fe5a14b95c50a38d8605a159c2460427 URL: https://github.com/llvm/llvm-project/commit/afc9d674fe5a14b95c50a38d8605a159c2460427 DIFF: https://github.com/llvm/llvm-project/commit/afc9d674fe5a14b95c50a38d8605a159c2460427.dif

[clang] 180a853 - Fix indentation in ReleaseNotes.rst

2022-05-10 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-10T14:56:28-04:00 New Revision: 180a8536cec8e5e13e86863b17982daf95f2038a URL: https://github.com/llvm/llvm-project/commit/180a8536cec8e5e13e86863b17982daf95f2038a DIFF: https://github.com/llvm/llvm-project/commit/180a8536cec8e5e13e86863b17982daf95f2038a.dif

[clang] 84db355 - [clang] Fix KEYALL

2022-05-11 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-11T14:28:08-04:00 New Revision: 84db35594953a6f7aff7cbc007f1c5d2fd35b1a9 URL: https://github.com/llvm/llvm-project/commit/84db35594953a6f7aff7cbc007f1c5d2fd35b1a9 DIFF: https://github.com/llvm/llvm-project/commit/84db35594953a6f7aff7cbc007f1c5d2fd35b1a9.dif

[clang] 0f29214 - [clang]Silence warning in MicrosoftCXXABI.cpp

2022-05-12 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-12T12:04:05-04:00 New Revision: 0f292141aadb0489231de31de966c239486e019d URL: https://github.com/llvm/llvm-project/commit/0f292141aadb0489231de31de966c239486e019d DIFF: https://github.com/llvm/llvm-project/commit/0f292141aadb0489231de31de966c239486e019d.dif

[clang] cefe472 - [clang] Fix __has_builtin

2022-05-19 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-19T11:34:42-04:00 New Revision: cefe472c51fbcd1aed4d4a090709f25a12a8bc2c URL: https://github.com/llvm/llvm-project/commit/cefe472c51fbcd1aed4d4a090709f25a12a8bc2c DIFF: https://github.com/llvm/llvm-project/commit/cefe472c51fbcd1aed4d4a090709f25a12a8bc2c.dif

[clang] 559b8fc - [AMDGPU] emit macro __GFX9__ etc

2022-05-19 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2022-05-19T12:06:56-04:00 New Revision: 559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00 URL: https://github.com/llvm/llvm-project/commit/559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00 DIFF: https://github.com/llvm/llvm-project/commit/559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00.dif

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-03-15 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > @arsenm I agree that the default should be assuming fine-grained is > > possible. My thinking behind the original naming and direction was not > > wanting to introduce an unexpected performance regression by default. I'm > > happy for both to be changed, and this patch bein

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-03-15 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > Will the metadata for unsafe-fp-atomics also be controlled by the pragma > > that controls the no-fine-grained and no-remote metadata? e.g. something > > like > > ``` > > #pragma clang atomics begin no-fine-grained(on) no-remote(on) unsafe-fp(on) > > ``` > > Yes, I would ex

[clang] [llvm] [CodeGen][LLVM] Make the `va_list` related intrinsics generic. (PR #85460)

2024-03-18 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: Is LLVM IR using the old vaarg intrinsics still consumable? Should we update AutoUpgrade.cpp to auto upgrade the old vaarg intrinsics and add a test for them? https://github.com/llvm/llvm-project/pull/85460 ___ cfe-commits mailing lis

[clang] [llvm] [CodeGen][LLVM] Make the `va_list` related intrinsics generic. (PR #85460)

2024-03-18 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > Doesn’t AutoUpgrade automatically infer overloads? You can see a bunch of > tests in this patch where the output references the overloaded intrinsics but > the input is unchanged. OK it seems to be handled already and test covered. Thanks. https://github.com/llvm/llvm-projec

[clang] [HIP] do not link runtime for -r (PR #85675)

2024-03-18 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/85675 since it will cause duplicate symbols when the partially linked object is linked again. Change-Id: I2aea39ad0d57d3dc80b6aff395d9506ab9ebbf4d >From 6d6b362fbf965706108362a55c894588e80ad778 Mon Sep 17 00:00:00 2

[clang] [HIP] do not link runtime for -r (PR #85675)

2024-03-18 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/85675 >From 2e0967c0c606ad647185a739442647ab7d90ed52 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Mon, 18 Mar 2024 14:09:56 -0400 Subject: [PATCH] [HIP] do not link runtime for -r since it will cause duplic

[clang] [HIP] do not link runtime for -r (PR #85675)

2024-03-19 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/85675 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [HIP] do not link runtime for -r (PR #85675)

2024-03-19 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/85675

[clang] [HIP] Correctly omit bundling with the new driver (PR #85842)

2024-03-19 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/85842

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-20 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/85976 Refactor managed variable handling in codegen so that the transformation is done separately from registration. This will allow the new driver to register the managed var in the linker wrapper. >From 11d10a8ac

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-22 Thread Yaxun Liu via cfe-commits
@@ -1160,9 +1152,8 @@ void CGNVCUDARuntime::createOffloadingEntries() { // Returns module constructor to be added. llvm::Function *CGNVCUDARuntime::finalizeModule() { + transformManagedVars(); yxsamliu wrote: we did the equivalent transformation previously d

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-22 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/85976

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-22 Thread Yaxun Liu via cfe-commits
@@ -1160,9 +1152,8 @@ void CGNVCUDARuntime::createOffloadingEntries() { // Returns module constructor to be added. llvm::Function *CGNVCUDARuntime::finalizeModule() { + transformManagedVars(); yxsamliu wrote: we do not have a test for managed vars in llvm-test-

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-22 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/85976 >From 1d14bcff6363b34ae48eac2bf68221b16dd1c855 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Wed, 20 Mar 2024 13:34:29 -0400 Subject: [PATCH] [HIP][NFC] Refactor managed var codegen Refactor managed va

[clang] [HIP][NFC] Refactor managed var codegen (PR #85976)

2024-03-22 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/85976

[clang] [HIP] document difference with CUDA (PR #86838)

2024-03-27 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/86838 None >From 3e00450177338a14c5eb0c39e3d49e7b2202056e Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Wed, 27 Mar 2024 13:27:19 -0400 Subject: [PATCH] [HIP] document difference with CUDA --- clang/docs/H

[clang] [llvm] [Offload] Change unregister library to use `atexit` instead of destructor (PR #86830)

2024-03-27 Thread Yaxun Liu via cfe-commits
@@ -186,57 +186,60 @@ GlobalVariable *createBinDesc(Module &M, ArrayRef> Bufs, ".omp_offloading.descriptor" + Suffix); } -void createRegisterFunction(Module &M, GlobalVariable *BinDesc, -StringRef Suffix) { +Function *cr

[clang] [Clang][NFC] Clean up unused binary files for offloading tests (PR #87351)

2024-04-02 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: LGTM for HIP https://github.com/llvm/llvm-project/pull/87351

[clang] [llvm] [LinkerWrapper] Support relocatable linking for offloading (PR #80066)

2024-02-07 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/80066

[clang] [HIP] Allow partial linking for `-fgpu-rdc` (PR #81700)

2024-02-13 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/81700 `-fgpu-rdc` mode allows device functions to call device functions in a different TU. However, currently all device objects have to be linked together since only one fat binary is supported. This is time-consuming fo

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-05 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83870 >From 9c6991bbcdce6f24c8f99c8f2a6ff0e5b6c2ac5a Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Mon, 4 Mar 2024 11:38:06 -0500 Subject: [PATCH] [HIP] fix host-used external kernel In -fgpu-rdc mode, when

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-05 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83870 >From 902f09d9124b387ad02bd758e9c54bf44746b0fd Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Mon, 4 Mar 2024 11:38:06 -0500 Subject: [PATCH] [HIP] fix host-used external kernel In -fgpu-rdc mode, when

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-05 Thread Yaxun Liu via cfe-commits
@@ -24,6 +24,7 @@ // NEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel2v // NEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel3v +// XEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel5v yxsamliu wrote: fixed https://github.com/llvm/llvm-proje

[clang] [ClangOffloadBundler] fix unbundling archive (PR #84195)

2024-03-07 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/84195

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: Here is the size distribution of individual code object files (each code object file is for one GPU arch, and a fat binary contains a bunch of code object files, so the optimal compression parameter is mostly related to code object file size). | Bin Size | Count |

[clang] [HIP] Do not include the CUID module hash with the new driver (PR #84332)

2024-03-07 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: CUID is needed for device static variables to be accessible on the host side. Since the driver does not know whether device static variables are accessed on the host side, it should always enable CUID for HIP. https://github.com/llvm/llvm-project/pull/84332

[clang] [HIP] Do not include the CUID module hash with the new driver (PR #84332)

2024-03-07 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/84332

[clang] [LinkerWrapper] Accept compression arguments for HIP fatbins (PR #84337)

2024-03-07 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/84337

[clang] [HIP] Do not include the CUID module hash with the new driver (PR #84332)

2024-03-07 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > CUID is needed for device static variables to be accessible on the host side. > > Since the driver does not know whether device static variables are accessed > > on the host side, it should always enable CUID for HIP. > > Oh! I think I remember what I did. I made the CUID hash gener

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > It may be worth asking on https://github.com/facebook/zstd/ . I am sure zstd > maintainers are happy to see more adoption:) Posted a question to zstd https://github.com/facebook/zstd/issues/3932 https://github.com/llvm/llvm-project/pull/83605 _

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-07 Thread Yaxun Liu via cfe-commits
@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer &FirstInput, } OffloadBundlerConfig::OffloadBundlerConfig() { + if (llvm::compression::zstd::isAvailable()) { +CompressionFormat = llvm::compression::Format::Zstd; +// Use a high zstd compress level by default for be

[clang] [HIP] Make the HIP default architecture use the enum value (PR #84400)

2024-03-07 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/84400

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/83870

[clang] [llvm] [HIP] Support compressing bundle by LZMA (PR #83306)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/83306

[clang] [llvm] [HIP] Support compressing bundle by LZMA (PR #83306)

2024-03-08 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: Since zstd has a comparable compression rate and is much faster, we will use zstd. Closing this PR. https://github.com/llvm/llvm-project/pull/83306

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From 16796bc8eb3b32436903db4b689d4cb9cfc348d8 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: zstd developers suggest enabling long distance matching (LDM), i.e. the `--long` option. I updated the PR with the change, and tested that it works well for bundle entry sizes ranging from 1KB to 20MB, for both compression rate and compression/decompression speed. https://githu

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/83605

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > Should an option like in #84337 be added for the new driver? Yes please https://github.com/llvm/llvm-project/pull/83605

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > Should an option like in #84337 be added for the new driver? > > Yes please Oh. I can add it https://github.com/llvm/llvm-project/pull/83605

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From 60faf7f657fdcc00edfa0a1813d1e2746c341ef1 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > Should an option like in #84337 be added for the new driver? added the option to linker wrapper https://github.com/llvm/llvm-project/pull/83605

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From 78ad578a19d2a3585f20ab64d364a46a584ec035 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
@@ -2863,3 +2863,18 @@ void tools::addOutlineAtomicsArgs(const Driver &D, const ToolChain &TC, CmdArgs.push_back("+outline-atomics"); } } + +void tools::addOffloadCompressArgs(const llvm::opt::ArgList &TCArgs, + llvm::opt::ArgStringList

[clang] [HIP] Make the new driver bundle outputs for device-only (PR #84534)

2024-03-08 Thread Yaxun Liu via cfe-commits
@@ -4638,7 +4638,10 @@ Action *Driver::BuildOffloadingActions(Compilation &C, } } - if (offloadDeviceOnly()) + // All kinds exit now in device-only mode except for non-RDC mode HIP. yxsamliu wrote: I am wondering whether we should restrict this change

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From 906b23c5f8ef815b7727fe2bda852c33f0d9147b Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-08 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From c46a3ce625a34a497cd0b14631cb755b903e93d6 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-09 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/83605

[clang] e733d7e - Fix test clang-offload-bundler-zstd.c

2024-03-09 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2024-03-09T10:07:57-05:00 New Revision: e733d7e23f6553c55c85edd55511b133d2064677 URL: https://github.com/llvm/llvm-project/commit/e733d7e23f6553c55c85edd55511b133d2064677 DIFF: https://github.com/llvm/llvm-project/commit/e733d7e23f6553c55c85edd55511b133d2064677.dif

[clang] [HIP] Make the new driver bundle outputs for device-only (PR #84534)

2024-03-11 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. https://github.com/llvm/llvm-project/pull/84534

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-25 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/82956 CUDA defines min/max functions for host in the global namespace. The HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong results for arguments that are ou
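
A minimal sketch of why a lone `min(int, int)` overload can give wrong results, assuming plain C++ with no HIP headers involved. The namespaces `int_only` and `overloaded` and the helper `check_min_overloads` are hypothetical stand-ins for illustration; they are not the declarations from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a header that only declares min(int, int).
namespace int_only {
inline int min(int a, int b) { return a < b ? a : b; }
}  // namespace int_only

// Hypothetical stand-in for a header with one overload per argument type.
namespace overloaded {
inline int min(int a, int b) { return a < b ? a : b; }
inline double min(double a, double b) { return a < b ? a : b; }
}  // namespace overloaded

inline void check_min_overloads() {
  // Both doubles are implicitly converted to int (2 and 2) before the call,
  // so the fractional part is lost without any error.
  assert(int_only::min(2.5, 2.25) == 2);
  // With a double overload available, the exact match is selected instead.
  assert(overloaded::min(2.5, 2.25) == 2.25);
}
```

The conversion is silent at the call site, which is why a missing overload shows up as a wrong value rather than as a compile error.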

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-25 Thread Yaxun Liu via cfe-commits
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } +// Define host min/max functions. + #if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -_

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } +// Define host min/max functions. + #if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -_

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/82956 >From aa50cadf0baf84ea38379fd3276f306a27164007 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sun, 25 Feb 2024 11:13:40 -0500 Subject: [PATCH] [HIP] fix host min/max in header CUDA defines min/max funct

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/82956 >From bd87c56b2d96b834788e8fa449f3ac308faec1f0 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sun, 25 Feb 2024 11:13:40 -0500 Subject: [PATCH] [HIP] fix host min/max in header CUDA defines min/max funct

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } +// Define host min/max functions. + #if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -_

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/82956 >From c8331bffa27011b953747d3ad4f7b423cf73b4a4 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sun, 25 Feb 2024 11:13:40 -0500 Subject: [PATCH] [HIP] fix host min/max in header CUDA defines min/max funct

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
@@ -1306,15 +1306,68 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } -#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -__host__ inline static int min(int __a

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/82956 >From f7471303abf989ceb1bdbce0d580d74097572dec Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sun, 25 Feb 2024 11:13:40 -0500 Subject: [PATCH] [HIP] fix host min/max in header CUDA defines min/max funct

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-26 Thread Yaxun Liu via cfe-commits
@@ -1306,15 +1306,68 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } -#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -__host__ inline static int min(int __a

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-27 Thread Yaxun Liu via cfe-commits
@@ -1306,15 +1306,73 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } -#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -__host__ inline static int min(int __a

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-27 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/82956 >From ebad3ba006445f290d17c338cc1b39293c18cdad Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Sun, 25 Feb 2024 11:13:40 -0500 Subject: [PATCH] [HIP] fix host min/max in header CUDA defines min/max funct

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-27 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/82956

[clang] bcbce80 - Revert "[HIP] fix host min/max in header (#82956)"

2024-02-27 Thread Yaxun Liu via cfe-commits
Author: Yaxun (Sam) Liu Date: 2024-02-27T20:19:07-05:00 New Revision: bcbce807d76a30388b366d14051c5f80e9724dab URL: https://github.com/llvm/llvm-project/commit/bcbce807d76a30388b366d14051c5f80e9724dab DIFF: https://github.com/llvm/llvm-project/commit/bcbce807d76a30388b366d14051c5f80e9724dab.dif

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-27 Thread Yaxun Liu via cfe-commits
@@ -1306,15 +1306,73 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); } __DEVICE__ double min(double __x, double __y) { return __builtin_fmin(__x, __y); } -#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) -__host__ inline static int min(int __a

[clang] [llvm] [HIP] Support compressing bundle by LZMA (PR #83297)

2024-02-28 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/83297 LZMA (Lempel-Ziv/Markov-chain Algorithm) provides a better compression rate than zstd and zlib for clang-offload-bundler bundles, which often contain a large number of similar entries. This patch adds liblzma as

[clang] [llvm] [HIP] Support compressing bundle by LZMA (PR #83306)

2024-02-28 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/83306 LZMA (Lempel-Ziv/Markov-chain Algorithm) provides a better compression rate than zstd and zlib for clang-offload-bundler bundles, which often contain a large number of similar entries. This patch lets clang-offload-

[clang] [llvm] [HIP] Support compressing bundle by LZMA (PR #83306)

2024-02-28 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: depends on https://github.com/llvm/llvm-project/pull/83297 https://github.com/llvm/llvm-project/pull/83306

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-29 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > Probably I need to define those functions with mixed args by default to > > avoid regressions. > > Are there any other regressions? Can hipCUB be fixed instead? While their use > case is probably benign, I'd rather fix the user code, than propagate CUDA > bugs into HIP. S

[clang] [HIP] fix host min/max in header (PR #82956)

2024-02-29 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: > > > Probably I need to define those functions with mixed args by default to > > > avoid regressions. > > > > > > Are there any other regressions? Can hipCUB be fixed instead? While their > > use case is probably benign, I'd rather fix the user code, than propagate > > CUDA

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: Did you try this patch with internal PSDB? This will likely break all HIP programs. This is because HIP programs are single-source, and users usually expect the common device-side predefined macros to be visible in both host and device compilations. e.g. they could write a kernel using

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Yaxun Liu via cfe-commits
@@ -6,32 +6,32 @@ // R600-based processors. // -// RUN: %clang -E -dM -target r600 -mcpu=r600 %s 2>&1 | FileCheck --check-prefixes=ARCH-R600,R600 %s -DCPU=r600 -// RUN: %clang -E -dM -target r600 -mcpu=rv630 %s 2>&1 | FileCheck --check-prefixes=ARCH-R600,R600 %s -DCPU=r600 -

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Yaxun Liu via cfe-commits
@@ -4306,10 +4306,10 @@ // Begin amdgcn tests -// RUN: %clang -mcpu=gfx803 -E -dM %s -o - 2>&1 \ +// RUN: %clang -mcpu=gfx803 -E -dM -Xclang -fcuda-is-device %s -o - 2>&1 \ yxsamliu wrote: C code compiled with target amdgcn should not depend

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: for example, rocprim assumes warpSize is constant https://github.com/ROCm/rocPRIM/blob/6325547d514b46d1ab51aff0195851b3fcc626d1/rocprim/include/rocprim/intrinsics/thread.hpp#L54 since device_warp_size() is used as a non-type template argument and this code is not conditioned f

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-01 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: in HIP headers, warpSize is defined with __AMDGCN_WAVEFRONT_SIZE and there are a bunch of uses of __AMDGCN_WAVEFRONT_SIZE or warpSize as constants: https://github.com/search?q=repo%3AROCm%2Fclr%20__AMDGCN_WAVEFRONT_SIZE&type=code These can be fixed relatively easily by conditio
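
A rough sketch of the dependency described in this thread, assuming plain C++ rather than real HIP headers. `DEMO_WAVEFRONT_SIZE` is a hypothetical stand-in for a device-side predefined macro such as `__AMDGCN_WAVEFRONT_SIZE`, and the value 64 is only an example. `WarpScratch` and `lanes_needed` are likewise invented for illustration, and the runtime variant is one general alternative, not necessarily the fix proposed in the discussion.

```cpp
#include <cassert>

// Hypothetical stand-in for a device-side predefined macro; 64 is an
// example value, not a statement about any particular GPU.
#ifndef DEMO_WAVEFRONT_SIZE
#define DEMO_WAVEFRONT_SIZE 64
#endif

// Compile-time use: the size is a non-type template argument, so the macro
// must be defined in every compilation (host or device) that sees this code.
template <unsigned WarpSize>
struct WarpScratch {
  int lanes[WarpSize];
  static constexpr unsigned size() { return WarpSize; }
};

using DefaultScratch = WarpScratch<DEMO_WAVEFRONT_SIZE>;

// Runtime alternative: the size is an ordinary argument, so shared code no
// longer needs the macro. Rounds `items` up to a multiple of `warp_size`.
inline unsigned lanes_needed(unsigned items, unsigned warp_size) {
  return (items + warp_size - 1) / warp_size * warp_size;
}
```

If the macro were undefined in the host pass, the `DefaultScratch` alias would fail to compile there, while `lanes_needed` would be unaffected.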

[clang] [llvm] [HIP] change compress level (PR #83605)

2024-03-01 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/83605 Change the zstd compression level to 20 for a better compression rate. >From 4fac5b1defe9ce1174da4a2c75f84087f26c63ab Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PA

[clang] [llvm] [HIP] change compress level (PR #83605)

2024-03-01 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 >From 6b5687e16c826053d690b08b6fe714e055905479 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] change compress level Change compression level to 20 f

[clang] [llvm] [HIP] change compress level (PR #83605)

2024-03-01 Thread Yaxun Liu via cfe-commits
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const llvm::MemoryBuffer &Input, Input.getBuffer().size()); llvm::compression::Format CompressionFormat; + int Level; - if (llvm::compression::zstd::isAvailable()) + if (llvm::compression::zstd::isAvailable(

[clang] [clang][AMDGPU] Don't define feature macros on host code (PR #83558)

2024-03-02 Thread Yaxun Liu via cfe-commits
yxsamliu wrote: When clang does host compilation, it essentially makes an assumption that the generated IR for host does not depend on the assumed GPU arch, or, the generated IR may be affected by assumed GPU arch, but it won't affect the program output. This is true in most cases. For example

[clang] [llvm] [HIP] change compress level (PR #83605)

2024-03-02 Thread Yaxun Liu via cfe-commits
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const llvm::MemoryBuffer &Input, Input.getBuffer().size()); llvm::compression::Format CompressionFormat; + int Level; - if (llvm::compression::zstd::isAvailable()) + if (llvm::compression::zstd::isAvailable(

[clang] [llvm] [HIP] change compress level (PR #83605)

2024-03-03 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83605 From f846e24d2ac287f6f9466615536c4f53f6d0e0ed Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Fri, 1 Mar 2024 13:16:45 -0500 Subject: [PATCH] [HIP] add --offload-compression-level= option Added --offloa
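A minimal sketch of how a format-dependent default level and a user override could combine. It does not use the LLVM compression API; `SketchFormat`, `SketchChoice`, and `sketchPickCompression` are hypothetical names. The zstd default of 20 comes from the earlier revision of this PR; the zlib fallback and its level of 6 are assumptions made only for illustration.

```cpp
#include <optional>

// Hypothetical stand-ins; the real patch works with llvm::compression::Format.
enum class SketchFormat { Zlib, Zstd };

struct SketchChoice {
  SketchFormat format;
  int level;
};

constexpr int kSketchZstdDefaultLevel = 20; // level proposed in the earlier revision
constexpr int kSketchZlibDefaultLevel = 6;  // assumption, not taken from the patch

// Prefer zstd when available; an explicit user level overrides the default.
inline SketchChoice sketchPickCompression(bool zstdAvailable,
                                          std::optional<int> userLevel) {
  const SketchFormat format =
      zstdAvailable ? SketchFormat::Zstd : SketchFormat::Zlib;
  const int defaultLevel =
      zstdAvailable ? kSketchZstdDefaultLevel : kSketchZlibDefaultLevel;
  return {format, userLevel.value_or(defaultLevel)};
}
```

Keeping the override as an `std::optional<int>` separates "no level given" from any particular numeric value, so the default can differ per format.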

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-03 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/83605 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [HIP] add --offload-compression-level= option (PR #83605)

2024-03-03 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/83605

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-04 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/83870 In -fgpu-rdc mode, when an external kernel is used by a host function with weak_odr linkage (e.g. an explicitly instantiated template function), the kernel should not be marked as a host-used external kernel, since

[clang] [HIP] fix host-used external kernel (PR #83870)

2024-03-04 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu updated https://github.com/llvm/llvm-project/pull/83870 From dc94bb78adb323a539d195b791e50cf69c774246 Mon Sep 17 00:00:00 2001 From: "Yaxun (Sam) Liu" Date: Mon, 4 Mar 2024 11:38:06 -0500 Subject: [PATCH] [HIP] fix host-used external kernel In -fgpu-rdc mode, when

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #80035)

2024-01-30 Thread Yaxun Liu via cfe-commits
@@ -175,6 +175,8 @@ Predefined Macros - Defined when the GPU default stream is set to per-thread mode. * - ``HIP_API_PER_THREAD_DEFAULT_STREAM`` - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated. + * - ``__AMDGCN_WAVEFRONT_SIZE__``

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #80035)

2024-01-30 Thread Yaxun Liu via cfe-commits
@@ -4294,13 +4294,20 @@ // Begin amdgcn tests -// RUN: %clang -march=amdgcn -E -dM %s -o - 2>&1 \ +// RUN: %clang -mcpu=gfx803 -E -dM %s -o - 2>&1 \ +// RUN: -target amdgcn-unknown-unknown \ +// RUN: | FileCheck -match-full-lines %s -check-prefixes=CHE

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #80035)

2024-01-30 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu approved this pull request. LGTM. Thanks https://github.com/llvm/llvm-project/pull/80035

[llvm] [clang] [LinkerWrapper] Support relocatable linking for offloading (PR #80066)

2024-01-31 Thread Yaxun Liu via cfe-commits
@@ -181,5 +181,6 @@ __attribute__((visibility("protected"), used)) int x; // RUN: --linker-path=/usr/bin/ld.lld -- -r --whole-archive %t.a --no-whole-archive \ // RUN: %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK yxsamliu wrote: need

[clang] [AMDGPU] Allow w64 ballot to be used on w32 targets (PR #80183)

2024-01-31 Thread Yaxun Liu via cfe-commits
@@ -151,7 +151,7 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") //===--===// TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32") -TARGET_BUILTIN(__builtin_amdgcn_ba
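A host-side emulation of what a 64-bit ballot could return when only 32 lanes exist. This is a sketch under a stated assumption: the snippet above does not show the semantics, so the sketch assumes the natural behavior that lanes beyond the wavefront contribute zero bits. `sketchBallot64` is a hypothetical helper, not the builtin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Emulates a 64-bit ballot on the host: bit i is set when lane i's predicate
// is true. With a 32-lane wavefront the upper 32 bits stay zero (assumption).
inline std::uint64_t sketchBallot64(const std::vector<bool> &lanePredicates) {
  std::uint64_t mask = 0;
  for (std::size_t lane = 0; lane < lanePredicates.size() && lane < 64; ++lane) {
    if (lanePredicates[lane])
      mask |= std::uint64_t{1} << lane;
  }
  return mask;
}
```

Under that assumption, code written against the 64-bit form keeps working on a 32-lane target because the result is simply a zero-extended 32-bit mask.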

[clang] [HIP] fix HIP detection for /usr (PR #80190)

2024-01-31 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/80190 Skip checking for the HIP version file under the parent directory of /usr/local, since /usr will be checked after /usr/local. Fixes: https://github.com/llvm/llvm-project/issues/78344 From 4da60eac1a940d922703381a9a07c932
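The rule described here can be sketched as a small predicate. The snippet does not show how clang implements it, so `sketchShouldCheckParentDir` and its string-based path handling are hypothetical and only illustrate the stated intent: do not look in the parent of /usr/local, because /usr is a separate candidate checked later.

```cpp
#include <string>

// Drops trailing slashes so "/usr/local/" and "/usr/local" compare equal.
inline std::string sketchStripTrailingSlashes(std::string path) {
  while (path.size() > 1 && path.back() == '/')
    path.pop_back();
  return path;
}

// Returns false for /usr/local: its parent (/usr) is searched on its own
// afterwards, so checking it here would pick it up too early.
inline bool sketchShouldCheckParentDir(const std::string &candidate) {
  return sketchStripTrailingSlashes(candidate) != "/usr/local";
}
```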

[clang] Partial revert "[HIP] Fix -mllvm option for device lld linker" (PR #80202)

2024-01-31 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu created https://github.com/llvm/llvm-project/pull/80202 This partially reverts commit aa964f157f9b50fab3895afbfda6e0915cf6bb4a because it caused perf regressions in rccl due to the drop of -mllvm -amdgpu-kernarg-preload-count=16 from the linker step. Potentially it coul

[clang] Partial revert "[HIP] Fix -mllvm option for device lld linker" (PR #80202)

2024-01-31 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu edited https://github.com/llvm/llvm-project/pull/80202

[clang] Partial revert "[HIP] Fix -mllvm option for device lld linker" (PR #80202)

2024-01-31 Thread Yaxun Liu via cfe-commits
https://github.com/yxsamliu closed https://github.com/llvm/llvm-project/pull/80202
