Author: Yaxun (Sam) Liu
Date: 2022-05-10T14:32:27-04:00
New Revision: afc9d674fe5a14b95c50a38d8605a159c2460427
URL:
https://github.com/llvm/llvm-project/commit/afc9d674fe5a14b95c50a38d8605a159c2460427
DIFF:
https://github.com/llvm/llvm-project/commit/afc9d674fe5a14b95c50a38d8605a159c2460427.dif
Author: Yaxun (Sam) Liu
Date: 2022-05-10T14:56:28-04:00
New Revision: 180a8536cec8e5e13e86863b17982daf95f2038a
URL:
https://github.com/llvm/llvm-project/commit/180a8536cec8e5e13e86863b17982daf95f2038a
DIFF:
https://github.com/llvm/llvm-project/commit/180a8536cec8e5e13e86863b17982daf95f2038a.dif
Author: Yaxun (Sam) Liu
Date: 2022-05-11T14:28:08-04:00
New Revision: 84db35594953a6f7aff7cbc007f1c5d2fd35b1a9
URL:
https://github.com/llvm/llvm-project/commit/84db35594953a6f7aff7cbc007f1c5d2fd35b1a9
DIFF:
https://github.com/llvm/llvm-project/commit/84db35594953a6f7aff7cbc007f1c5d2fd35b1a9.dif
Author: Yaxun (Sam) Liu
Date: 2022-05-12T12:04:05-04:00
New Revision: 0f292141aadb0489231de31de966c239486e019d
URL:
https://github.com/llvm/llvm-project/commit/0f292141aadb0489231de31de966c239486e019d
DIFF:
https://github.com/llvm/llvm-project/commit/0f292141aadb0489231de31de966c239486e019d.dif
Author: Yaxun (Sam) Liu
Date: 2022-05-19T11:34:42-04:00
New Revision: cefe472c51fbcd1aed4d4a090709f25a12a8bc2c
URL:
https://github.com/llvm/llvm-project/commit/cefe472c51fbcd1aed4d4a090709f25a12a8bc2c
DIFF:
https://github.com/llvm/llvm-project/commit/cefe472c51fbcd1aed4d4a090709f25a12a8bc2c.dif
Author: Yaxun (Sam) Liu
Date: 2022-05-19T12:06:56-04:00
New Revision: 559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00
URL:
https://github.com/llvm/llvm-project/commit/559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00
DIFF:
https://github.com/llvm/llvm-project/commit/559b8fc17ef6f5a65ccf9a11fce5f91c0a011b00.dif
yxsamliu wrote:
> > @arsenm I agree that the default should be assuming fine-grained is
> > possible. My thinking behind the original naming and direction was not
> > wanting to introduce an unexpected performance regression by default. I'm
> > happy for both to be changed, and this patch bein
yxsamliu wrote:
> > Will the metadata for unsafe-fp-atomics also be controlled by the pragma
> > that controls the no-fine-grained and no-remote metadata? e.g. something
> > like
> > ```
> > #pragma clang atomics begin no-fine-grained(on) no-remote(on) unsafe-fp(on)
> > ```
>
> Yes, I would ex
yxsamliu wrote:
Is LLVM IR using the old vaarg intrinsics still consumable?
Should we update AutoUpgrade.cpp to auto upgrade the old vaarg intrinsics and
add a test for them?
https://github.com/llvm/llvm-project/pull/85460
___
cfe-commits mailing lis
yxsamliu wrote:
> Doesn’t AutoUpgrade automatically infer overloads? You can see a bunch of
> tests in this patch where the output references the overloaded intrinsics but
> the input is unchanged.
OK it seems to be handled already and test covered. Thanks.
https://github.com/llvm/llvm-projec
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/85675
since it will cause duplicate symbols when the partially linked object is
linked again.
Change-Id: I2aea39ad0d57d3dc80b6aff395d9506ab9ebbf4d
>From 6d6b362fbf965706108362a55c894588e80ad778 Mon Sep 17 00:00:00 2
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/85675
>From 2e0967c0c606ad647185a739442647ab7d90ed52 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Mon, 18 Mar 2024 14:09:56 -0400
Subject: [PATCH] [HIP] do not link runtime for -r
since it will cause duplic
https://github.com/yxsamliu edited
https://github.com/llvm/llvm-project/pull/85675
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/85675
https://github.com/yxsamliu approved this pull request.
https://github.com/llvm/llvm-project/pull/85842
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/85976
Refactor managed variable handling in codegen so that the transformation is
done separately from registration.
This will allow the new driver to register the managed var in the linker
wrapper.
>From 11d10a8ac
@@ -1160,9 +1152,8 @@ void CGNVCUDARuntime::createOffloadingEntries() {
// Returns module constructor to be added.
llvm::Function *CGNVCUDARuntime::finalizeModule() {
+ transformManagedVars();
yxsamliu wrote:
we did the equivalent transformation previously d
https://github.com/yxsamliu edited
https://github.com/llvm/llvm-project/pull/85976
@@ -1160,9 +1152,8 @@ void CGNVCUDARuntime::createOffloadingEntries() {
// Returns module constructor to be added.
llvm::Function *CGNVCUDARuntime::finalizeModule() {
+ transformManagedVars();
yxsamliu wrote:
we do not have test for managed var in llvm-test-
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/85976
>From 1d14bcff6363b34ae48eac2bf68221b16dd1c855 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Wed, 20 Mar 2024 13:34:29 -0400
Subject: [PATCH] [HIP][NFC] Refactor managed var codegen
Refactor managed va
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/85976
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/86838
None
>From 3e00450177338a14c5eb0c39e3d49e7b2202056e Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Wed, 27 Mar 2024 13:27:19 -0400
Subject: [PATCH] [HIP] document difference with CUDA
---
clang/docs/H
@@ -186,57 +186,60 @@ GlobalVariable *createBinDesc(Module &M,
ArrayRef<ArrayRef<char>> Bufs,
".omp_offloading.descriptor" + Suffix);
}
-void createRegisterFunction(Module &M, GlobalVariable *BinDesc,
-StringRef Suffix) {
+Function *cr
yxsamliu wrote:
LGTM for HIP
https://github.com/llvm/llvm-project/pull/87351
https://github.com/yxsamliu approved this pull request.
LGTM. Thanks
https://github.com/llvm/llvm-project/pull/80066
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/81700
`-fgpu-rdc` mode allows device functions to call device functions in a different TU.
However, currently all device objects have to be linked together since only one
fat binary is supported. This is time consuming fo
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83870
>From 9c6991bbcdce6f24c8f99c8f2a6ff0e5b6c2ac5a Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Mon, 4 Mar 2024 11:38:06 -0500
Subject: [PATCH] [HIP] fix host-used external kernel
In -fgpu-rdc mode, when
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83870
>From 902f09d9124b387ad02bd758e9c54bf44746b0fd Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Mon, 4 Mar 2024 11:38:06 -0500
Subject: [PATCH] [HIP] fix host-used external kernel
In -fgpu-rdc mode, when
@@ -24,6 +24,7 @@
// NEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel2v
// NEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel3v
+// XEG-NOT: @__clang_gpu_used_external = {{.*}} @_Z7kernel5v
yxsamliu wrote:
fixed
https://github.com/llvm/llvm-proje
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/84195
yxsamliu wrote:
Here is the size distribution of individual code object files (each code object
file is for one GPU arch, and a fat binary contains a number of code object
files; therefore the optimal compression parameter is mostly related to code
object file size).
| Bin Size | Count |
yxsamliu wrote:
CUID is needed for device static variable to be accessible on host side. Since
the driver does not know whether device static variables are accessed on host
side, it should always enable CUID for HIP.
https://github.com/llvm/llvm-project/pull/84332
https://github.com/yxsamliu approved this pull request.
https://github.com/llvm/llvm-project/pull/84332
https://github.com/yxsamliu approved this pull request.
https://github.com/llvm/llvm-project/pull/84337
yxsamliu wrote:
> > CUID is needed for device static variable to be accessible on host side.
> > Since the driver does not know whether device static variables are accessed
> > on host side, it should always enable CUID for HIP.
>
> Oh! I think I remember what I did. I made the CUID hash gener
yxsamliu wrote:
> It may be worth asking on https://github.com/facebook/zstd/ . I am sure zstd
> maintainers are happy to see more adoption:)
Posted a question to zstd https://github.com/facebook/zstd/issues/3932
https://github.com/llvm/llvm-project/pull/83605
@@ -906,6 +906,16 @@ CreateFileHandler(MemoryBuffer &FirstInput,
}
OffloadBundlerConfig::OffloadBundlerConfig() {
+ if (llvm::compression::zstd::isAvailable()) {
+CompressionFormat = llvm::compression::Format::Zstd;
+// Use a high zstd compress level by default for be
https://github.com/yxsamliu approved this pull request.
https://github.com/llvm/llvm-project/pull/84400
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/83870
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/83306
yxsamliu wrote:
Since zstd has a comparable compression rate and is much faster, we will use
zstd. Closing this PR.
https://github.com/llvm/llvm-project/pull/83306
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From 16796bc8eb3b32436903db4b689d4cb9cfc348d8 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
yxsamliu wrote:
The zstd developers suggest enabling long distance matching (LDM), i.e. the
`--long` option. I updated the PR with the change, and tested that it works
well for bundle entry sizes ranging from 1KB to 20MB, for both compression rate
and compression/decompression speed.
https://githu
https://github.com/yxsamliu edited
https://github.com/llvm/llvm-project/pull/83605
yxsamliu wrote:
> Should an option like in #84337 be added for the new driver?
Yes please
https://github.com/llvm/llvm-project/pull/83605
yxsamliu wrote:
> > Should an option like in #84337 be added for the new driver?
>
> Yes please
Oh. I can add it
https://github.com/llvm/llvm-project/pull/83605
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From 60faf7f657fdcc00edfa0a1813d1e2746c341ef1 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
yxsamliu wrote:
> Should an option like in #84337 be added for the new driver?
added the option to linker wrapper
https://github.com/llvm/llvm-project/pull/83605
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From 78ad578a19d2a3585f20ab64d364a46a584ec035 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
@@ -2863,3 +2863,18 @@ void tools::addOutlineAtomicsArgs(const Driver &D, const
ToolChain &TC,
CmdArgs.push_back("+outline-atomics");
}
}
+
+void tools::addOffloadCompressArgs(const llvm::opt::ArgList &TCArgs,
+ llvm::opt::ArgStringList
@@ -4638,7 +4638,10 @@ Action *Driver::BuildOffloadingActions(Compilation &C,
}
}
- if (offloadDeviceOnly())
+ // All kinds exit now in device-only mode except for non-RDC mode HIP.
yxsamliu wrote:
I am wondering whether we should restrict this change
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From 906b23c5f8ef815b7727fe2bda852c33f0d9147b Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From c46a3ce625a34a497cd0b14631cb755b903e93d6 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/83605
Author: Yaxun (Sam) Liu
Date: 2024-03-09T10:07:57-05:00
New Revision: e733d7e23f6553c55c85edd55511b133d2064677
URL:
https://github.com/llvm/llvm-project/commit/e733d7e23f6553c55c85edd55511b133d2064677
DIFF:
https://github.com/llvm/llvm-project/commit/e733d7e23f6553c55c85edd55511b133d2064677.dif
https://github.com/yxsamliu approved this pull request.
https://github.com/llvm/llvm-project/pull/84534
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/82956
CUDA defines min/max functions for host in the global namespace. The HIP header needs
to define them too to be compatible. Currently only min/max(int, int) is
defined. This causes wrong results for arguments that are ou
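The failure mode described in this PR summary can be sketched in plain C++. This is an illustration only, not the actual header code: `min_int_only` and `min_ll` are hypothetical names, and the sketch assumes a 32-bit `int`. When only an `int` overload exists, wider arguments are silently converted to `int` before the comparison.

```cpp
// Hypothetical stand-in for a header that only provides min(int, int).
constexpr int min_int_only(int a, int b) { return a < b ? a : b; }

// Hypothetical wider overload, in the spirit of adding more host overloads.
constexpr long long min_ll(long long a, long long b) { return a < b ? a : b; }

// Both values exceed INT_MAX when int is 32 bits (an assumption of this sketch).
constexpr long long kA = 5000000000LL;
constexpr long long kB = 4000000000LL;

// The wide overload returns the true minimum.
static_assert(min_ll(kA, kB) == kB, "wide overload keeps the full value");

// The int-only overload converts both arguments to int first, so its result
// cannot equal the true minimum, which does not fit in a 32-bit int.
// Compilers may warn about the value-changing conversion here.
static_assert(min_int_only(kA, kB) != kB, "int-only overload loses the value");
```

The checks run at compile time, so building the file is enough to see the difference between the two overloads.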
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
+// Define host min/max functions.
+
#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-_
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/82956
>From aa50cadf0baf84ea38379fd3276f306a27164007 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Sun, 25 Feb 2024 11:13:40 -0500
Subject: [PATCH] [HIP] fix host min/max in header
CUDA defines min/max funct
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/82956
>From bd87c56b2d96b834788e8fa449f3ac308faec1f0 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Sun, 25 Feb 2024 11:13:40 -0500
Subject: [PATCH] [HIP] fix host min/max in header
CUDA defines min/max funct
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
+// Define host min/max functions.
+
#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-_
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/82956
>From c8331bffa27011b953747d3ad4f7b423cf73b4a4 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Sun, 25 Feb 2024 11:13:40 -0500
Subject: [PATCH] [HIP] fix host min/max in header
CUDA defines min/max funct
@@ -1306,15 +1306,68 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __a
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/82956
>From f7471303abf989ceb1bdbce0d580d74097572dec Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Sun, 25 Feb 2024 11:13:40 -0500
Subject: [PATCH] [HIP] fix host min/max in header
CUDA defines min/max funct
@@ -1306,15 +1306,68 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __a
@@ -1306,15 +1306,73 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __a
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/82956
>From ebad3ba006445f290d17c338cc1b39293c18cdad Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Sun, 25 Feb 2024 11:13:40 -0500
Subject: [PATCH] [HIP] fix host min/max in header
CUDA defines min/max funct
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/82956
Author: Yaxun (Sam) Liu
Date: 2024-02-27T20:19:07-05:00
New Revision: bcbce807d76a30388b366d14051c5f80e9724dab
URL:
https://github.com/llvm/llvm-project/commit/bcbce807d76a30388b366d14051c5f80e9724dab
DIFF:
https://github.com/llvm/llvm-project/commit/bcbce807d76a30388b366d14051c5f80e9724dab.dif
@@ -1306,15 +1306,73 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __a
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/83297
LZMA (Lempel-Ziv/Markov-chain Algorithm) provides a better compression rate than
zstd and zlib for clang-offload-bundler bundles, which often contain a large
number of similar entries.
This patch adds liblzma as
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/83306
LZMA (Lempel-Ziv/Markov-chain Algorithm) provides a better compression
rate than zstd and zlib for clang-offload-bundler bundles, which often
contain a large number of similar entries.
This patch lets clang-offload-
yxsamliu wrote:
depends on https://github.com/llvm/llvm-project/pull/83297
https://github.com/llvm/llvm-project/pull/83306
yxsamliu wrote:
> > Probably I need to define those functions with mixed args by default to
> > avoid regressions.
>
> Are there any other regressions? Can hupCUB be fixed instead? While their use
> case is probably benign, I'd rather fix the user code, than propagate CUDA
> bugs into HIP.
S
yxsamliu wrote:
> > > Probably I need to define those functions with mixed args by default to
> > > avoid regressions.
> >
> >
> > Are there any other regressions? Can hupCUB be fixed instead? While their
> > use case is probably benign, I'd rather fix the user code, than propagate
> > CUDA
yxsamliu wrote:
Did you try this patch with internal PSDB? This will likely break all HIP
programs.
This is because HIP programs are single-source and users usually expect the
common device-side predefined macros to be seen in both host and device
compilations. e.g. they could write a kernel using
@@ -6,32 +6,32 @@
// R600-based processors.
//
-// RUN: %clang -E -dM -target r600 -mcpu=r600 %s 2>&1 | FileCheck
--check-prefixes=ARCH-R600,R600 %s -DCPU=r600
-// RUN: %clang -E -dM -target r600 -mcpu=rv630 %s 2>&1 | FileCheck
--check-prefixes=ARCH-R600,R600 %s -DCPU=r600
-
@@ -4306,10 +4306,10 @@
// Begin amdgcn tests
-// RUN: %clang -mcpu=gfx803 -E -dM %s -o - 2>&1 \
+// RUN: %clang -mcpu=gfx803 -E -dM -Xclang -fcuda-is-device %s -o - 2>&1 \
yxsamliu wrote:
C code compiled with target amdgcn should not depend
yxsamliu wrote:
For example, rocprim assumes warpSize is constant
https://github.com/ROCm/rocPRIM/blob/6325547d514b46d1ab51aff0195851b3fcc626d1/rocprim/include/rocprim/intrinsics/thread.hpp#L54
since device_warp_size() is used as a non-type template argument and this code
is not conditioned f
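Why a compile-time constant matters here can be sketched in plain C++. This is an illustration only, not rocPRIM's code: `SKETCH_WAVEFRONT_SIZE`, `sketch_warp_size` and `WarpTile` are hypothetical names, and the fallback value 64 is an assumption so the sketch compiles without a GPU toolchain.

```cpp
// Use the predefined macro when the compiler provides it; otherwise fall back
// to an assumed value so the sketch also builds with a plain host compiler.
#if defined(__AMDGCN_WAVEFRONT_SIZE)
#define SKETCH_WAVEFRONT_SIZE __AMDGCN_WAVEFRONT_SIZE
#else
#define SKETCH_WAVEFRONT_SIZE 64
#endif

// Hypothetical analogue of a warp-size query that must be a constant expression.
constexpr unsigned sketch_warp_size() { return SKETCH_WAVEFRONT_SIZE; }

// A type parameterized on the warp size: the template argument must be known
// at compile time, in every compilation that sees this code.
template <unsigned WarpSize>
struct WarpTile {
  static constexpr unsigned lanes = WarpSize;
  unsigned scratch[WarpSize];
};

// This alias only compiles if sketch_warp_size() is a constant expression.
using Tile = WarpTile<sketch_warp_size()>;

static_assert(Tile::lanes == SKETCH_WAVEFRONT_SIZE,
              "template argument matches the macro value");
```

If the macro were unavailable in one of the compilations and no fallback existed, the alias would fail to compile there, which is the kind of breakage the comment warns about.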
yxsamliu wrote:
in HIP headers, warpSize is defined with __AMDGCN_WAVEFRONT_SIZE and there are
a bunch of uses of __AMDGCN_WAVEFRONT_SIZE or warpSize as constants:
https://github.com/search?q=repo%3AROCm%2Fclr%20__AMDGCN_WAVEFRONT_SIZE&type=code
These can be fixed relatively easily by conditio
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/83605
Change the zstd compression level to 20 for a better
compression rate.
>From 4fac5b1defe9ce1174da4a2c75f84087f26c63ab Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PA
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From 6b5687e16c826053d690b08b6fe714e055905479 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] change compress level
Change compression level to 20 f
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const
llvm::MemoryBuffer &Input,
Input.getBuffer().size());
llvm::compression::Format CompressionFormat;
+ int Level;
- if (llvm::compression::zstd::isAvailable())
+ if (llvm::compression::zstd::isAvailable(
yxsamliu wrote:
When clang does host compilation, it essentially makes an assumption that the
generated IR for host does not depend on the assumed GPU arch, or, the
generated IR may be affected by assumed GPU arch, but it won't affect the
program output. This is true in most cases. For example
@@ -942,20 +942,28 @@ CompressedOffloadBundle::compress(const
llvm::MemoryBuffer &Input,
Input.getBuffer().size());
llvm::compression::Format CompressionFormat;
+ int Level;
- if (llvm::compression::zstd::isAvailable())
+ if (llvm::compression::zstd::isAvailable(
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83605
>From f846e24d2ac287f6f9466615536c4f53f6d0e0ed Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Fri, 1 Mar 2024 13:16:45 -0500
Subject: [PATCH] [HIP] add --offload-compression-level= option
Added --offloa
https://github.com/yxsamliu edited
https://github.com/llvm/llvm-project/pull/83605
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/83870
In -fgpu-rdc mode, when an external kernel is used by a host function with
weak_odr linkage (e.g. explicitly instantiated template function), the kernel
should not be marked as host-used external kernel, since
https://github.com/yxsamliu updated
https://github.com/llvm/llvm-project/pull/83870
>From dc94bb78adb323a539d195b791e50cf69c774246 Mon Sep 17 00:00:00 2001
From: "Yaxun (Sam) Liu"
Date: Mon, 4 Mar 2024 11:38:06 -0500
Subject: [PATCH] [HIP] fix host-used external kernel
In -fgpu-rdc mode, when
@@ -175,6 +175,8 @@ Predefined Macros
- Defined when the GPU default stream is set to per-thread mode.
* - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
- Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
+ * - ``__AMDGCN_WAVEFRONT_SIZE__``
@@ -4294,13 +4294,20 @@
// Begin amdgcn tests
-// RUN: %clang -march=amdgcn -E -dM %s -o - 2>&1 \
+// RUN: %clang -mcpu=gfx803 -E -dM %s -o - 2>&1 \
+// RUN: -target amdgcn-unknown-unknown \
+// RUN: | FileCheck -match-full-lines %s
-check-prefixes=CHE
https://github.com/yxsamliu approved this pull request.
LGTM. Thanks
https://github.com/llvm/llvm-project/pull/80035
@@ -181,5 +181,6 @@ __attribute__((visibility("protected"), used)) int x;
// RUN: --linker-path=/usr/bin/ld.lld -- -r --whole-archive %t.a
--no-whole-archive \
// RUN: %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK
yxsamliu wrote:
need
@@ -151,7 +151,7 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui",
"nc")
//===--===//
TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
-TARGET_BUILTIN(__builtin_amdgcn_ba
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/80190
Skip checking HIP version file under parent directory for /usr/local since /usr
will be checked after /usr/local.
Fixes: https://github.com/llvm/llvm-project/issues/78344
>From 4da60eac1a940d922703381a9a07c932
https://github.com/yxsamliu created
https://github.com/llvm/llvm-project/pull/80202
This partially reverts commit aa964f157f9b50fab3895afbfda6e0915cf6bb4a because
it caused perf regressions in rccl due to dropping -mllvm
-amdgpu-kernarg-preload-count=16 from the linker step. Potentially it coul
https://github.com/yxsamliu edited
https://github.com/llvm/llvm-project/pull/80202
https://github.com/yxsamliu closed
https://github.com/llvm/llvm-project/pull/80202