https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/84420
>From 677e374d1a0ca87d734c03aa2e97e73510e04e4e Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Thu, 7 Mar 2024 15:48:00 -0600
Subject: [PATCH] [Offload] Move HIP and CUDA to new driver by default
Summary:
This
jhuber6 wrote:
> Do you mean the SPIR-V target (backend)? I have not followed this area of
> work closely. What is missing or what exactly needs to be supported by the
> SPIR-V target? Any help or pointers would be greatly appreciated!
I believe there was some work to port SYCL to work with th
jhuber6 wrote:
> @jhuber6 , looks like these changes break the following builds
>
> * https://lab.llvm.org/buildbot/#/builders/235/builds/5630
>
> * https://lab.llvm.org/buildbot/#/builders/232/builds/19808
>
>
> There are a lot of CMake error messages starting with
>
> ```
> CMake Er
jhuber6 wrote:
@vvereschaka Should be fixed now.
https://github.com/llvm/llvm-project/pull/81921
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/82699
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/81058
https://github.com/jhuber6 commented:
Some nits, mostly just formatting and naming that hasn't been updated.
I agree overall that we should just put this in some canonical form and rely on
other LLVM passes to take care of things like inlining. Eager to have this
functionality in, so hopefully
@@ -0,0 +1,701 @@
+//===-- ExpandVariadicsPass.cpp *- C++ -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache
@@ -0,0 +1,698 @@
+//===-- ExpandVariadicsPass.cpp *- C++ -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache
@@ -1306,14 +1306,50 @@ float min(float __x, float __y) { return
__builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
+// Define host min/max functions.
+
#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-_
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/83282
Summary:
One recurring problem we have with the OpenMP libraries is that they can
conflict with ones found on the system. This occurs when
there are two copies and one is used for linking that it no
jhuber6 wrote:
So, I'm wondering if we could do a clang configuration file based solution for
this. The problem that I see now is that we'd like to make some clang
configuration files only active for a certain language. I think we already have
OS specific files and target specific files, so it
jhuber6 wrote:
This seems to be adding an entirely new compression scheme to LLVM. I feel like
that should be a separate patch and the part where we make HIP use it is a
follow-up.
https://github.com/llvm/llvm-project/pull/83297
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/83306
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/83282
jhuber6 wrote:
> Hi @jhuber6, @MaskRay
>
> We are having some problems with this patch on a server where the file
> /lib64/libomptarget-nvptx-sm_52.bc exists. The test case that fails is
> clang/test/Driver/openmp-offload-gpu.c.
>
> **Problem 1** I think one problem is related to this check l
jhuber6 wrote:
This was the original behavior of my patch, but I reverted it because it broke
all the HIP headers that were unintentionally relying on this. Has that been
resolved?
https://github.com/llvm/llvm-project/pull/83558
jhuber6 wrote:
> Problem 1 can be solved by flipping the order. But Problem 2 would remain as
> it doesn't depend on the order.
Honestly, we should just remove the second test. We just treat these things as
libraries and it doesn't make sense for a test to ensure that `-lstdc++`
doesn't exist
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/83573
Summary:
We still use this bitcode library in one case, the NVPTX non-LTO build.
The patch updated the search paths to treat it the same as other
libraries, which unintentionally prioritized system paths over
LIBR
jhuber6 wrote:
> Problem 1 can be solved by flipping the order. But Problem 2 would remain as
> it doesn't depend on the order.
https://github.com/llvm/llvm-project/pull/83573 I made a patch to fix it.
https://github.com/llvm/llvm-project/pull/83282
@@ -101,17 +101,6 @@
/// ###
-/// Check that the warning is thrown when the libomptarget bitcode library is not found.
-/// Libomptarget requires sm_52 or newer so an sm_52 bitcode library should never
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/83573
jhuber6 wrote:
> ```
> yeluo@epyc-server:/soft/llvm/main-20240301/lib$ ls libomp* -l
> lrwxrwxrwx 1 yeluo yeluo 34 Mar 1 11:18 libomptarget.rtl.amdgpu.so ->
> libomptarget.rtl.amdgpu.so.19.0git
> -r--r--r-- 1 yeluo yeluo 67532024 Mar 1 11:04
> libomptarget.rtl.amdgpu.so.19.0git
> lrwxrw
jhuber6 wrote:
> It seems to be installed twice, under both `lib` and
> `lib/x86_64-unknown-linux-gnu`. The files are identical, as diff shows nothing.
Makes sense, like `add_llvm_library` is implicitly installing it there, then
our subsequent `install` call is doing it again. I wonder if there's
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/83906
Summary:
This patch implements the LLVM floating point environment control
intrinsics and also exposes it through clang. We encode the floating
point environment as a 64-bit value that simply concatenates the valu
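
A minimal sketch of the 64-bit encoding idea mentioned in this summary, assuming the environment is formed by concatenating two 32-bit values. The names `FPEnvParts`, `packFPEnv`, and `unpackFPEnv`, and the choice of which value occupies the low half, are illustrative assumptions and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two 32-bit values being concatenated.
struct FPEnvParts {
  uint32_t Lo; // assumed to occupy bits [31:0] of the packed value
  uint32_t Hi; // assumed to occupy bits [63:32] of the packed value
};

// Concatenate the two halves into a single 64-bit environment value.
constexpr uint64_t packFPEnv(FPEnvParts P) {
  return (static_cast<uint64_t>(P.Hi) << 32) | P.Lo;
}

// Split a packed 64-bit environment value back into its two halves.
constexpr FPEnvParts unpackFPEnv(uint64_t Env) {
  return {static_cast<uint32_t>(Env), static_cast<uint32_t>(Env >> 32)};
}
```

Packing and then unpacking returns the original pair, which is the property a get/set round trip would rely on under these assumptions.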
jhuber6 wrote:
Note that this patch is not quite ready to land. I encountered issues when
working with `s_setreg`. The listing of `SOPK` instructions should have this as
an instruction that takes a 16-bit zero extended immediate value. However, this
was apparently not the case for the `s_setre
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83906
>From d7e20596434636753610ceb4326ddc1116f0bdce Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 1 Mar 2024 15:28:32 -0600
Subject: [PATCH] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv'
Summary:
@@ -1122,7 +1122,7 @@ class S_SETREG_B32_Pseudo pattern=[]> :
SOPK_Pseudo <
pattern>;
def S_SETREG_B32 : S_SETREG_B32_Pseudo <
- [(int_amdgcn_s_setreg (i32 SIMM16bit:$simm16), i32:$sdst)]> {
+ [(int_amdgcn_s_setreg (i32 timm:$simm16), i32:$sdst)]> {
jhub
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/83906
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83906
>From e349a9d436cdb99f0d9fb8d6df772a600ca0ea94 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 1 Mar 2024 15:28:32 -0600
Subject: [PATCH] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv'
Summary:
@@ -1122,7 +1122,7 @@ class S_SETREG_B32_Pseudo pattern=[]> :
SOPK_Pseudo <
pattern>;
def S_SETREG_B32 : S_SETREG_B32_Pseudo <
- [(int_amdgcn_s_setreg (i32 SIMM16bit:$simm16), i32:$sdst)]> {
+ [(int_amdgcn_s_setreg (i32 timm:$simm16), i32:$sdst)]> {
jhub
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/83927
Summary:
The AMDGPU target was originally designed with OpenCL in mind. The first
versions only provided the grid size, which is the total number of
threads in the execution context. In order to get the number of
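
As a rough illustration of the relationship this summary describes, the number of workgroups along one dimension can be derived from the grid size (total threads) and the workgroup size. The helper name `numWorkgroups` is hypothetical, and rounding up to count a partial trailing workgroup is an assumption; neither comes from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of workgroups along one dimension, given the
// grid size (total number of threads) and the workgroup size.
constexpr uint32_t numWorkgroups(uint32_t GridSize, uint32_t WorkgroupSize) {
  assert(WorkgroupSize != 0 && "workgroup size must be nonzero");
  // Divide, then add one if a partial workgroup remains (assumed to count).
  return GridSize / WorkgroupSize + (GridSize % WorkgroupSize != 0 ? 1 : 0);
}
```

With a grid of 1024 threads and workgroups of 256 threads this yields 4 workgroups. A dedicated builtin avoids recomputing this division in device code.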
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83927
>From 56059fdb5a0e22f8c7dcce6642899fdccf77a55b Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 4 Mar 2024 17:27:28 -0600
Subject: [PATCH] [AMDGPU] Introduce 'amdgpu_num_workgroups_{xyz}' builtin
Summary:
https://github.com/jhuber6 commented:
We had a lot that were like this previously. Guessing this one slipped through
because of the `zlib` requirement. Does this work with `-nogpulib` instead?
Usually easier than passing the dummy CUDA path.
https://github.com/llvm/llvm-project/pull/84008
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83906
>From 7808b8a0f4ab70733ebff4a6b8793f4918d0107b Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 1 Mar 2024 15:28:32 -0600
Subject: [PATCH] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv'
Summary:
jhuber6 wrote:
> It definitely doesn't work for the "pure" CUDA invocations, it still finds my
> local installation and complains. It might work for the OpenMP invocations,
> but hard to tell for me on a system with CUDA installed. As it's a `.cu` test
> after all, I think I would prefer the u
jhuber6 wrote:
> > Might need `-nogpulib -nogpuinc` in those cases, we do that in other `.cu`
> > files in the test suite.
>
> No, I already tried that, it doesn't work for me. All
> `clang/test/Driver/*.cu` that supply `-nocudainc` also pass `--cuda-path`...
The only reason it will fail with
@@ -325,6 +325,9 @@ BUILTIN(__builtin_amdgcn_read_exec_hi, "Ui", "nc")
BUILTIN(__builtin_amdgcn_endpgm, "v", "nr")
+BUILTIN(__builtin_amdgcn_get_fpenv, "WUi", "n")
jhuber6 wrote:
There's no builtin as far as I'm aware. I think there might be some pragmas
ho
jhuber6 wrote:
In any case, it's not really important and this works. I'm mostly just curious
why it doesn't seem to work as I would expect since there might be something to
fix.
https://github.com/llvm/llvm-project/pull/84008
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83906
>From 169f8914270725bd94b14a20f5f91005ce59f494 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 1 Mar 2024 15:28:32 -0600
Subject: [PATCH] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv'
Summary:
jhuber6 wrote:
> The invocations in `clang/test/Driver/cuda-omp-unsupported-debug-options.cu`
> don't pass `-emit-llvm` but `-###`.
>
> ```
> > cat /dev/null | ./bin/clang -### -x cuda - -nogpulib -nogpuinc -c && echo $?
> ```
>
> should error with recent CUDA installations because of `sm_
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/84017
Summary:
We already had a special CUDA default that better tracked the state as
of modern CUDA installations. Recently this was bumped up to `sm_52`,
but there was a location that wasn't respecting this. Fix that.
jhuber6 wrote:
Found it https://github.com/llvm/llvm-project/pull/84017.
https://github.com/llvm/llvm-project/pull/84008
jhuber6 wrote:
> Ok, but that still doesn't change the fact that the Clang driver will search
> for a system-wide CUDA installation unless passed `--cuda-path`...
We also do this with the GCC toolchain, the issue is whether or not there's an
error if it didn't find it. Doing `clang -v` will al
@@ -839,6 +839,18 @@ unsigned test_wavefrontsize() {
return __builtin_amdgcn_wavefrontsize();
}
+// CHECK-LABEL test_get_fpenv(
jhuber6 wrote:
Is this related to the DX10 clamp / traps potentially being disabled? Or is
this an LLVM concept?
https://github
@@ -839,6 +839,18 @@ unsigned test_wavefrontsize() {
return __builtin_amdgcn_wavefrontsize();
}
+// CHECK-LABEL test_get_fpenv(
jhuber6 wrote:
Hm, I'm not sure. I feel like this is just letting the user access the hardware
directly which has a different us
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/83906
>From 1cb734f3df298a34d76f7c9ee059dff84ba50c10 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 1 Mar 2024 15:28:32 -0600
Subject: [PATCH] [AMDGPU] Implement 'llvm.get.fpenv' and 'llvm.set.fpenv'
Summary:
@@ -839,6 +839,18 @@ unsigned test_wavefrontsize() {
return __builtin_amdgcn_wavefrontsize();
}
+// CHECK-LABEL test_get_fpenv(
jhuber6 wrote:
Added a sema check.
https://github.com/llvm/llvm-project/pull/83906
jhuber6 wrote:
> > Right now if you specify target-cpu you get target-cpu attributes, which is
> > what we don't want.
>
> I'm fine handling 'generic' in a special way under the hood and not
> specifying target-CPU.
>
> My concern is about user-facing interface. Command line options must be
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/80035
Summary:
Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into ano
jhuber6 wrote:
Rework of https://github.com/llvm/llvm-project/pull/79660 to handle old
behavior of these being defined for the host.
https://github.com/llvm/llvm-project/pull/80035
@@ -175,6 +175,8 @@ Predefined Macros
- Defined when the GPU default stream is set to per-thread mode.
* - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
- Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
+ * - ``__AMDGCN_WAVEFRONT_SIZE__``
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80035
>From f606aaa9c711d2ece6b1600160a61232abb69eb4 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 29 Jan 2024 08:46:14 -0600
Subject: [PATCH 1/2] [AMDGPU] Do not emit arch dependent macros with
unspecified c
@@ -175,6 +175,8 @@ Predefined Macros
- Defined when the GPU default stream is set to per-thread mode.
* - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
- Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
+ * - ``__AMDGCN_WAVEFRONT_SIZE__``
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/80035
Author: Joseph Huber
Date: 2024-01-30T13:17:02-06:00
New Revision: 626fe71fa5ed79cbd41b7b29582560d7adb1220e
URL:
https://github.com/llvm/llvm-project/commit/626fe71fa5ed79cbd41b7b29582560d7adb1220e
DIFF:
https://github.com/llvm/llvm-project/commit/626fe71fa5ed79cbd41b7b29582560d7adb1220e.diff
jhuber6 wrote:
> This seems to break tests: http://45.33.8.238/linux/129493/step_7.txt
>
> Please take a look and revert for now if it takes a while to fix.
Is it still broken? I pushed a fix because I'm pretty sure the problem was not
passing `-nogpulib` `-nogpuinc` so the test runs on machin
jhuber6 wrote:
> i.e. it helped with Clang :: Preprocessor/predefined-arch-macros.c but not
> with:
>
> Failed Tests (2): Clang :: Driver/amdgpu-macros.cl Clang ::
> Driver/target-id-macros.cl
Thanks, seeing it locally now. I'll try to fix it quick and revert if it's not
working soon.
https
Author: Joseph Huber
Date: 2024-01-30T13:45:01-06:00
New Revision: 6fecfbc7b62f54bd633e83c22630d7c2a3e5741e
URL:
https://github.com/llvm/llvm-project/commit/6fecfbc7b62f54bd633e83c22630d7c2a3e5741e
DIFF:
https://github.com/llvm/llvm-project/commit/6fecfbc7b62f54bd633e83c22630d7c2a3e5741e.diff
jhuber6 wrote:
> i.e. it helped with Clang :: Preprocessor/predefined-arch-macros.c but not
> with:
>
> Failed Tests (2): Clang :: Driver/amdgpu-macros.cl Clang ::
> Driver/target-id-macros.cl
Pushed a fix, `check-clang` passes on my machine now. Let me know if it's still
broken.
https://gi
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/80066
Summary:
The standard GPU compilation process embeds each intermediate object
file into the host file at the `.llvm.offloading` section so it can be
linked later. We also use a special section called something lik
jhuber6 wrote:
This is related to the discussions at the
https://github.com/llvm/llvm-project/issues/77018 issue.
https://github.com/llvm/llvm-project/pull/80066
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79873
>From 35e12c3d83f3be93618805ffaf05e3424689f32f Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 29 Jan 2024 11:08:04 -0600
Subject: [PATCH 1/2] [NVPTX] Allow compiling LLVM-IR without `-march` set
Summary:
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/79873
>From 35e12c3d83f3be93618805ffaf05e3424689f32f Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Mon, 29 Jan 2024 11:08:04 -0600
Subject: [PATCH 1/3] [NVPTX] Allow compiling LLVM-IR without `-march` set
Summary:
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/79873
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80066
>From af382e03e41ef679c35a6126a1b131a7a8a28360 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 30 Jan 2024 15:34:22 -0600
Subject: [PATCH] [LinkerWrapper] Support relocatable linking for offloading
Summar
@@ -181,5 +181,6 @@ __attribute__((visibility("protected"), used)) int x;
// RUN: --linker-path=/usr/bin/ld.lld -- -r --whole-archive %t.a --no-whole-archive \
// RUN: %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK
jhuber6 wrote:
Added
jhuber6 wrote:
> So, the idea is to carry two separate embedded offloading sections -- one for
> already fully linked GPU executables, and another for GPU objects to be
> linked at the final link stage.
>
It's more or less doing `-fno-gpu-rdc` on a subset of files. So you can do GPU
linking
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/80066
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80066
>From af382e03e41ef679c35a6126a1b131a7a8a28360 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 30 Jan 2024 15:34:22 -0600
Subject: [PATCH 1/2] [LinkerWrapper] Support relocatable linking for
offloading
S
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80066
>From af382e03e41ef679c35a6126a1b131a7a8a28360 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 30 Jan 2024 15:34:22 -0600
Subject: [PATCH 1/3] [LinkerWrapper] Support relocatable linking for
offloading
S
jhuber6 wrote:
> Supporting such mixed mode opens an interesting set of issues we may need to
> consider going forward:
>
> who/where/how runs initializers in the fully linked parts?
I'm assuming you're talking about GPU-side constructors? I don't think the CUDA
runtime supports those, but Op
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/80183
Summary:
Currently we cannot compile `__builtin_amdgcn_ballot_w64` on non-wave64
targets even though it is valid. This is relevant for making library
code that can handle both without needing to check the wavefron
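
To make the motivation concrete, here is a host-side model of a 64-bit ballot, not device code. It assumes that on a 32-lane wave the 64-bit result is the 32-lane mask with the upper 32 bits left as zero. The function `modelBallot64` is a hypothetical stand-in for illustration; real device code would call `__builtin_amdgcn_ballot_w64`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side model only: build a 64-bit mask with bit N set when lane N's
// predicate is true. Lanes that do not exist contribute zero bits.
uint64_t modelBallot64(const std::vector<bool> &LanePredicates) {
  assert(LanePredicates.size() <= 64 && "a wave has at most 64 lanes");
  uint64_t Mask = 0;
  for (std::size_t Lane = 0; Lane < LanePredicates.size(); ++Lane)
    if (LanePredicates[Lane])
      Mask |= uint64_t{1} << Lane;
  return Mask;
}
```

Under this model, library code can always consume a 64-bit mask and get the same answer from operations such as a population count, whether the wave has 32 or 64 lanes.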
jhuber6 wrote:
> > I'm assuming you're talking about GPU-side constructors? I don't think the
> > CUDA runtime supports those, but OpenMP runs them when the image is loaded,
> > so it would handle both independently.
>
> Yes. I'm thinking of the expectations from a C++ user standpoint, and thi
@@ -151,7 +151,7 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui",
"nc")
//===--===//
TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
-TARGET_BUILTIN(__builtin_amdgcn_ba
jhuber6 wrote:
> > the idea is that it would be the desired effect if someone went out of
> > their way to do this GPU subset linking thing.
>
> That would only be true when someone owns the whole build. That will not be
> the case in practice. A large enough project is usually a bunch of libr
https://github.com/jhuber6 approved this pull request.
Do we have any tests for this kind of stuff? We really should have some mock
ROCm installation in one of the `Inputs/` directories and then do
`--rocm-path=` or something.
https://github.com/llvm/llvm-project/pull/80190
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80066
>From af382e03e41ef679c35a6126a1b131a7a8a28360 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 30 Jan 2024 15:34:22 -0600
Subject: [PATCH 1/4] [LinkerWrapper] Support relocatable linking for
offloading
S
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80066
>From af382e03e41ef679c35a6126a1b131a7a8a28360 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 30 Jan 2024 15:34:22 -0600
Subject: [PATCH 1/5] [LinkerWrapper] Support relocatable linking for
offloading
S
@@ -151,7 +151,7 @@ BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui",
"nc")
//===--===//
TARGET_BUILTIN(__builtin_amdgcn_ballot_w32, "ZUib", "nc", "wavefrontsize32")
-TARGET_BUILTIN(__builtin_amdgcn_ba
https://github.com/jhuber6 approved this pull request.
https://github.com/llvm/llvm-project/pull/80190
jhuber6 wrote:
> After this change is there any value in having two different builtins? You
> could just have one that always return 64 bits.
I personally think it would be better to just have the one, but I figured that
decision was made earlier and it would break backwards compatibility.
ht
@@ -4,13 +4,10 @@
// RUN: %clang_cc1 -triple amdgcn-- -target-cpu gfx1010 -target-feature -wavefrontsize64 -verify -S -o - %s
// RUN: %clang_cc1 -triple amdgcn-- -target-cpu gfx1010 -verify -S -o - %s
+// expected-no-diagnostics
+
typedef unsigned long ulong;
void test_ba
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/80183
>From 26b75cdba1aebc881e52dc82ca61e1082ef67a5e Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 31 Jan 2024 13:18:04 -0600
Subject: [PATCH] [AMDGPU] Allow w64 ballot to be used on w32 targets
Summary:
Curr
@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy &Device) {
Entry.size) != OFFLOAD_SUCCESS)
REPORT("Failed to write symbol for USM %s\n", Entry.name);
}
-} else {
+} else if (Entry.addr) {
--
@@ -6872,35 +6883,6 @@ void OpenMPIRBuilder::loadOffloadInfoMetadata(StringRef
HostFilePath) {
loadOffloadInfoMetadata(*M.get());
}
-Function *OpenMPIRBuilder::createRegisterRequires(StringRef Name) {
jhuber6 wrote:
Thanks for the heads up. Do you know if
@@ -6872,35 +6883,6 @@ void OpenMPIRBuilder::loadOffloadInfoMetadata(StringRef
HostFilePath) {
loadOffloadInfoMetadata(*M.get());
}
-Function *OpenMPIRBuilder::createRegisterRequires(StringRef Name) {
jhuber6 wrote:
That looks a little weird, the `i32` val
@@ -6872,35 +6883,6 @@ void OpenMPIRBuilder::loadOffloadInfoMetadata(StringRef
HostFilePath) {
loadOffloadInfoMetadata(*M.get());
}
-Function *OpenMPIRBuilder::createRegisterRequires(StringRef Name) {
jhuber6 wrote:
I encoded the fact that this is a "requi
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/80183
Author: Joseph Huber
Date: 2024-02-05T09:08:31-06:00
New Revision: d1722868d34a69df8466b72098176f54a7af8823
URL:
https://github.com/llvm/llvm-project/commit/d1722868d34a69df8466b72098176f54a7af8823
DIFF:
https://github.com/llvm/llvm-project/commit/d1722868d34a69df8466b72098176f54a7af8823.diff
@@ -6872,35 +6883,6 @@ void OpenMPIRBuilder::loadOffloadInfoMetadata(StringRef
HostFilePath) {
loadOffloadInfoMetadata(*M.get());
}
-Function *OpenMPIRBuilder::createRegisterRequires(StringRef Name) {
jhuber6 wrote:
It was a very obvious problem. I mixed u
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/80741
Summary:
The backend supports the wavefrontsize intrinsic, and suggests that it
is tied to a corresponding clang builtin, but it is not actually
present. This simply adds it in so it can be used from clang. This
a