https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/115777
>From 23a8d5af0ab181814885bca6ab6494be9d71f59b Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 13 Nov 2024 18:14:05 -0600
Subject: [PATCH 1/3] use ASTContext
---
.../bugprone/VirtualNearMissCheck.cpp
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/115777
>From bc76c323aefe52b019375cf3a3227223e2d97133 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 13 Nov 2024 18:14:05 -0600
Subject: [PATCH] use ASTContext
---
.../bugprone/VirtualNearMissCheck.cpp
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/115777
>From 23a8d5af0ab181814885bca6ab6494be9d71f59b Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 13 Nov 2024 18:14:05 -0600
Subject: [PATCH] use ASTContext
---
.../bugprone/VirtualNearMissCheck.cpp
jhuber6 wrote:
> Oh, you can just forward-declare `class ASTContext` at the top of that file.
> It's funny that that isn't already there.
Done, and added a Sema test.
https://github.com/llvm/llvm-project/pull/115777
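For readers unfamiliar with the suggestion quoted above, here is a minimal sketch of what forward-declaring a class buys you. Everything in it is a hypothetical stand-in: the `demo` namespace, the toy `ASTContext`, `isCompatible`, and `hasFlatAddressing` are illustrative names, not Clang's real declarations or the code in this PR.

```cpp
#include <cassert>

namespace demo {

// Forward declaration: enough to mention the type by reference or pointer
// without pulling in the header that defines it.
class ASTContext;

// This declaration only needs the name, so the forward declaration suffices.
bool isCompatible(const ASTContext &Ctx, unsigned FromAS, unsigned ToAS);

// Hypothetical full definition; in a real project it would live in another
// header that only the implementation file includes.
class ASTContext {
public:
  explicit ASTContext(bool FlatAddressing) : Flat(FlatAddressing) {}
  bool hasFlatAddressing() const { return Flat; }

private:
  bool Flat;
};

// The definition needs the complete type because it calls a member function.
bool isCompatible(const ASTContext &Ctx, unsigned FromAS, unsigned ToAS) {
  return FromAS == ToAS || Ctx.hasFlatAddressing();
}

// Small usage check for the toy types above.
inline void example() {
  ASTContext Flat(true);
  assert(isCompatible(Flat, 1, 2));
}

} // namespace demo
```

The point of the pattern is that a header which only names the type in a signature avoids an extra `#include`, which tends to keep include graphs smaller.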
___
cfe-commits mailing list
cfe-commits
@@ -126,3 +137,19 @@ void use() {
// FVIS-DEFAULT: declare void @ext_func_default()
// FVIS-PROTECTED: declare void @ext_func_default()
// FVIS-HIDDEN: declare void @ext_func_default()
+
+// FVIS-DEFAULT: define{{.*}} void @__clang_ocl_kern_imp_kern()
+// FVIS-PROTECTED: define
https://github.com/jhuber6 commented:
So, the kernel metadata has a lot of special codegen associated with it. It
seems the approach here is to turn the kernels into thin wrappers that call an
outlined function?
https://github.com/llvm/llvm-project/pull/115821
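The "thin wrapper around an outlined function" shape asked about above can be sketched in plain C++. This is only an analogy under stated assumptions: `scaleImpl` and `scaleKernel` are hypothetical names, and nothing here reflects the actual codegen or calling conventions in the PR.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Outlined body: an ordinary function that holds the real work and can be
// called directly from other device code.
inline void scaleImpl(float *Data, std::size_t N, float Factor) {
  for (std::size_t I = 0; I < N; ++I)
    Data[I] *= Factor;
}

// Thin entry point: in a compiler this would be the symbol carrying the
// kernel-specific calling convention and metadata. Here it only forwards.
inline void scaleKernel(float *Data, std::size_t N, float Factor) {
  scaleImpl(Data, N, Factor);
}

// Small usage check: calling the wrapper has the same effect as the body.
inline void example() {
  float Values[2] = {1.0f, 2.0f};
  scaleKernel(Values, 2, 3.0f);
  assert(Values[0] == 3.0f && Values[1] == 6.0f);
}

} // namespace sketch
```

With this split, anything special about the entry point stays confined to the wrapper while the body remains a normal function.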
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/115821
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -emit-llvm -o - %s | FileCheck %s
jhuber6 wrote:
Might be easier to autogenerate this with `update_cc_test_checks.py`.
https://github.com/llvm/llvm-project/pull/115821
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/109052
Summary:
I initially thought that it would be convenient to automatically link
these libraries like they are for standard C/C++ targets. However, this
created issues when trying to use C++ as a GPU target. This p
@@ -6405,7 +6424,12 @@ const ToolChain &Driver::getToolChain(const ArgList
&Args,
TC = std::make_unique(*this, Target, Args);
break;
case llvm::Triple::AMDHSA:
- TC = std::make_unique(*this, Target, Args);
+ TC =
+ llvm::any_of(Inputs,
+
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/112248
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/115777
>From 1e400acbd574703adcebd704c53991427815b090 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Tue, 12 Nov 2024 11:20:19 -0600
Subject: [PATCH] [Clang] Use TargetInfo when deciding if an address space is
comp
@@ -272,6 +277,13 @@ class RocmInstallationDetector {
return Loc->second;
}
+ void init(bool DetectHIPRuntime = true, bool DetectDeviceLib = false) {
jhuber6 wrote:
I don't understand why we need this. Isn't it fine just to let it fail? Let the
detect
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/115777
Author: Joseph Huber
Date: 2024-11-15T10:48:05-06:00
New Revision: 31ee667eb02c68ad186cb129f9dcb72a9dbc
URL:
https://github.com/llvm/llvm-project/commit/31ee667eb02c68ad186cb129f9dcb72a9dbc
DIFF:
https://github.com/llvm/llvm-project/commit/31ee667eb02c68ad186cb129f9dcb72a9dbc.diff
Author: Joseph Huber
Date: 2024-11-15T10:28:20-06:00
New Revision: 3eb1bc5edfc69895bfdc0a8ddd31af3969e6aacc
URL:
https://github.com/llvm/llvm-project/commit/3eb1bc5edfc69895bfdc0a8ddd31af3969e6aacc
DIFF:
https://github.com/llvm/llvm-project/commit/3eb1bc5edfc69895bfdc0a8ddd31af3969e6aacc.diff
@@ -31,6 +31,7 @@
#include "clang/Basic/PointerAuthOptions.h"
#include "clang/Basic/SourceLocation.h"
#include "clang/Basic/Specifiers.h"
+#include "clang/Basic/TargetInfo.h"
jhuber6 wrote:
Shouldn't be necessary, I can make a PR to remove it again.
https://g
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/116410
@@ -6440,7 +6440,8 @@ const ToolChain &Driver::getToolChain(const ArgList &Args,
TC = std::make_unique(*this, Target, Args);
break;
case llvm::Triple::AMDHSA:
- TC = std::make_unique(*this, Target, Args);
+ TC = std::make_unique(*this, Target, Args,
+
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/99687
>From de34ac1b42dda7adb63bfb13cfee40d41e2d7313 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 19 Jul 2024 14:07:18 -0500
Subject: [PATCH] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++
directly
S
https://github.com/jhuber6 approved this pull request.
LGTM once CI passes. Thanks Matt for making this less awful.
https://github.com/llvm/llvm-project/pull/114481
https://github.com/jhuber6 commented:
Does V6 have any nasty runtime changes associated with it? Or is it just
recognizing the generic targets?
https://github.com/llvm/llvm-project/pull/118515
jhuber6 wrote:
> AFAIK it has nothing fancy for runtime.
The `libc` CMake has a variable that sets it to 5, might be worth bumping that
up since it's actively tested. It's in the `prepare_libc_gpu_build.cmake`.
https://github.com/llvm/llvm-project/pull/118515
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/118661
Summary:
Currently, we only use `-mlink-builtin-bitcode` for non-LTO NVIDIA
compilations. This has the problem that it will internalize the RPC
client symbol which needs to be visible to the host. To counteract
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/118661
>From 9749143d3e919a9dbd442c5c3d87affe4b63c2ba Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 4 Dec 2024 10:10:11 -0600
Subject: [PATCH] [Clang] Prevent `mlink-builtin-bitcode` from internalizing
the RP
jhuber6 wrote:
> > clang attribute for externally_initialized
>
> I'm surprised there isn't one already. Also seems better if you're going to
> special case the symbol to special case it by just setting this rather than
> skipping internalize for it
I'm not sure I want it in the generic case
@@ -69,8 +69,8 @@ function(llvm_create_cross_target project_name target_name
toolchain buildtype)
"-DLLVM_EXTERNAL_${name}_SOURCE_DIR=${LLVM_EXTERNAL_${name}_SOURCE_DIR}")
endforeach()
- if("libc" IN_LIST LLVM_ENABLE_PROJECTS AND NOT LIBC_HDRGEN_EXE)
-set(lib
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/118674
Summary:
This is consistent with other intrinsic headers like the SSE/AVX
intrinsics. I don't think function names need to be specifically
reserved because we are not natively including this into any TUs. The
mai
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/118674
>From 7e28f1039a0443baea8bca7c994bb85429730674 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 4 Dec 2024 11:55:07 -0600
Subject: [PATCH] [Clang] Rename GPU intrinsic functions from `__gpu_` to
`_gpu_`
jhuber6 wrote:
I'm fine with this solution, but I'll sit on it for a bit to see if
@AaronBallman has an opinion or if the `clangd` people get back to me on
https://github.com/llvm/llvm-project/issues/118684.
https://github.com/llvm/llvm-project/pull/118674
jhuber6 wrote:
> > > Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?
> >
> >
> > Those both use the `AMDGPUToolChain`. I suppose you could make the argument
> > that `--target=amdgcn--` is the intended target for "standalone amd" stuff.
>
> Mostly I'm just wondering if it b
jhuber6 wrote:
> Does this change the behavior with amdgcn-- or amdgcn-mesa-mesa3d?
Those both use the `AMDGPUToolChain`. I suppose you could make the argument
that `--target=amdgcn--` is the intended target for "standalone amd" stuff.
https://github.com/llvm/llvm-project/pull/99687
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH 1/3] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We
jhuber6 wrote:
@hidekisaito Might be relevant to your patch.
https://github.com/llvm/llvm-project/pull/119091
@@ -288,18 +258,11 @@ function(compileDeviceRTLLibrary target_cpu target_name
target_triple)
endif()
endfunction()
-# Generate a Bitcode library for all the gpu architectures the user requested.
-add_custom_target(omptarget.devicertl.nvptx)
add_custom_target(omptarget.devi
jhuber6 wrote:
> First I will always consider NVVM reflect a giant hack. NVVM reflect cannot
> actually deal with the full range of wavesize issues. It is an incompatible
> ABI and the code should never be intermixed
It's a hack, but still better than whatever it is AMD does currently.
https:
jhuber6 wrote:
> This probably should retain separate wave32/wave64 builds. Additionally,
> should have extension points for subtarget specific implementations
That's what Shilei was talking about since we have `__nvvm_reflect` for that
for NVPTX.
https://github.com/llvm/llvm-project/pull/119
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From ccbbc8cd83415aa56fbc3726069776255bcbc918 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We prev
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/119091
Summary:
We previously built this for every single architecture to deal with
incompatibility. This patch updates it to use the 'generic' IR that
`libc` and other projects use. Who knows if this will have any
side
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH 1/2] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We prev
@@ -141,20 +109,21 @@ set(bc_flags -c -foffload-lto -std=c++17
-fvisibility=hidden
# first create an object target
add_library(omptarget.devicertl.all_objs OBJECT IMPORTED)
-function(compileDeviceRTLLibrary target_cpu target_name target_triple)
+function(compileDeviceRTLLibra
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH 1/4] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We
jhuber6 wrote:
@ronlieb This will probably conflict a lot with downstream, should probably
wait until I've talked it over with others before trying to merge it in AOMP.
https://github.com/llvm/llvm-project/pull/119091
@@ -74,49 +72,53 @@ static int32_t nvptx_parallel_reduce_nowait(void
*reduce_data,
uint32_t NumThreads = omp_get_num_threads();
if (NumThreads == 1)
return 1;
-/*
- * This reduce function handles reduction within a team. It handles
- * parallel regions in b
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH 1/2] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 4c710e49eea97e542b97e0b5e78b7915acd32383 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH 1/3] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We
jhuber6 wrote:
> I like this method, but just out of curiosity, did we use anything in AMDGPU
> implementation that has target dependent lowering in the front end? If not,
> this is totally fine I'd say.
We used to use the `__AMDGCN_WAVEFRONT_SIZE` but that was removed for unrelated
reasons.
jhuber6 wrote:
> `CodeGenHipStdPar/unsupported-builtins.cpp` is pretty interesting actually,
> it looks like it tests for some behavior in CodeGen that seems like it's
> trying to fix the exact same problem
>
> The other two tests seem to be actually unrelated breakages though.
Maybe @AlexVlx
https://github.com/jhuber6 commented:
Is it possible that we could just skip generating the builtin IDs at all for
the aux target? Or does that break something?
https://github.com/llvm/llvm-project/pull/121839
jhuber6 wrote:
Ah, I see how it is. And it probably worked in the NVPTX / AMDGCN case because
we had valid code paths. This is definitely something that should be fixed,
because it makes no sense to say a builtin is available if you can't even use
it. Though offloading languages do tend to hat
jhuber6 wrote:
You can compile and run static C applications, I'm not sure what the threshold
is before we can start adding support to make it usable in `clang`.
https://github.com/llvm/llvm-project/pull/121123
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/120145
jhuber6 wrote:
> > Sure, what's left for this to work? I'm probably going to be messing around
> > with the OpenMP 'DeviceRTL' more, likely killing off the 'fatbinary' and
> > just using the per-target runtime dir stuff. I'm going to assume this
> > wouldn't work well with SPIR-V since they do
@@ -537,7 +537,11 @@ AMDGPUTargetCodeGenInfo::getLLVMSyncScopeID(const
LangOptions &LangOpts,
break;
}
- if (Ordering != llvm::AtomicOrdering::SequentiallyConsistent) {
+ // OpenCL assumes by default that atomic scopes are per-address space for
+ // non-sequentially
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/120095
jhuber6 wrote:
> I'm trying to understand this. Is the function being changed a generic util
> called by multiple builtins, and this change is just to make `one-as`
> exclusive to the OpenCL variant of those builtins ? Can an identical builtin
> have different behavior depending on the input l
jhuber6 wrote:
I don't think it makes any sense for `__has_builtin` to return true when the
target does not in fact have the builtin. Most of the time this is used to
guard target-specific code, which will then be wrong if it's compiled on the
device. Realistically the solution that makes sens
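As a minimal sketch of the guarding pattern this comment refers to, the helper below uses a builtin only when `__has_builtin` reports it, and otherwise falls back to portable code. The name `countSetBits` and the fallback loop are illustrative assumptions, not code from the PR. The sketch only works if `__has_builtin` answers truthfully for the target actually being compiled, which is the concern raised above.

```cpp
#include <cassert>
#include <cstdint>

// Compilers without __has_builtin get a conservative "no" answer.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: counts the set bits of a 32-bit value.
inline int countSetBits(std::uint32_t V) {
#if __has_builtin(__builtin_popcount)
  // Fast path, taken only when the compiler claims the builtin is usable.
  return __builtin_popcount(V);
#else
  // Portable fallback: clear the lowest set bit until none remain.
  int N = 0;
  for (; V != 0; V &= V - 1)
    ++N;
  return N;
#endif
}

// Small usage check; both paths must agree on the result.
inline void example() { assert(countSetBits(0xF0u) == 4); }
```

If the query returned true for a builtin the device cannot lower, the fast path would be selected and the guarded code would fail to compile or miscompile there.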
jhuber6 wrote:
> > I don't think it makes any sense for `__has_builtin` to return true when
> > the target does not in-fact have the builtin. Most of the time this is used
> > to guard target specific code, which will then be wrong if it's compiled on
> > the device. Realistically the solution
jhuber6 wrote:
I discussed this with @t-tye, he said it's correct but wants @Pierre-vh to sign
off on it.
https://github.com/llvm/llvm-project/pull/120095
@@ -0,0 +1,31 @@
+// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
+// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
jhuber6 wrote:
W
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"],
"fno-convergent-functions">,
// Common offloading options
let Group = offload_Group in {
-def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>,
jhuber6 wrote:
Overloa
@@ -0,0 +1,31 @@
+// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
+// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
jhuber6 wrote:
I
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/125556
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"],
"fno-convergent-functions">,
// Common offloading options
let Group = offload_Group in {
-def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>,
jhuber6 wrote:
I don't
@@ -0,0 +1,31 @@
+// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
+// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
jhuber6 wrote:
W
jhuber6 wrote:
Contains two dependent commits, last one is the patch. Might need to have some
additional error handling to reject known broken architectures, also need to
correctly handle things like
`--offload-targets=spirv64-amd-amdhsa,amdgcn-amd-amdhsa` in the new driver.
https://github.co
jhuber6 wrote:
> Apologies, if I am missing it but I don't see a test emitting the diagnostic
> `err_drv_mix_offload` anywhere.
I'll add a test for it; there are also a few other things I need to tweak.
https://github.com/llvm/llvm-project/pull/125556
jhuber6 wrote:
> > Summary: Currently, `-Xarch_` is used to forward argument specially to
> > certain toolchains. Currently, this is only supported by the Darwin
> > toolchain. We want to be able to use this generically, and for offloading
> > too. This patch moves the handling out of the Darw
@@ -6601,6 +6573,72 @@ std::string Driver::GetClPchPath(Compilation &C,
StringRef BaseName) const {
return std::string(Output);
}
+const ToolChain &Driver::getOffloadToolChain(
+const llvm::opt::ArgList &Args, const Action::OffloadKind Kind,
+const llvm::Triple &Tar
@@ -0,0 +1,31 @@
+// RUN: %clang -x cuda %s -Xarch_nvptx64 -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
+// RUN: %clang -x hip %s -Xarch_amdgcn -O3 -S -nogpulib -nogpuinc -### 2>&1 | FileCheck -check-prefix=O3ONCE %s
jhuber6 wrote:
A
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"],
"fno-convergent-functions">,
// Common offloading options
let Group = offload_Group in {
-def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>,
jhuber6 wrote:
I defin
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/125421
@@ -1697,19 +1721,17 @@ llvm::opt::DerivedArgList
*ToolChain::TranslateXarchArgs(
} else if (A->getOption().matches(options::OPT_Xarch_host)) {
NeedTrans = !IsDevice;
Skip = IsDevice;
-} else if (A->getOption().matches(options::OPT_Xarch__) && IsDevice) {
-
jhuber6 wrote:
Here's something fun: `-O0` and `-O3` are accepted by `-Xarch`, but `-O1` and
`-O2` are rejected.
https://github.com/llvm/llvm-project/pull/125421
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"],
"fno-convergent-functions">,
// Common offloading options
let Group = offload_Group in {
-def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>,
jhuber6 wrote:
That's
jhuber6 wrote:
That's fun, well for now I've updated it to not hit that bug and also accept
`-Xarch_sm_52 --offload-arch=sm_52` even though it's stupid.
https://github.com/llvm/llvm-project/pull/125421
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/125421
>From 79000d0a1ecd1312fb9bc06af0369b66a133e5d4 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 2 Feb 2025 10:39:01 -0600
Subject: [PATCH] [Clang] Make `-Xarch_` handling generic for all toolchains
Summar
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/125421
>From 03852104d5945e0e92c97b68e993bd699b275ab5 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sun, 2 Feb 2025 10:39:01 -0600
Subject: [PATCH] [Clang] Make `-Xarch_` handling generic for all toolchains
Summar
https://github.com/jhuber6 requested changes to this pull request.
I don't understand the motivation for this. It is intentionally passing `-flto`
because this is *not* an offloading compile job. This flag is mostly here for
convenience, but it's not strictly necessary because the Nvlink job do
@@ -498,12 +498,16 @@ Expected clang(ArrayRef InputFiles,
const ArgList &Args) {
};
// Forward all of the `--offload-opt` and similar options to the device.
- CmdArgs.push_back("-flto");
for (auto &Arg : Args.filtered(OPT_offload_opt_eq_minus, OPT_mllvm))
CmdArgs
https://github.com/jhuber6 commented:
You can use `-flto` for the NVPTX target, https://godbolt.org/z/MYf6cc6e3.
https://github.com/llvm/llvm-project/pull/125243
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/125243
@@ -1115,14 +1117,13 @@ def fno_convergent_functions : Flag<["-"],
"fno-convergent-functions">,
// Common offloading options
let Group = offload_Group in {
-def offload_arch_EQ : Joined<["--"], "offload-arch=">, Flags<[NoXarchOption]>,
jhuber6 wrote:
using `
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/125556
Summary:
This is a new option that tries to make selecting offloading toolchains
more generic. Currently we infer the toolchain from a combination of the
kind and the `--offload-arch=` option. Doing this becomes
@@ -71,10 +71,10 @@ llvm::opt::DerivedArgList
*AMDGPUOpenMPToolChain::TranslateArgs(
const OptTable &Opts = getDriver().getOpts();
- for (Arg *A : Args) {
-if (!llvm::is_contained(*DAL, A))
+ for (Arg *A : Args)
+if (!shouldSkipSanitizeOption(*this, Args, BoundAr
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/124754