@@ -735,11 +736,15 @@ wrapDeviceImages(ArrayRef>
Buffers,
}
Expected>>
-bundleOpenMP(ArrayRef Images) {
+bundleOpenMP(SmallVectorImpl &Images) {
jhuber6 wrote:
Why does the container need to be mutable now?
https://github.com/llvm/llvm-project/pull/120145
@@ -537,7 +537,11 @@ AMDGPUTargetCodeGenInfo::getLLVMSyncScopeID(const
LangOptions &LangOpts,
break;
}
- if (Ordering != llvm::AtomicOrdering::SequentiallyConsistent) {
+ // OpenCL assumes by default that atomic scopes are per-address space for
+ // non-sequentially
jhuber6 wrote:
> As there is no production quality SPIR-V linker available, manually create an
> ELF binary containing the offloading image in a way that fits into the
> existing `liboffload` plugin infrastructure. This ELF will eventually be
> passed to a runtime plugin that interacts with th
@@ -337,9 +337,12 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions
&Opts,
if (hasFastFMA())
Builder.defineMacro("FP_FAST_FMA");
- Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize));
- // ToDo: deprecate this macro for naming consistency
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/112849
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jhuber6 commented:
I'm not sure, those enums might evaluate to zero but it makes it clearer and
correct if they ever change. Realistically, the chance of that happening is
pretty much zero, but still. I don't see this warning show up when I check with
my LSP (though I do se
@@ -337,9 +337,12 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions
&Opts,
if (hasFastFMA())
Builder.defineMacro("FP_FAST_FMA");
- Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize));
- // ToDo: deprecate this macro for naming consistency
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/119735
Summary:
`clangd` intentionally suppresses indexing symbols from system headers
as these are likely implementation details the user does not want.
However, there are plenty of system headers that provide extension
@@ -1270,77 +1270,21 @@ exit:
; MODULE: attributes #[[ATTR1:[0-9]+]] = { convergent nocallback nounwind }
; MODULE: attributes #[[ATTR2:[0-9]+]] = { convergent nocallback nofree
nounwind willreturn }
; MODULE: attributes #[[ATTR3:[0-9]+]] = { nocallback nofree nosync nounwind
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119735
>From 04757e7d94ce5db11bb397accb0b1c0523d351ba Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Thu, 12 Dec 2024 12:15:32 -0600
Subject: [PATCH 1/2] [clangd] Index reserved symbols from `*intrin.h` system
head
@@ -1270,77 +1270,21 @@ exit:
; MODULE: attributes #[[ATTR1:[0-9]+]] = { convergent nocallback nounwind }
; MODULE: attributes #[[ATTR2:[0-9]+]] = { convergent nocallback nofree
nounwind willreturn }
; MODULE: attributes #[[ATTR3:[0-9]+]] = { nocallback nofree nosync nounwind
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/119261
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119735
>From 04757e7d94ce5db11bb397accb0b1c0523d351ba Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Thu, 12 Dec 2024 12:15:32 -0600
Subject: [PATCH 1/2] [clangd] Index reserved symbols from `*intrin.h` system
head
https://github.com/jhuber6 commented:
Should we have an entry for `-fdefine-target-os-macros` as well?
https://github.com/llvm/llvm-project/pull/120632
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/120632
jhuber6 wrote:
Also needs a test
https://github.com/llvm/llvm-project/pull/120632
jhuber6 wrote:
> > Should we have an entry for `-fdefine-target-os-macros` as well?
>
> I think so, it wouldn't be a bad idea.
>
> > Also needs a test
>
> I might need some guidance on adding a test since I'm not fully familiar with
> the clang side of LLVM.
Check `./clang/test/Preprocessor
@@ -820,43 +820,6 @@ class LLVM_LIBRARY_VISIBILITY X86_64TargetInfo : public
X86TargetInfo {
}
};
-// x86-64 UEFI target
-class LLVM_LIBRARY_VISIBILITY UEFIX86_64TargetInfo
-: public UEFITargetInfo {
-public:
- UEFIX86_64TargetInfo(const llvm::Triple &Triple, const Tar
https://github.com/jhuber6 approved this pull request.
Seems reasonable if it passes all the tests.
https://github.com/llvm/llvm-project/pull/120632
@@ -788,16 +789,28 @@ class LLVM_LIBRARY_VISIBILITY ZOSTargetInfo : public
OSTargetInfo {
// UEFI target
template
class LLVM_LIBRARY_VISIBILITY UEFITargetInfo : public OSTargetInfo {
+ llvm::Triple Triple;
protected:
void getOSDefines(const LangOptions &Opts, const llvm:
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/120632
https://github.com/jhuber6 commented:
Should probably add something to `TargetOSMacros.def` while we're at it.
https://github.com/llvm/llvm-project/pull/120632
@@ -165,6 +165,9 @@ std::unique_ptr AllocateTarget(const
llvm::Triple &Triple,
case llvm::Triple::OpenBSD:
return std::make_unique>(Triple,
Opts);
+case llvm::Triple::UEFI:
+ return std::m
@@ -820,43 +820,6 @@ class LLVM_LIBRARY_VISIBILITY X86_64TargetInfo : public
X86TargetInfo {
}
};
-// x86-64 UEFI target
-class LLVM_LIBRARY_VISIBILITY UEFIX86_64TargetInfo
-: public UEFITargetInfo {
-public:
- UEFIX86_64TargetInfo(const llvm::Triple &Triple, const Tar
@@ -790,7 +790,9 @@ template
class LLVM_LIBRARY_VISIBILITY UEFITargetInfo : public OSTargetInfo {
protected:
void getOSDefines(const LangOptions &Opts, const llvm::Triple &Triple,
-MacroBuilder &Builder) const override {}
+MacroBuilder
@@ -165,6 +165,9 @@ std::unique_ptr AllocateTarget(const
llvm::Triple &Triple,
case llvm::Triple::OpenBSD:
return std::make_unique>(Triple,
Opts);
+case llvm::Triple::UEFI:
+ return std::m
jhuber6 wrote:
Didn't know this existed, but we have
https://github.com/llvm/llvm-project/pull/120632 tracking it now.
https://github.com/llvm/llvm-project/pull/111719
@@ -35,6 +35,24 @@ UEFI::UEFI(const Driver &D, const llvm::Triple &Triple,
const ArgList &Args)
Tool *UEFI::buildLinker() const { return new tools::uefi::Linker(*this); }
+void UEFI::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
+ A
@@ -788,16 +789,28 @@ class LLVM_LIBRARY_VISIBILITY ZOSTargetInfo : public
OSTargetInfo {
// UEFI target
template
class LLVM_LIBRARY_VISIBILITY UEFITargetInfo : public OSTargetInfo {
+ llvm::Triple Triple;
+
protected:
void getOSDefines(const LangOptions &Opts, const llv
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/120632
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/117171
Summary:
Previous changes relaxed the address space rules based on what the
target says about them. This accidentally included the AS(2) region as
convertible to generic. Simply check for AS(2) and reject it.
>
@@ -6405,7 +6424,12 @@ const ToolChain &Driver::getToolChain(const ArgList
&Args,
TC = std::make_unique(*this, Target, Args);
break;
case llvm::Triple::AMDHSA:
- TC = std::make_unique(*this, Target, Args);
+ TC =
+ llvm::any_of(Inputs,
+
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/117171
jhuber6 wrote:
> Should also forbid the buffers (probably should forbid using the buffers at
> all)
I think the logic currently allows 0, 1, 3, 4, and 5.
https://github.com/llvm/llvm-project/pull/117171
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC,
IntrinsicInst &II) const {
}
break;
}
+ case Intrinsic::amdgcn_wavefrontsize: {
+// TODO: this is a workaround for the pseudo-generic target one gets with
no
+// specified mcpu, which
jhuber6 wrote:
Okay, so it definitely uses the device side, however the ROCmToolChain does not
take a host ToolChain. I'm wondering why the CUDA toolchain doesn't have this
issue, it checks the HostTriple for Windows just fine.
https://github.com/llvm/llvm-project/pull/113628
jhuber6 wrote:
> > Is there an issue with simply using the `HostTC` for everything? I feel
> > like that's the solution to this mess, since the `HostTC` would always know
> > whether or not the target is Windows without us needing to forward a bunch
> > of stuff.
>
> Yes, that would work too.
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC,
IntrinsicInst &II) const {
}
break;
}
+ case Intrinsic::amdgcn_wavefrontsize: {
+// TODO: this is a workaround for the pseudo-generic target one gets with
no
+// specified mcpu, which
https://github.com/jhuber6 edited
https://github.com/llvm/llvm-project/pull/114481
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC,
IntrinsicInst &II) const {
}
break;
}
+ case Intrinsic::amdgcn_wavefrontsize: {
+// TODO: this is a workaround for the pseudo-generic target one gets with
no
+// specified mcpu, which
jhuber6 wrote:
It's also tough to say it's legal to drop the `amdhsa` in this case, because we
rely on builtins that refer to things like implicit arguments, which are from
HSA. Otherwise I would probably be fine just calling it `amdgcn`.
https://github.com/llvm/llvm-project/pull/99687
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/119091
>From 767d34a0469aa67c2c47a35bc9bff29d20ae1222 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Sat, 7 Dec 2024 13:47:23 -0600
Subject: [PATCH] [OpenMP] Use generic IR for the OpenMP DeviceRTL
Summary:
We prev
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/122149
>From 3329b7ae7dc6044f6563f218c65f6af7498290f0 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Wed, 8 Jan 2025 12:19:53 -0600
Subject: [PATCH] [OpenMP] Allow GPUs to be targeted directly via `-fopenmp`.
Summa
jhuber6 wrote:
> I don't think it should be GPU code generation path as there is no explicit
> `target` region used.
It needs to be, otherwise the code generation for things like `#pragma omp
parallel` will be wrong. The way I see it, the DeviceRTL is `libomp.a` for the
GPU target, so we need
jhuber6 wrote:
> ~I don't think it should be GPU code generation path as there is no explicit
> `target` region used.~ Probably I missed something here. Do you expect
> regular OpenMP stuff such as `parallel` region to be emitted in the same way
> as offloading code?
Yes, the example in the d
jhuber6 wrote:
> I think that is a misuse of OpenMP semantics. We can't expect to have regular
> OpenMP code working in the same way as OpenMP offloading code when targeting
> a GPU meanwhile the code is not wrapped into `target` region or declare
> target. I understand to have variants and de
@@ -0,0 +1,12 @@
+// REQUIRES: spirv-registered-target
+// REQUIRES: x86-registered-target
+
jhuber6 wrote:
These shouldn't be necessary, we're just emitting IR.
https://github.com/llvm/llvm-project/pull/121839
@@ -0,0 +1,12 @@
+// REQUIRES: spirv-registered-target
+// REQUIRES: x86-registered-target
+
+// RUN: %clang_cc1 -fopenmp -triple=spirv64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck
-implicit-check-not=BAD %s
jhuber6 wr
@@ -1818,8 +1819,21 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
// usual allocation and deallocation functions. Required by libc++
return 201802;
default:
+// We may get here because of aux builtins which may not be
+
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/122149
Summary:
Currently we prevent the following from working. However, it is
completely reasonable to be able to target the files individually.
```
$ clang --target=amdgcn-amd-amdhsa -fopenmp
```
This patch lifts th
jhuber6 wrote:
> Maybe just turn on OpenMPIsTargetDevice if `gpu target + -fopenmp` is
> specified?
I'll give it a try.
https://github.com/llvm/llvm-project/pull/122149
jhuber6 wrote:
> What code generation path would be used in this case? The GPU code generation
> or regular host OpenMP?
The GPU path, I'm treating that as the code generation path that created
correct runtime code for the GPU. I.e. you can link it with your OpenMP
offloading program and it'l
@@ -1818,8 +1819,21 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
// usual allocation and deallocation functions. Required by libc++
return 201802;
default:
+// We may get here because of aux builtins which may not be
+
jhuber6 wrote:
> I am afraid this will break all existing CUDA/HIP programs since they expect
> to be able to parse the builtins for both host and device targets.
>
> In the spirit of single source, the compiler sees the entire code for all
> targets, including host target and all device targe
jhuber6 wrote:
> This is/was my concern. However, upon thinking further, as long as we
> RECOGNIZE/Parse/etc the builtins, it is OK I think if we report
> "!__has_builtin".
>
> That is:
>
> ```
> void foo() {
> #if __has_builtin(__builtin_x86_thing)
> __builtin_x86_thing();
> #else
> __builti
jhuber6 wrote:
> Maybe just turn on OpenMPIsTargetDevice if `gpu target + -fopenmp` is
> specified?
Doesn't work, it causes all definitions to be stripped as they are not declared
on the device, which is not what we want.
https://github.com/llvm/llvm-project/pull/122149
https://github.com/jhuber6 approved this pull request.
Maybe one day we'll be able to get rid of the bundler.
https://github.com/llvm/llvm-project/pull/122627
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/123437
>From 4414706b8ced9048a572fb78544a7e637c4946a0 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 17 Jan 2025 19:56:18 -0600
Subject: [PATCH 1/3] [HIP] Support managed variables using the new driver
Summary
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/123437
@@ -55,29 +68,30 @@ enum OffloadEntryKindFlag : uint32_t {
/// globals that will be registered with the offloading runtime.
StructType *getEntryTy(Module &M);
-/// Returns the struct type we store the two pointers for CUDA / HIP managed
-/// variables in. Necessary until we wi
@@ -160,54 +160,30 @@
// CHECK-NTARGET-NOT: private unnamed_addr constant [1 x i
// CHECK-DAG: [[NAMEPTR1:@.+]] = internal unnamed_addr constant [{{.*}} x i8]
c"[[NAME1:__omp_offloading_[0-9a-f]+_[0-9a-f]+__Z.+_l[0-9]+]]\00"
-// CHECK-DAG: [[ENTRY1:@.+]] = weak{{.*}} constant
@@ -1818,8 +1819,21 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
// usual allocation and deallocation functions. Required by libc++
return 201802;
default:
+// We may get here because of aux builtins which may not be
+
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/99687
>From a6df42bc9a7a9f3779f598f77db45e5334910bc2 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 24 Jan 2025 16:15:57 -0600
Subject: [PATCH] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++
directly
S
jhuber6 wrote:
Updated this to be simpler. This logic is *only* run when compiled from `clang
--target=amdgcn-amd-amdhsa`, which means it will only show up for OpenCL and
direct C++ targeting. HIP uses a different method to get the Toolchain. For
regular C++ I do not want to depend on the ROCm
jhuber6 wrote:
Maybe some day we can just make this the default if we ever provide an in-tree
way to handle CUDA.
https://github.com/llvm/llvm-project/pull/124116
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/124116
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/124116
Summary:
We pass the `-nvptx-lower-global-ctor-dtor` option to support the `libc`
like use-case which needs global constructors sometimes. This only
affects the backend. If the NVPTX target is not enabled this op
jhuber6 wrote:
Fixed the managed stuff in https://github.com/llvm/llvm-project/pull/123437,
should be good enough for the foreseeable future.
https://github.com/llvm/llvm-project/pull/123359
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/123437
>From bed6550941c0fafe2975288e49957a5a36895cf2 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 17 Jan 2025 19:56:18 -0600
Subject: [PATCH 1/2] [HIP] Support managed variables using the new driver
Summary
jhuber6 wrote:
I somehow totally forgot that I implemented surfaces and textures, it was
`managed` that I was having problems with.
https://github.com/llvm/llvm-project/pull/123359
jhuber6 wrote:
> > I'd like to avoid modifying this struct to avoid breaking ABI with OpenMP.
> > Perhaps I should make `addr` a pointer to a struct and just GEP the two
> > values.
>
> Are we trying to jam a square HIP peg into a round OpenMP hole, or are we
> truly wanting to move to a lang
jhuber6 wrote:
@yxsamliu Here's the general problem: the `__hipRegisterManagedVar` call takes
the following arguments.
```c
__hipRegisterManagedVar(void ** handle, char *ManagedVarPtr, char *VarPtr,
const char *VarName, size_t Size, unsigned Alignment)
```
But the struct that we store this in
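One way to read the indirection proposed in this thread is to keep the entry's single address field and let it point at a small struct holding the two pointers that `__hipRegisterManagedVar` needs. A minimal sketch under that assumption follows; `OffloadEntry`, `ManagedVarPair`, and `unpackManaged` are hypothetical names, and the field layout is illustrative rather than the real ABI:

```cpp
#include <cstddef>
#include <utility>

// Hypothetical entry with a single address slot (illustrative layout only).
struct OffloadEntry {
  void *Addr;
  const char *Name;
  std::size_t Size;
};

// Hypothetical pair stored out of line so the entry layout stays unchanged.
struct ManagedVarPair {
  void *ManagedVarPtr;
  void *VarPtr;
};

// Recovers both pointers from an entry whose Addr refers to a ManagedVarPair.
inline std::pair<void *, void *> unpackManaged(const OffloadEntry &E) {
  const auto *Pair = static_cast<const ManagedVarPair *>(E.Addr);
  return {Pair->ManagedVarPtr, Pair->VarPtr};
}
```

The trade-off is one extra load at registration time in exchange for leaving the entry's size and field order alone.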
jhuber6 wrote:
> > For now, it's probably easier just to put some indirection in this and pass
> > a struct to the first argument.
>
> It would be even easier to not make the move. Does "addr" have different
> meanings in each context where an offload_entry is used? Is this going to
> confuse
jhuber6 wrote:
> > > For now, it's probably easier just to put some indirection in this and
> > > pass a struct to the first argument.
> >
> >
> > It would be even easier to not make the move. Does "addr" have different
> > meanings in each context where an offload_entry is used? Is this goin
https://github.com/jhuber6 created
https://github.com/llvm/llvm-project/pull/123437
Summary:
Previously, managed variables didn't work in rdc mode using the new
driver because we just didn't register them. This was previously ignored
because we didn't have enough space in the current struct form
jhuber6 wrote:
> currently HIP runtime loads fat binary by itself for non-compressed fat
> binary. It needs to switch to use comgr for loading fat binary. Some math
> libs compile assembly code to code objects then bundle them to fat binary.
> Such math libs need to switch to use tools for pac
jhuber6 wrote:
Wonder if I should try really hard to update the struct we use for this before
the branch in 8 days, since if it's default in CUDA now it'd break ABI for a
release.
https://github.com/llvm/llvm-project/pull/123437
https://github.com/jhuber6 approved this pull request.
Seems fine. What's required to move HIP over to using the binary format
natively by the way? Guessing we'd need to update a bunch of tools in the fork.
https://github.com/llvm/llvm-project/pull/122307
jhuber6 wrote:
> > Seems fine. What's required to move HIP over to using the binary format
> > natively by the way? Guessing we'd need to update a bunch of tools in the
> > fork.
>
> need to teach comgr to load the new offload binary
Should be easy enough, since there's a library function for
jhuber6 wrote:
> This breaks CUDA compilation on ARM, because `__has_builtin()` now returns
> false for the host-side builtins and that causes some clang headers on ARM to
> try defining their own replacement for the builtin they consider to be
> missing, but which is actually still there: htt
https://github.com/jhuber6 approved this pull request.
Honestly I'm wondering if this can only be solved with another preprocessor
macro that's like "Can this be used, no I mean it." just for these weird
offloading language cases.
https://github.com/llvm/llvm-project/pull/124626
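The header pattern at issue in this thread can be sketched as follows: a header queries `__has_builtin` and supplies its own replacement when the query fails, so a builtin that still parses but reports false ends up with a competing fallback. `popcount32` is a hypothetical helper, and `__builtin_popcount` is only a widely available stand-in for the target-specific builtins under discussion:

```cpp
#include <cstdint>

// Compilers without __has_builtin treat every query as "not available".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Hypothetical helper: uses the builtin when the compiler reports it,
// otherwise falls back to a portable bit-counting loop.
inline int popcount32(std::uint32_t V) {
#if __has_builtin(__builtin_popcount)
  return __builtin_popcount(V);
#else
  int Count = 0;
  for (; V != 0; V &= V - 1) // Clears the lowest set bit each iteration.
    ++Count;
  return Count;
#endif
}
```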
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/84420
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/99687
jhuber6 wrote:
> This looks like some changes worth in the release note?
Done.
https://github.com/llvm/llvm-project/pull/124018
@@ -24,11 +24,24 @@ namespace offloading {
/// This is the record of an object that just be registered with the offloading
/// runtime.
struct EntryTy {
+ /// Reserved bytes used to detect an older version of the struct, always
zero.
+ uint64_t Reserved = 0x0;
--
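The `Reserved` field in the hunk above is described as a way to detect an older version of the struct. A minimal sketch of how such detection could work, assuming (the thread does not confirm this) that the legacy layout began with a non-null pointer, so its first eight bytes are never all zero. `LegacyEntry`, `VersionedEntry`, and `looksLikeNewEntry` are hypothetical names for illustration, not LLVM's definitions:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical legacy layout (assumption): starts with a non-null address.
struct LegacyEntry {
  void *Addr;
  const char *Name;
  uint64_t Size;
  int32_t Flags;
  int32_t Data;
};

// Hypothetical new layout: eight leading bytes that are always zero.
struct VersionedEntry {
  uint64_t Reserved = 0x0;
  uint16_t Version = 1;
  void *Addr = nullptr;
  const char *Name = nullptr;
  uint64_t Size = 0;
};

// Reads the first eight bytes of a raw entry and reports whether they are all
// zero, which under the assumption above means the entry uses the new layout.
inline bool looksLikeNewEntry(const void *Raw) {
  uint64_t Head = 0;
  std::memcpy(&Head, Raw, sizeof(Head));
  return Head == 0;
}
```

A runtime could branch on this check before interpreting the remaining fields, though whether the real runtime does so is not stated here.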
jhuber6 wrote:
> It is six months old, but it is relatively difficult to keep track of all the
> concurrent non-communicating swimlanes - I do apologise for not having done
> so though! It is possible that we are microfocusing on a potential use case,
> but there are other valid ones. For exam
https://github.com/jhuber6 commented:
Broad question, should we instead aim for having `-fsanitize` work for both
sides? I've always been very much against all these `*-gpu` bloat flags when we
could just do `-Xarch_host -fsanitize` or `-Xarch_device -fsanitize`. The word
against that it usual
https://github.com/jhuber6 closed
https://github.com/llvm/llvm-project/pull/124018
jhuber6 wrote:
> I'm not sure I follow the question exactly, but in my opinion it does not
> make sense to independently control instrumentation for host and device. We
> had to implement a means to control instrumentation on the device when
> development started and after we changed its defau
Author: Joseph Huber
Date: 2025-01-28T12:07:02-06:00
New Revision: 17d1523207c6d5fb6b1b47ccf0406a0bb58cb38d
URL:
https://github.com/llvm/llvm-project/commit/17d1523207c6d5fb6b1b47ccf0406a0bb58cb38d
DIFF:
https://github.com/llvm/llvm-project/commit/17d1523207c6d5fb6b1b47ccf0406a0bb58cb38d.diff
@@ -37,6 +37,16 @@ AMDGPUOpenMPToolChain::AMDGPUOpenMPToolChain(const Driver &D,
// Lookup binaries into the driver directory, this is used to
// discover the 'amdgpu-arch' executable.
getProgramPaths().push_back(getDriver().Dir);
+ // Diagnose unsupported sanitizer opti
jhuber6 wrote:
I'm currently working on redoing the offloading entry format. Can this land as
an interim solution so I don't need to redo the work there?
https://github.com/llvm/llvm-project/pull/123437
@@ -353,6 +353,16 @@ Function *createRegisterGlobalsFunction(Module &M, bool
IsHIP,
FunctionCallee RegVar = M.getOrInsertFunction(
IsHIP ? "__hipRegisterVar" : "__cudaRegisterVar", RegVarTy);
+ // Get the __cudaRegisterSurface function declaration.
j
https://github.com/jhuber6 updated
https://github.com/llvm/llvm-project/pull/123437
>From bed6550941c0fafe2975288e49957a5a36895cf2 Mon Sep 17 00:00:00 2001
From: Joseph Huber
Date: Fri, 17 Jan 2025 19:56:18 -0600
Subject: [PATCH 1/3] [HIP] Support managed variables using the new driver
Summary
https://github.com/jhuber6 commented:
I thought the new driver just got some entropy from the current source file's
`inode` entry? I'm not opposed, since I guess it would make sense to give the
user a way to override it.
https://github.com/llvm/llvm-project/pull/122859
jhuber6 wrote:
There's also the question of what to do when this is done from CUDA; perhaps
we should make this match the aux triple if present.
https://github.com/llvm/llvm-project/pull/115248