[clang] [Driver][LTO] Move common code for LTO to addLTOOptions() (PR #74178)

2025-05-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Looks like a nice cleanup https://github.com/llvm/llvm-project/pull/74178 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-22 Thread Joseph Huber via cfe-commits
@@ -11762,52 +11762,98 @@ bool OpenMPAtomicCompareChecker::checkCondUpdateStmt(IfStmt *S, X = BO->getLHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { -ErrorInfo.Error = ErrorTy::NotABinaryOp; -ErrorInfo.ErrorLoc = ErrorInfo.NoteLoc = S->getCond()->get

[clang] [OpenMP] Fix atomic compare handling with overloaded operators (PR #141142)

2025-05-22 Thread Joseph Huber via cfe-commits
@@ -11762,52 +11762,98 @@ bool OpenMPAtomicCompareChecker::checkCondUpdateStmt(IfStmt *S, X = BO->getLHS(); - auto *Cond = dyn_cast(S->getCond()); - if (!Cond) { -ErrorInfo.Error = ErrorTy::NotABinaryOp; -ErrorInfo.ErrorLoc = ErrorInfo.NoteLoc = S->getCond()->get

[clang] [XRay] Fix argument parsing with offloading (#140748) (PR #141043)

2025-05-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/141043

[clang] [XRay] Fix argument parsing with offloading (#140748) (PR #141043)

2025-05-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. We'll now be creating the XRayArgs class when we do this every time, but I don't think it's expensive enough or done enough times to be an issue. Thanks. https://github.com/llvm/llvm-project/pull/141043

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > but there is other comgr user expecting comgr to have -nogpuinc by default. > changing that will cause regressions. If `comgr` can have custom flags then you could just pass the 'do not pass `-nogpuinc` by default' flag presumably. https://github.com/llvm/llvm-project/pull/14

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > > > Hmm, in what cases is `-nogpuinc` added when we don't actually want it? > > > > I think we should avoid adding `-nogpuinc` if it's not needed, if > > > > possible. > > > > > > > > > comgr is the JIT compiler for HIP on ROCm. comgr uses -nogpuinc by > > > default. Howev

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Hmm, in what cases is `-nogpuinc` added when we don't actually want it? I > > think we should avoid adding `-nogpuinc` if it's not needed, if possible. > > comgr is the JIT compiler for HIP on ROCm. comgr uses -nogpuinc by default. > However, some users of comgr need to over

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > It's just the AMDGCN target without any `+features`, right? The only issue > > I was aware of was assuming w64 when unspecified but you fixed that > > previously. > > Almost, but it's problematic in several ways. The problems multiply once you > start adding in manually spe

[clang] [HIP] change default offload archs (PR #139281)

2025-05-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > The main obstacle of letting clang emit error when `--offload-arch` is not > > specified is HIP apps using hipcc as CMAKE_CXX_COMPILER. hipcc adds -xhip > > by default for .cpp programs. This is a known and long existing issue. > > Another option is to have multiple `--offloa

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Right now this checks for `libc++` less than 14. Is that still relevant following that change? https://github.com/llvm/llvm-project/pull/139164

[clang] [NFC][Clang][CodeGen] Remove vestigial assertion (PR #127528)

2025-05-10 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > @jhuber6 Was the follow-up for this backported too? I don't remember, sorry. I think the whole thing got reverted or something? https://github.com/llvm/llvm-project/pull/127528

[clang] [OpenMP] Fix crash when diagnosing dist_schedule (PR #139277)

2025-05-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/139277

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > @jhuber6 do you think can we use `native` instead? I think it would be a > somewhat better option here. If we have to choose a GPU variant by default, > we may as well choose the actual GPU, rather than a conditional choice > between generic SPIR-V or an old GPU, which has the

[clang] [Clang][SYCL] Add AOT compilation support for Intel GPUs in clang-sycl-linker (PR #133194)

2025-05-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Seems like a lot of the linker wrapper utilities copied and applied to Intel binaries, harmless enough. I'm wondering though, is there a reason we can't just use the backend right now? What do these tools do that running `llc` can't. http

[clang] [Clang][Driver] Only enable internalization for OpenMP target offloading with ThinLTO on AMDGPU (PR #138547)

2025-05-05 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138547

[clang] [clang-linker-wrapper] Remove unused local variables (NFC) (PR #138480)

2025-05-04 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138480

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I don't think OpenMP is more special than HIP here. Anything exposed to the > host should not be internalized. In addition, OpenMP actually also heavily > uses internalization as well in OpenMPOpt. It is likely that this change > exposes something bad in the downstream. > > T

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
@@ -9284,6 +9284,12 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(Args.MakeArgString( "--device-linker=" + TC->getTripleString() + "=" + Arg)); + // Enable internalization for AMDGPU. + if (TC->getTrip

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-03 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > also seeing "PluginInterface" error: Failure to look up global address: Error > in hsa_executable_get_symbol_by_name(grid_points): > HSA_STATUS_ERROR_INVALID_SYMBOL_NAME: There is no symbol with the given name. > omptarget error: Failed to load symbol grid_points Yeah, this i

[clang] [Clang][Driver] Enable internalization by default for AMDGPU (PR #138365)

2025-05-02 Thread Joseph Huber via cfe-commits
@@ -9284,6 +9284,12 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(Args.MakeArgString( "--device-linker=" + TC->getTripleString() + "=" + Arg)); + // Enable internalization for AMDGPU. + if (TC->getTrip

[clang] [Clang][SYCL] Add initial set of Intel OffloadArch values (PR #138158)

2025-05-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/138158

[clang] [Clang][SYCL] Add initial set of Intel OffloadArch values (PR #138158)

2025-05-01 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/138158

[clang] 0a1dde1 - [Clang] Fix GPU match any truncating 64-bit lane mask

2025-04-30 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2025-04-30T16:25:28-05:00 New Revision: 0a1dde1d7957531701ba56e357276033a927f496 URL: https://github.com/llvm/llvm-project/commit/0a1dde1d7957531701ba56e357276033a927f496 DIFF: https://github.com/llvm/llvm-project/commit/0a1dde1d7957531701ba56e357276033a927f496.diff

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -3979,6 +3979,16 @@ def fsyntax_only : Flag<["-"], "fsyntax-only">, Visibility<[ClangOption, CLOption, DXCOption, CC1Option, FC1Option, FlangOption]>, Group, HelpText<"Run the preprocessor, parser and semantic analysis stages">; + + +def fno_builtin_modules : Flag<["-

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -157,6 +157,9 @@ class ToolChain { /// The list of toolchain specific path prefixes to search for programs. path_list ProgramPaths; +path_list ModulePaths; +path_list IntrinsicModulePaths; jhuber6 wrote: Format. https://github.com/llvm/llv

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/137828

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -299,6 +310,18 @@ elseif (FLANG_RT_GCC_RESOURCE_DIR) endif () endif () + + +if (CMAKE_C_BYTE_ORDER STREQUAL "BIG_ENDIAN") jhuber6 wrote: I was hoping I got rid of needing to detect endianness in CMake, since it makes cross-compiling a pain. Not eager to

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: What's the main limitation here? If this is just a file dependency it should be identical to how all the OpenMP tests depend on `omp.h` being in the resource directory. IMHO this is trivial if we do a runtimes build, since we can just require that `openmp;

[clang] [flang] [llvm] [openmp] [Flang][OpenMP] Move builtin .mod generation into runtimes (PR #137828)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -102,6 +102,10 @@ ToolChain::ToolChain(const Driver &D, const llvm::Triple &T, getFilePaths().push_back(*Path); for (const auto &Path : getArchSpecificLibPaths()) addIfExists(getFilePaths(), Path); + + if (D.IsFlangMode()) { +getIntrinsicModulePaths().append(

[clang] [llvm] [AMDGPU][clang][CodeGen][opt] Add late-resolved feature identifying predicates (PR #134016)

2025-04-29 Thread Joseph Huber via cfe-commits
@@ -29,6 +29,8 @@ MODULE_PASS("amdgpu-printf-runtime-binding", AMDGPUPrintfRuntimeBindingPass()) MODULE_PASS("amdgpu-remove-incompatible-functions", AMDGPURemoveIncompatibleFunctionsPass(*this)) MODULE_PASS("amdgpu-sw-lower-lds", AMDGPUSwLowerLDSPass(*this)) MODULE_PASS("amdg

[clang] [Clang] Disable RTTI for offloading at the frontend level (PR #127082)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/127082 >From b17f35541bb5de23389afe0af61cda2cac749e81 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 13 Feb 2025 09:27:24 -0600 Subject: [PATCH] [Clang] Disable RTTI for offloading at the frontend level Summar

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/137070

[clang] [llvm] [OpenMP] Do not emit default thread limits of 128 (PR #87558)

2025-04-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/87558

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
@@ -97,30 +97,30 @@ static const OffloadArchToStringMap arch_names[] = { #undef GFX const char *OffloadArchToString(OffloadArch A) { - auto result = std::find_if( - std::begin(arch_names), std::end(arch_names), - [A](const OffloadArchToStringMap &map) { return A ==
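The removed lines in this hunk look up an architecture name by scanning a static table with `std::find_if`. A minimal sketch of that lookup pattern follows. It is an illustration only: the three-entry `OffloadArch` enum, the table contents, and the `"unknown"` fallback are assumptions made for the example, and the `virtual_arch_name` field of the real struct is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical, shortened enum; the real OffloadArch has many more entries.
enum class OffloadArch { Unknown, SM_80, GFX90a };

// One table row mapping an enumerator to its printable name.
struct OffloadArchToStringMap {
  OffloadArch arch;
  const char *arch_name;
};

// Illustrative table contents (assumed, not copied from the PR).
static const OffloadArchToStringMap arch_names[] = {
    {OffloadArch::Unknown, "unknown"},
    {OffloadArch::SM_80, "sm_80"},
    {OffloadArch::GFX90a, "gfx90a"},
};

// Linear scan over the table; falls back to "unknown" when nothing matches.
const char *OffloadArchToString(OffloadArch A) {
  auto result = std::find_if(
      std::begin(arch_names), std::end(arch_names),
      [A](const OffloadArchToStringMap &map) { return A == map.arch; });
  if (result == std::end(arch_names))
    return "unknown";
  return result->arch_name;
}
```

A linear scan is adequate for a table of this size; the fallback branch keeps the function total even if an enumerator is missing from the table.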

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. Thanks https://github.com/llvm/llvm-project/pull/137070

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/137070

[clang] [Clang][NFC] Move OffloadArch enum to a generic location (PR #137070)

2025-04-24 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,126 @@ +#include "clang/Basic/OffloadArch.h" + +#include "llvm/ADT/StringRef.h" + +#include + +namespace clang { + +namespace { +struct OffloadArchToStringMap { + OffloadArch arch; + const char *arch_name; + const char *virtual_arch_name; jhuber6 wr

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-24 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > A naive question from someone who is not familiar with this area: Is any of > this stuff usable with anything but a matching version of clang? If no, can > we place these things in the clang resource directory, where the other > version-bound runtimes live? It's not intended,

[clang] [Clang] Move OffloadArch enum to a generic location and add initial set of Intel OffloadArch values (PR #137070)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,143 @@ +//===--- OffloadArch.h - Definition of offloading architectures --- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-23 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/136754

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-23 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I think using the LLVM_ENABLE_RUNTIMES-mechanism is a great idea. Regarding > the move back to `openmp/device`, I don't really have an opinion. However, > there are some arguments to make: > > 1. The same arguments apply to `libomptarget` as well > > 2. Definitions su

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -122,35 +130,41 @@ else() get_clang_resource_dir(LIBOMP_HEADERS_INSTALL_PATH SUBDIR include) endif() -# Build host runtime library, after LIBOMPTARGET variables are set since they are needed -# to enable time profiling support in the OpenMP runtime. -add_subdirectory(run

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-23 Thread Joseph Huber via cfe-commits
@@ -2794,6 +2794,11 @@ void tools::addOpenMPDeviceRTL(const Driver &D, for (const auto &LibPath : HostTC.getFilePaths()) LibraryPaths.emplace_back(LibPath); + // Check the target specific library path for the triple as well. + SmallString<128> P(D.Dir); + llvm::sys::p

[clang] [llvm] [OpenMP] Update the bitcode library install and search path (PR #136754)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/136754 Summary: This was accidentally kept in the old location when we moved to the new `lib//` location for the DeviceRTL. Move this to reduce the delta with https://github.com/llvm/llvm-project/pull/136729. >From 21

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/136729 >From 748a7f76bf0188e0a1b72fcd5527a03a5ca2f054 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 22 Apr 2025 12:05:42 -0500 Subject: [PATCH] [OpenMP] Change build of OpenMP device runtime to be a separate

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/136729 >From ee6ca9501a07746c446a106619567d3faff07e98 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 22 Apr 2025 12:05:42 -0500 Subject: [PATCH] [OpenMP] Change build of OpenMP device runtime to be a separate

[clang] [llvm] [openmp] [OpenMP] Change build of OpenMP device runtime to be a separate runtime (PR #136729)

2025-04-22 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/136729 Summary: Currently we build the OpenMP device runtime as part of the `offload/` project. This is problematic because it has several restrictions when compared to the normal offloading runtime. It can only be buil

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Yeah, appending `-march=` seems to work. Is this a functional work-around for now? ```diff diff --git a/offload/DeviceRTL/CMakeLists.txt b/offload/DeviceRTL/CMakeLists.txt index cce360236960..277ad9816411 100644 --- a/offload/DeviceRTL/CMakeLists.txt +++ b/offload/DeviceRTL/CMak

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > No, I'm afraid that didn't change anything. However, it did if I added it to > `target_link_options` too. > > That said, you want to instead: > > ```diff > --- a/offload/DeviceRTL/CMakeLists.txt > +++ b/offload/DeviceRTL/CMakeLists.txt > @@ -132,7 +132,7 @@ function(compileDev

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > It should only be invoking `nvptx-arch` if the user passed `-march=native`. > > Sorry, didn't notice this sentence. Well, _I am_ building with > `-march=native` here — after all, other files are built for a CPU. If I > change it to, say, `-march=zen2`, then it indeed compile

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > How did you disable it? Perhaps it's failing because of the specific error: > > ``` > $ nvptx-arch > > Failed to 'dlopen' libcuda.so.1 > ``` > > For comparison, `amdgpu-

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: I disabled my NVIDIA GPU discovery and I could build it successfully. What's your `clang` version? I'm wondering what could be different here. https://github.com/llvm/llvm-project/pull/126143

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Nah, building standalone directly. And separately from OpenMP. I see, it previously worked because when you built with `gcc` it was still finding `clang` in your environment and using that. I'm going to move this code so that it's more explicit that we only support a just-buil

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-19 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > This change broke building with GCC set as the C++ compiler: > > ``` > FAILED: libomptarget-nvptx.bc > : && /usr/lib/ccache/bin/x86_64-pc-linux-gnu-g++ -O2 -pipe -march=native > -Wl,-O1 -Wl,--as-needed -Wl,-z,pack-relative-relocs > --target=nvptx64-nvidia-cuda -r -nostdlib

[clang] [llvm] [OpenMP] Remove 'libomptarget.devicertl.a' fatbinary and use static library (PR #126143)

2025-04-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/126143

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-17 Thread Joseph Huber via cfe-commits
@@ -792,6 +805,7 @@ bundleLinkedOutput(ArrayRef Images, const ArgList &Args, llvm::TimeTraceScope TimeScope("Bundle linked output"); switch (Kind) { case OFK_OpenMP: + case OFK_SYCL: return bundleOpenMP(Images); jhuber6 wrote: Could call it `offlo

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-16 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/135683

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-16 Thread Joseph Huber via cfe-commits
@@ -168,10 +170,10 @@ Expected> getInput(const ArgList &Args) { /// are LLVM IR bitcode files. // TODO: Support SPIR-V IR files. Expected> getBitcodeModule(StringRef File, - LLVMContext &C) { +

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-16 Thread Joseph Huber via cfe-commits
@@ -937,13 +961,47 @@ Expected> linkAndWrapDeviceFiles( InputFiles.emplace_back(*FileNameOrErr); } +if (HasSYCLOffloadKind) { + // Link the remaining device files using the device linker. + auto OutputOrErr = linkDevice(InputFiles, LinkerArgs, HasSYCLO

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-16 Thread Joseph Huber via cfe-commits
@@ -988,6 +1038,11 @@ Expected> linkAndWrapDeviceFiles( A.StringData["arch"] > B.StringData["arch"] || A.TheOffloadKind < B.TheOffloadKind; }); +if (Kind == OFK_SYCL) { + // TODO: Update once SYCL offload wrapping logic is available. -

[clang] [llvm] [Offload][SYCL] Refactor OffloadKind implementation (PR #135809)

2025-04-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/135809

[clang] [llvm] [Offload][SYCL] Refactor OffloadKind implementation (PR #135809)

2025-04-15 Thread Joseph Huber via cfe-commits
@@ -32,10 +32,12 @@ namespace object { /// The producer of the associated offloading image. enum OffloadKind : uint16_t { OFK_None = 0, - OFK_OpenMP, - OFK_Cuda, - OFK_HIP, - OFK_LAST, + OFK_OpenMP = (1 << 1), + OFK_FIRST = OFK_OpenMP, jhuber6 wrote: W

[clang] [llvm] [Offload][SYCL] Refactor OffloadKind implementation (PR #135809)

2025-04-15 Thread Joseph Huber via cfe-commits
@@ -32,10 +32,12 @@ namespace object { /// The producer of the associated offloading image. enum OffloadKind : uint16_t { OFK_None = 0, - OFK_OpenMP, - OFK_Cuda, - OFK_HIP, - OFK_LAST, + OFK_OpenMP = (1 << 1), jhuber6 wrote: This is 2, not 1. https://g

[clang] [llvm] [Offload][SYCL] Refactor OffloadKind implementation (PR #135809)

2025-04-15 Thread Joseph Huber via cfe-commits
@@ -923,10 +923,9 @@ Expected> linkAndWrapDeviceFiles( }); auto LinkerArgs = getLinkerArgs(Input, BaseArgs); -DenseSet ActiveOffloadKinds; +uint16_t ActiveOffloadKindMask = 0u; jhuber6 wrote: This code doesn't need to be modified, but I gu

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-15 Thread Joseph Huber via cfe-commits
@@ -35,6 +35,7 @@ enum OffloadKind : uint16_t { OFK_OpenMP, OFK_Cuda, OFK_HIP, + OFK_SYCL, jhuber6 wrote: I think we should assign specific values for these and make them powers of two apart so we can use them like a bitfield. Clang does that already.
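The suggestion above, giving each offload kind its own power-of-two value so that kinds can be combined like a bitfield, can be sketched as follows. This is a minimal illustration under assumptions: the concrete enumerator values and the helpers `hasKind` and `combine` are hypothetical and are not taken from the PR or from LLVM's headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical values: each kind occupies its own bit so kinds can be OR-ed
// together into a mask. The values used in LLVM may differ.
enum OffloadKind : uint16_t {
  OFK_None = 0,
  OFK_OpenMP = 1 << 0,
  OFK_Cuda = 1 << 1,
  OFK_HIP = 1 << 2,
  OFK_SYCL = 1 << 3,
};

// Returns true if Kind is set in Mask (illustrative helper).
constexpr bool hasKind(uint16_t Mask, OffloadKind Kind) {
  return (Mask & Kind) != 0;
}

// Builds a mask containing both kinds (illustrative helper).
constexpr uint16_t combine(OffloadKind A, OffloadKind B) {
  return static_cast<uint16_t>(A | B);
}

// Starting at bit 0 makes the first kind 1; writing (1 << 1) would make it 2.
static_assert(OFK_OpenMP == 1, "first kind should be bit 0");
```

With this layout a single `uint16_t` can record every active kind, which is cheaper than a set container when all that is needed is membership tests.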

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-14 Thread Joseph Huber via cfe-commits
@@ -464,7 +464,8 @@ fatbinary(ArrayRef> InputFiles, } // namespace amdgcn namespace generic { -Expected clang(ArrayRef InputFiles, const ArgList &Args) { +Expected clang(ArrayRef InputFiles, const ArgList &Args, + bool HasSYCLOffloadKind = false) { -

[clang] [Clang] Forward two linker options to `lld` when ThinLTO is enabled for AMDGPU (PR #135690)

2025-04-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/135690

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-14 Thread Joseph Huber via cfe-commits
@@ -464,7 +464,8 @@ fatbinary(ArrayRef> InputFiles, } // namespace amdgcn namespace generic { -Expected clang(ArrayRef InputFiles, const ArgList &Args) { +Expected clang(ArrayRef InputFiles, const ArgList &Args, + bool HasSYCLOffloadKind = false) { -

[clang] [llvm] [SYCL] Add clang-linker-wrapper changes to call clang-sycl-linker for SYCL offloads (PR #135683)

2025-04-14 Thread Joseph Huber via cfe-commits
@@ -937,13 +961,47 @@ Expected> linkAndWrapDeviceFiles( InputFiles.emplace_back(*FileNameOrErr); } +if (HasSYCLOffloadKind) { + // Link the remaining device files using the device linker. + auto OutputOrErr = linkDevice(InputFiles, LinkerArgs, HasSYCLO

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. LG, one nit. https://github.com/llvm/llvm-project/pull/135229

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-14 Thread Joseph Huber via cfe-commits
@@ -57,6 +57,7 @@ #include "llvm/Support/FileSystem.h" #include "llvm/Support/Path.h" #include "llvm/Support/Process.h" +#include "llvm/Support/ThreadPool.h" jhuber6 wrote: ```suggestion ``` Unused now? https://github.com/llvm/llvm-project/pull/135229

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/135229

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-14 Thread Joseph Huber via cfe-commits
@@ -1234,6 +1234,10 @@ def offload_compression_level_EQ : Joined<["--"], "offload-compression-level=">, Flags<[HelpHidden]>, HelpText<"Compression level for offload device binaries (HIP only)">; +def offload_jobs_EQ : Joined<["--"], "offload-jobs=">, + HelpText<"Specify

[clang] [Clang][AMDGPU] Enable `avail-extern-to-local` for ThinLTO in HIP (PR #134476)

2025-04-14 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/134476

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-11 Thread Joseph Huber via cfe-commits
@@ -1234,6 +1234,10 @@ def offload_compression_level_EQ : Joined<["--"], "offload-compression-level=">, Flags<[HelpHidden]>, HelpText<"Compression level for offload device binaries (HIP only)">; +def offload_jobs_EQ : Joined<["--"], "offload-jobs=">, + HelpText<"Specify

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-11 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/135229

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-11 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. LG with one nit. https://github.com/llvm/llvm-project/pull/135229

[clang] [flang] [Flang][OpenMP][ROCM] Enable rocm-device-lib-path for flang (PR #135307)

2025-04-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/135307

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-10 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/134713 Summary: These two tools do the same thing, so we should unify them into a single tool. We create symlinks for backward compatibility and provide a way to get the old vendor-specific behavior with `--amdgpu-only` and

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-10 Thread Joseph Huber via cfe-commits
@@ -1233,6 +1233,10 @@ def offload_compression_level_EQ : Joined<["--"], "offload-compression-level=">, Flags<[HelpHidden]>, HelpText<"Compression level for offload device binaries (HIP only)">; +def offload_jobs_EQ : Joined<["--"], "offload-jobs=">, + HelpText<"Set the

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-10 Thread Joseph Huber via cfe-commits
@@ -9360,6 +9362,19 @@ void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA, CmdArgs.push_back(LinkArg); addOffloadCompressArgs(Args, CmdArgs); + + // Default to half of hardware threads if users do not specify it. + if (Arg *A = Args.getLastArg(opti
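
The comment in the hunk above says the job count defaults to half of the hardware threads when the user does not specify one. The hunk itself is truncated, so the following is only a sketch of that defaulting rule under stated assumptions: `resolveOffloadJobs` and `detectHardwareThreads` are hypothetical helpers, and treating a zero request as "unspecified" is an assumption rather than documented driver behavior.

```cpp
#include <algorithm>
#include <optional>
#include <thread>

// Hypothetical helper: return the user's request when present and nonzero,
// otherwise fall back to half of the available hardware threads (at least 1).
unsigned resolveOffloadJobs(std::optional<unsigned> Requested,
                            unsigned HardwareThreads) {
  if (Requested && *Requested > 0)
    return *Requested;
  return std::max(1u, HardwareThreads / 2);
}

// std::thread::hardware_concurrency() may return 0 when the value is
// unknown, so clamp it to 1 before halving.
unsigned detectHardwareThreads() {
  unsigned N = std::thread::hardware_concurrency();
  return N == 0 ? 1u : N;
}
```

Keeping the thread count as a parameter makes the rule easy to check without depending on the machine it runs on.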

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-09 Thread Joseph Huber via cfe-commits
@@ -62,7 +62,7 @@ Value *EmitAMDGPUWorkGroupSize(CodeGenFunction &CGF, unsigned Index) { auto Cov = CGF.getTarget().getTargetOpts().CodeObjectVersion; - if (Cov == CodeObjectVersionKind::COV_None) { + if (Cov == CodeObjectVersionKind::COV_None && !CGF.getLangOpts().OpenM

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-09 Thread Joseph Huber via cfe-commits
@@ -62,7 +62,7 @@ Value *EmitAMDGPUWorkGroupSize(CodeGenFunction &CGF, unsigned Index) { auto Cov = CGF.getTarget().getTargetOpts().CodeObjectVersion; - if (Cov == CodeObjectVersionKind::COV_None) { + if (Cov == CodeObjectVersionKind::COV_None && !CGF.getLangOpts().OpenM

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/135036 From e41985970c254f3eda71cb5ef3a1dc321c8e6f56 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 9 Apr 2025 09:41:38 -0500 Subject: [PATCH 1/2] [AMDGPU] Fix code object verion not being set to 'none' Summa

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/135036 From e41985970c254f3eda71cb5ef3a1dc321c8e6f56 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 9 Apr 2025 09:41:38 -0500 Subject: [PATCH 1/2] [AMDGPU] Fix code object verion not being set to 'none' Summa

[clang] [HIP] use offload wrapper for non-device-only non-rdc (PR #132869)

2025-04-09 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/132869

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-08 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/134713 From e44db82f3abe7c1d23c2b49094c92a890127ffc7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 7 Apr 2025 14:37:22 -0500 Subject: [PATCH 1/4] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-ar

[clang] [llvm] [clang][OpenMP][SPIR-V] Fix addrspace of globals and global constants (PR #134399)

2025-04-08 Thread Joseph Huber via cfe-commits
@@ -37,8 +37,8 @@ static const unsigned SPIRDefIsPrivMap[] = { 0, // cuda_device 0, // cuda_constant 0, // cuda_shared -// SYCL address space values for this map are dummy -0, // sycl_global +// Most SYCL address space values for this map are dummy -

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-08 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/134713

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-08 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/134801

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/134713 From e44db82f3abe7c1d23c2b49094c92a890127ffc7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 7 Apr 2025 14:37:22 -0500 Subject: [PATCH 1/2] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-ar

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,78 @@ +//===- OffloadArch.cpp - list available GPUs *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/134713 From e44db82f3abe7c1d23c2b49094c92a890127ffc7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 7 Apr 2025 14:37:22 -0500 Subject: [PATCH 1/6] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-ar

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,78 @@ +//===- OffloadArch.cpp - list available GPUs *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] e288577 - [libc] Fix function that wasn't updated in wrapper headers

2025-04-07 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2025-04-07T10:30:58-05:00 New Revision: e2885772f05ddf9d81c54c5489801108838ca053 URL: https://github.com/llvm/llvm-project/commit/e2885772f05ddf9d81c54c5489801108838ca053 DIFF: https://github.com/llvm/llvm-project/commit/e2885772f05ddf9d81c54c5489801108838ca053.diff

[clang] [compiler-rt] [flang] [libc] [libcxx] [llvm] [Clang][AMDGPU] Remove special handling for COV4 libraries (PR #132870)

2025-04-05 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/132870 From 3fe8e18a4fb725b345210f5dffa13716cc7fb7f0 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 24 Mar 2025 22:36:04 -0500 Subject: [PATCH] [Clang][AMDGPU] Remove special handling for COV4 libraries Summa

[clang] [NFC][clang] Remove superfluous header files after refactor in #132252 (PR #132495)

2025-04-05 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > How'd you manage to find the right ones? IMO we should be using > include-what-you-use on these to make sure we get it right (if you have > already, disregard this). > > Also, can you share before-split/after-split/after-this build time > benchmarks? Does this get us back to

[clang] [AMDGPU][clang] provide device implementation for __builtin_logb and … (PR #129347)

2025-04-05 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,32 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang --cuda-device-only -nogpuinc -nogpulib -emit-llvm -S -o - %s | FileCheck %s jhuber6 wrote: I feel like this test doesn't need t
