[clang] [nvlink-wrapper] Use a symbolic link instead of copying the file (PR #110139)

2024-09-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/110139 >From 393e05145d0c31a3b1b254f97a357c776617898c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 26 Sep 2024 11:04:46 -0500 Subject: [PATCH] [nvlink-wrapper] Use a symbolic link instead of copying the file

[clang] [nvlink-wrapper] Use a symbolic link instead of copying the file (PR #110139)

2024-09-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/110139 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/110179 Summary: All GPU based languages provide some way to access things like the thread ID or other resources. However, this is spread between many different languages and it varies between targets. The goal here is t

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-26 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/110179 >From f5a8afe139a25f13989556d40e29b98788934dd9 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 26 Sep 2024 16:47:14 -0500 Subject: [PATCH] [Clang] Implement resource directory headers for common GPU intr

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Probably want a longer prefix. _gpu or_llvm or similar. Yeah, just wasn't sure. Also, do resource headers need to be in a reserved namespace? Probably nothing wrong with `gpu_get_thread_id` vs `_gpu_get_thread_id`. > If the shared header gets the declarations then people ca

[clang] [NVPTX] Add a clang builtin for the `warpsize` intrinsic (PR #110316)

2024-09-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/110316 Summary: There's an intrinsic for the warp size, we want to expose this to make the interface proposed in https://github.com/llvm/llvm-project/pull/110179 more generic. >From 63d45843ee15c940680e4d6a3ea87138ebf

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,184 @@ +//===-- nvptxintrin.h - NVPTX intrinsic functions -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,184 @@ +//===-- nvptxintrin.h - NVPTX intrinsic functions -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
jhuber6 wrote: I am wondering if it would be easier to provide generic builtins in clang and just codegen them. I guess in that case we'd just upscale everything to 64-bit and say "If you need the other one use the target specific version". https://github.com/llvm/llvm-project/pull/110179

[clang] [NVPTX] Add a clang builtin for the `warpsize` intrinsic (PR #110316)

2024-09-27 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/110316

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-09-27 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,184 @@ +//===-- nvptxintrin.h - NVPTX intrinsic functions -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] 787a6d5 - [nvlink-wrapper] Remove use of symlinks

2024-09-27 Thread Joseph Huber via cfe-commits
Author: Joseph Huber Date: 2024-09-27T12:05:56-05:00 New Revision: 787a6d57f95ff6eaee8df01392900a6eea512930 URL: https://github.com/llvm/llvm-project/commit/787a6d57f95ff6eaee8df01392900a6eea512930 DIFF: https://github.com/llvm/llvm-project/commit/787a6d57f95ff6eaee8df01392900a6eea512930.diff

[clang] [AMDGPU] Correctly use the auxiliary toolchain to include libc++ (PR #109366)

2024-09-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/109366 Summary: Now that we have a functional build for `libc++` on the GPU, it will now find the target specific headers in `include/amdgcn-amd-amdhsa`. This is a problem for offloading via OpenMP because we need the C

[clang] [Clang] Automatically link the `compiler-rt` for GPUs if present (PR #109152)

2024-09-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/109152 Summary: This automatically links `compiler-rt` for offloading languages if it exists in the resource directory. >From b6f6cbf7e1819779eeece437daef5bfb9b2a8cd0 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: W

[clang] [AMDGPU] Correctly use the auxiliary toolchain to include libc++ (PR #109366)

2024-09-20 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > The fix looks good. A test would be preferred. Done https://github.com/llvm/llvm-project/pull/109366

[clang] [AMDGPU] Correctly use the auxiliary toolchain to include libc++ (PR #109366)

2024-09-20 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/109366 >From f47b67c20014fbedc5ce9764be2e2687258a474e Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 19 Sep 2024 22:03:42 -0500 Subject: [PATCH] [AMDGPU] Correctly use the auxiliary toolchain to include libc++

[clang] [AMDGPU] Correctly use the auxiliary toolchain to include libc++ (PR #109366)

2024-09-20 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/109366

[clang] [flang] [Flang][Driver][Offload] Support -Xoffload-linker argument in Flang (PR #109907)

2024-09-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Maybe it's the double dashes after the check? I guess while we're at it might as well check the `-Xoffload-linker-amdgcn-amd-amdhsa` format as well. https://github.com/llvm/llvm-project/pull/109907

[clang] [flang] [Flang][Driver][Offload] Support -Xoffload-linker argument in Flang (PR #109907)

2024-09-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/109907

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-02 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-02 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-02 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [compiler-rt] [llvm] [openmp] [PGO][Offload] Add GPU profiling flags to driver (PR #94268)

2024-10-02 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,82 @@ +// RUN: %libomptarget-compile-generic -fprofile-generate-gpu jhuber6 wrote: This is a limitation of the PTX target, globals cannot reference themselves. Most likely whatever NVIDIA engineer wrote the PTX parser found it annoying to reference s

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/115241 Summary: The cache line size on AMDGPU varies between 64 and 128 (The lowest L2 cache also goes to 256 on some architectures.) This macro is intended to present a size that will not cause destructive interference
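For readers unfamiliar with the macro discussed in this thread, here is a minimal C++ sketch of how `__GCC_DESTRUCTIVE_SIZE` is typically consumed: as an alignment that keeps independently written objects from sharing a cache line. The names `kDestructiveSize`, `PaddedCounter`, and `counterStride` are hypothetical, and the fallback value of 64 is an assumption of this sketch rather than anything taken from the PR.

```cpp
#include <cstddef>

// __GCC_DESTRUCTIVE_SIZE is predefined by recent GCC and Clang releases.
// The fallback of 64 is an assumption of this sketch, not a value from the PR.
#ifdef __GCC_DESTRUCTIVE_SIZE
constexpr std::size_t kDestructiveSize = __GCC_DESTRUCTIVE_SIZE;
#else
constexpr std::size_t kDestructiveSize = 64;
#endif

// Hypothetical per-thread counter, aligned so that two adjacent counters
// never fall inside the same destructive-interference span.
struct alignas(kDestructiveSize) PaddedCounter {
  unsigned long value = 0;
};

// Byte distance between neighbouring elements in an array of counters.
constexpr std::size_t counterStride() { return sizeof(PaddedCounter); }

static_assert(alignof(PaddedCounter) == kDestructiveSize,
              "counter must be aligned to the destructive size");
static_assert(counterStride() % kDestructiveSize == 0,
              "array elements must not share a destructive-size span");
```

With a value of 128, as the PR title proposes for AMDGPU, each element of a `PaddedCounter` array would occupy its own 128-byte span.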

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115241 >From fcb8bcfba329b6ad9f33ace70c22ca4b542d2117 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 6 Nov 2024 18:27:07 -0600 Subject: [PATCH] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU Summary: The

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115241 >From 451f37016c5bd4cbd0bb08cc172995e8af4e7482 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 6 Nov 2024 18:27:07 -0600 Subject: [PATCH] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU Summary: The

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-07 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/115241

[clang] [LinkerWrapper] Remove in-house handling of LTO (PR #113715)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 ready_for_review https://github.com/llvm/llvm-project/pull/113715

[clang] [clang-linker-wrapper] Add error handling for missing linker path (PR #113613)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/113613

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/110179 >From 014742418463fffa0b2d097fe668f02558addcc9 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 26 Sep 2024 16:47:14 -0500 Subject: [PATCH 1/5] [Clang] Implement resource directory headers for common GPU

[clang] [LinkerWrapper] Remove in-house handling of LTO (PR #113715)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/113715 Summary: This should be the linker's job if the user creates any bitcode files, then passing `-flto` to the linker for the toolchain should be able to handle it. Right now this path is only used in the case where

[clang] [LinkerWrapper] Remove in-house handling of LTO (PR #113715)

2024-10-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: @asudarsa https://github.com/llvm/llvm-project/pull/113715

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [clang][Driver][HIP] Add support for mixing AMDGCNSPIRV & concrete `offload-arch`s. (PR #113509)

2024-10-25 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > It's how it works today, I believe: `--offload-arch` unambiguously > establishes the toolchain It's pretty ambiguous, right now it mostly works through a combination of the file type and good guessing because the targets people care about now have distinct names. For OpenMP,

[clang] [Clang] Add a flag to include GPU startup files (PR #112025)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/112025 >From a02a27171801ad3f5618099b5035ef8185c2f835 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 11 Oct 2024 12:21:49 -0500 Subject: [PATCH 1/4] [Clang] Add a flag to include GPU startup files Summary: The

[clang] [Clang] Add a flag to include GPU startup files (PR #112025)

2024-10-25 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/112025 >From a02a27171801ad3f5618099b5035ef8185c2f835 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 11 Oct 2024 12:21:49 -0500 Subject: [PATCH 1/5] [Clang] Add a flag to include GPU startup files Summary: The

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,76 @@ +//===-- gpuintrin.h - Generic GPU intrinsic functions -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Why is this a clang-based tool? Other than using clang/Basic to get the clang > version (which could also be retrieved from LLVM), this doesn't seem to have > any dependencies on Clang. > > As I'm seeing the SYCL patches coming in I'm getting concerned that the > architecture

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > From the discourse post and everything I've found reading about the SYCL > tooling, it seems to me like this should really just all be integrated into > LLD and performed with the linking phase. It seems like a huge waste of IO to > read objects, rip out device-specific bits,

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > LLD has a wasm target. Exactly what I meant. > I think LLD also needs to eventually gain a SPIR-V target to support linking > SPIR-V binaries, because SPIR-V does support linkage at the SPIR-V level not > just the LTO IR level (see: > https://registry.khronos.org/SPIR-V/spec

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > I don't think `lld` makes sense, but you can definitely use the LTO > > interface inside of this tool to create a similar effect. > > Why not? `lld` is a toolchain linker (which you said this is supposed to > emulate), and it is an interface to the LTO interface... so it see

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I think we could all benefit from some documentation describing how the SYCL > compiler flow is intended to work, what tools are added/modified, and what > the expected outputs are at each compiler phase. Without some idea of the > architecture of what is being built changes l

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Clang can use `lld`, `llvm-ar`, and `dsymutil` (among others), so being part > of Clang certainly isn't necessary for the clang driver to invoke it. Yes, but generally we've just stashed stuff in `clang` if we didn't plan to expose them as general utilities, since it's not lik

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Why is this a clang-based tool? Other than using clang/Basic to get the > > clang version (which could also be retrieved from LLVM), this doesn't seem > > to have any dependencies on Clang. > > As I'm seeing the SYCL patches coming in I'm getting concerned that the > > archi

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Generally clang-based tools have dependencies on Clang. This tool does not > (other than pulling the version number which could come from LLVM). We can move this to `llvm` if it's a blocking issue, doesn't really make much of a difference (I forget if this patch adds it to the

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I'm also concerned that there are no tests for the new tool. There are tests > for the clang driver changes, but none for the new tool. The new tool seems > to be just a wrapper around llvm-link, which does make me think this should > probably just be folded into llvm-link. Un

[clang] [Clang][SYCL] Introduce clang-sycl-link-wrapper to link SYCL offloading device code (Part 1 of many) (PR #112245)

2024-10-15 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > It seems to me like this should be done through the LTO interface and driven > through `lld`. I can understand if an intermediate step is required while the > SPIR-V backend is under development, but clang shouldn't be in the business > of linking, and generally neither should

[clang] [Clang] Add a flag to include GPU startup files (PR #112025)

2024-10-17 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/112025 >From a02a27171801ad3f5618099b5035ef8185c2f835 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 11 Oct 2024 12:21:49 -0500 Subject: [PATCH 1/3] [Clang] Add a flag to include GPU startup files Summary: The

[clang] [Clang][HIP] Deprecate the AMDGCN_WAVEFRONT_SIZE macros (PR #112849)

2024-11-06 Thread Joseph Huber via cfe-commits
jhuber6 wrote: This just emits a warning so it doesn't actually change anything yet. I'm waiting on the wavesize folding before I land the changes in `libomptarget` because it'd have performance considerations. https://github.com/llvm/llvm-project/pull/112849

[clang] [NVPTX] Set cache line size to 128 for __GCC_DESTRUCTIVE_SIZE (PR #115248)

2024-11-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/115248 Summary: According to the white papers, the cache line size on NVPTX architectures is 128 bytes. This should be what's returned by these preprocessor macros. >From c18a58229b6921b613045a0d380f09bade14c2fb Mon S

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-06 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115241 >From fcb8bcfba329b6ad9f33ace70c22ca4b542d2117 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 6 Nov 2024 18:27:07 -0600 Subject: [PATCH 1/2] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU Summary:

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-24 Thread Joseph Huber via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/115545

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/116651 >From c95e80939c8189def053556a232ba611d6dc02cc Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 18 Nov 2024 10:34:23 -0600 Subject: [PATCH 1/5] [amdgpu-arch] Replace use of HSA with reading sysfs directly

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > One clumsy outstanding thing is that this should now be some code in a header > that clang includes so that instead of the subprocess shell to handle > arch=native clang just looks up the information directly. Would only work for Linux unfortunately, unless some Windows driver

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,74 @@ +//===- AMDGPUArchByKFD.cpp - list AMDGPU installed --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,74 @@ +//===- AMDGPUArchByKFD.cpp - list AMDGPU installed --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/116651 >From c95e80939c8189def053556a232ba611d6dc02cc Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 18 Nov 2024 10:34:23 -0600 Subject: [PATCH 1/2] [amdgpu-arch] Replace use of HSA with reading sysfs directly

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > @jhuber6 can you comment on "lot of overhead" and if that matters? Also, not > sure why the HSA library dependence is a problem. This seems to be exposing > amdgpu-arch to more maintenance overhead. Sometimes the driver will hang and since this is used inside of `clang` to su

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > @jhuber6 can you comment on "lot of overhead" and if that matters? Also, > > not sure why the HSA library dependence is a problem. This seems to be > > exposing amdgpu-arch to more maintenance overhead. > Sometimes the driver will hang and since this is used inside of `clan

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/116651 >From c95e80939c8189def053556a232ba611d6dc02cc Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 18 Nov 2024 10:34:23 -0600 Subject: [PATCH 1/4] [amdgpu-arch] Replace use of HSA with reading sysfs directly

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I feel like this is a workaround. Can we not fix the "limitation of the > driver" and if there is an issue with HSA overhead, shouldn't we file a > ticket? I don't know for sure, but I'd guess that the limitations seen may be related to the limit of 128 or so doorbells I thin

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Ping https://github.com/llvm/llvm-project/pull/115545

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -0,0 +1,77 @@ +//===- AMDGPUArchByKFD.cpp - list AMDGPU installed --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Here's a question, should it respect `ROCR_VISIBLE_DEVICES`? https://github.com/llvm/llvm-project/pull/116651

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Here's a question, should it respect `ROCR_VISIBLE_DEVICES`? > > I think it depends on whether you consider this as a ROCm toolchain. If not, > I'd prefer not to be bound to any ROCm related stuff. Agreed, maybe in the fork but makes sense to leave community LLVM alone. htt

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 commented: Is there an issue with simply using the `HostTC` for everything? I feel like that's the solution to this mess, since the `HostTC` would always know whether or not the target is Windows without us needing to forward a bunch of stuff. https://github.com/llv

[clang] [llvm] [Offload] Move HIP and CUDA to new driver by default (PR #84420)

2024-11-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/84420 >From 90a6145e2bc22ed511a11306307488a525f3738f Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 7 Mar 2024 15:48:00 -0600 Subject: [PATCH] [Offload] Move HIP and CUDA to new driver by default Summary: This

[clang] [llvm] [Offload] Move HIP and CUDA to new driver by default (PR #84420)

2024-11-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/84420 >From 2384f1b93238103d2a5b8944d8fa79a32a9c994e Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Thu, 7 Mar 2024 15:48:00 -0600 Subject: [PATCH] [Offload] Move HIP and CUDA to new driver by default Summary: This

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-11-20 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Ping https://github.com/llvm/llvm-project/pull/99687

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-19 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/116651

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/116651 Summary: For Linux systems, we currently use the HSA library to determine the installed GPUs. However, this isn't really necessary and adds a dependency on the HSA runtime as well as a lot of overhead. Instead, t
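To illustrate the general idea of identifying a GPU from sysfs text rather than through a runtime library, here is a rough C++17 sketch. It is not the code from the PR. It assumes that the kernel driver exposes a per-node text file of `key value` lines containing a `gfx_target_version` entry, and that the value packs the version as major * 10000 + minor * 100 + stepping. Both the file layout and the encoding are assumptions of this sketch, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <sstream>
#include <string>

// Decode a packed version such as 90010 into a name such as "gfx90a".
// Assumed encoding: major * 10000 + minor * 100 + stepping, with the
// stepping printed in hexadecimal.
std::string gfxNameFromVersion(std::uint32_t version) {
  unsigned major = version / 10000;
  unsigned minor = (version / 100) % 100;
  unsigned step = version % 100;
  char buf[32];
  std::snprintf(buf, sizeof(buf), "gfx%u%u%x", major, minor, step);
  return buf;
}

// Scan the contents of one (assumed) properties file for the line
// "gfx_target_version <n>". Returns nullopt when the key is missing or
// the value is 0, which this sketch treats as "not a GPU node".
std::optional<std::string> archFromProperties(const std::string &text) {
  std::istringstream in(text);
  std::string key;
  std::uint64_t value = 0;
  while (in >> key >> value) {
    if (key != "gfx_target_version")
      continue;
    if (value == 0)
      return std::nullopt;
    return gfxNameFromVersion(static_cast<std::uint32_t>(value));
  }
  return std::nullopt;
}
```

A caller would read each node's file into a string and pass it to `archFromProperties`. Keeping the parser separate from file I/O means it can be exercised without any GPU present.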

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-18 Thread Joseph Huber via cfe-commits
@@ -6440,7 +6440,8 @@ const ToolChain &Driver::getToolChain(const ArgList &Args, TC = std::make_unique(*this, Target, Args); break; case llvm::Triple::AMDHSA: - TC = std::make_unique(*this, Target, Args); + TC = std::make_unique(*this, Target, Args, +

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/116651 >From c95e80939c8189def053556a232ba611d6dc02cc Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 18 Nov 2024 10:34:23 -0600 Subject: [PATCH 1/3] [amdgpu-arch] Replcae use of HSA with reading sysfs directly

[clang] [amdgpu-arch] Replace use of HSA with reading sysfs directly (PR #116651)

2024-11-18 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I don't really understand why cluster users are compiling on a system where > the GPUs are being stressed, and I still don't see why it's a good idea to > break layering for this case. I'd like to eliminate a class of failures I've seen with `--offload-arch=native` either cau

[clang] [Clang] Add 'gpuintrin.h' to the release notes (PR #116410)

2024-11-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/116410 None >From 872cc825e86ec1ad52a95ed5a9e532c34b27f4cb Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 15 Nov 2024 11:03:21 -0600 Subject: [PATCH] [Clang] Add 'gpuintrin.h' to the release notes --- clang/

[clang] [Clang] Add 'gpuintrin.h' to the release notes (PR #116410)

2024-11-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/116410 >From 81f3f902a7ee16251ebf1d7b0b6aa86e11e4dc89 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 15 Nov 2024 11:03:21 -0600 Subject: [PATCH] [Clang] Add 'gpuintrin.h' to the release notes --- clang/docs/R

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/115777

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-15 Thread Joseph Huber via cfe-commits
@@ -111,6 +111,18 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo { return getPointerWidthV(AddrSpace); } + virtual bool isAddressSpaceSupersetOf(LangAS A, LangAS B) const override { +// The flat address space AS(0) is a superset of all

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-15 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/115777

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Joseph Huber via cfe-commits
@@ -6440,7 +6440,8 @@ const ToolChain &Driver::getToolChain(const ArgList &Args, TC = std::make_unique(*this, Target, Args); break; case llvm::Triple::AMDHSA: - TC = std::make_unique(*this, Target, Args); + TC = std::make_unique(*this, Target, Args, +

[clang] [NVPTX] Set cache line size to 128 for __GCC_DESTRUCTIVE_SIZE (PR #115248)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: ping https://github.com/llvm/llvm-project/pull/115248

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115777 >From 1e400acbd574703adcebd704c53991427815b090 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Tue, 12 Nov 2024 11:20:19 -0600 Subject: [PATCH 1/2] [Clang] Use TargetInfo when deciding is an address space is

[clang] [AMDGPU] Use the AMDGPUToolChain when targeting C/C++ directly (PR #99687)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: So the main reason I want this is so that I don't need to keep passing `-nogpulib` because targeting plain C/C++ should not require the ROCm device libs. https://github.com/llvm/llvm-project/pull/99687

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > > Okay the problem with using `ASTContext` here is that it creates some > > recursive includes. ~I can do this by moving the check into `Type.cpp` > > instead, so this will be function call instead of being inlined.~ This > > would require a lot of extra stuff so I'm going to

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > Address spaces from language dialects generally have required relationships > and behaviors in the language, and that really shouldn't be overridden by > targets. However, targets do need to be able to decide how target-specific > address spaces work, including how they intera

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: Okay the problem with using `ASTContext` here is that it creates some recursive includes. I can do this by moving the check into `Type.cpp` instead, so this will be function call instead of being inlined. https://github.com/llvm/llvm-project/pull/115777

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
jhuber6 wrote: > I'm not sure what unrelated code you're saying would need to pulled into a > .cpp file. It looks like there's only one actual call to > `TI.isAddressSpaceSupersetOf`, so if you just pass around an `ASTContext &` > to that point, nothing else will need to drill into it. And fra

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
@@ -697,45 +699,21 @@ class Qualifiers { /// every address space is a superset of itself. /// CL2.0 adds: /// __generic is a superset of any address space except for __constant. - static bool isAddressSpaceSupersetOf(LangAS A, LangAS B) { -// Address spaces must

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115777 >From 23a8d5af0ab181814885bca6ab6494be9d71f59b Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 13 Nov 2024 18:14:05 -0600 Subject: [PATCH 1/4] use ASTContext --- .../bugprone/VirtualNearMissCheck.cpp

[clang] [clang-tools-extra] [Clang] Use TargetInfo when deciding if an address space is compatible (PR #115777)

2024-11-13 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/115777 >From 23a8d5af0ab181814885bca6ab6494be9d71f59b Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 13 Nov 2024 18:14:05 -0600 Subject: [PATCH 1/2] use ASTContext --- .../bugprone/VirtualNearMissCheck.cpp
