from:"Joseph Huber via cfe-commits"

[clang] [AMDGPU] Add missing `__builtin_amdgcn_wavefrontsize` builtin (PR #80741)

2024-02-05 Thread Joseph Huber via cfe-commits

@@ -832,6 +832,13 @@ void test_atomic_inc_dec(local uint *lptr, global uint *gptr, uint val) { res = __builtin_amdgcn_atomic_dec32((volatile global uint*)gptr, val, __ATOMIC_SEQ_CST, ""); } +// CHECK-LABEL test_wavefrontsize( +unsigned test_wavefrontsize() { --

[clang] [AMDGPU] Add missing `__builtin_amdgcn_wavefrontsize` builtin (PR #80741)

2024-02-05 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/80741 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
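
A minimal sketch of how the builtin named in this PR might be used. Only `__builtin_amdgcn_wavefrontsize` comes from the thread; the helper names `lanes_per_wave` and `waves_needed` are hypothetical, and the host fallback value of 1 is an assumption made so the sketch compiles when not targeting AMDGPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the PR): lanes per wavefront.
static inline uint32_t lanes_per_wave() {
#if defined(__AMDGCN__)
  // Builtin named in the PR title; only available when compiling for AMDGPU.
  return __builtin_amdgcn_wavefrontsize();
#else
  // Assumption: treat the host as a single-lane "wavefront".
  return 1u;
#endif
}

// Hypothetical helper: wavefronts needed to cover `threads` work-items,
// rounding up to a whole wavefront.
static inline uint32_t waves_needed(uint32_t threads) {
  const uint32_t w = lanes_per_wave();
  assert(w != 0 && "wavefront size must be non-zero");
  return (threads + w - 1) / w;
}
```

Querying the size through a function rather than a preprocessor macro keeps the value out of host-side headers, which seems to be the concern raised later in the `__AMDGCN_WAVEFRONTSIZE` thread.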

[clang] [Clang][Driver] Fix `--save-temps` for OpenCL AoT compilation (PR #78333)

2024-01-23 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/78333

[llvm] [clang] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-23 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79231 Summary: The offloading wrapper is an object file that contains code necessary to register offloading entries for the given runtime. Currently, we expected only one of these to be present when we make the final exe

[clang] [libc] [lldb] [llvm] [mlir] [compiler-rt] [lld] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits

@@ -0,0 +1,7 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check +// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium

[llvm] [lldb] [lld] [compiler-rt] [clang] [mlir] [libc] [libcxx] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits

@@ -0,0 +1,7 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{gpu_opts} = --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA/usr/local/cuda --no-cuda-version-check +// DEFINE: %{check} = %clang -### -c %{gpu_opts} -mcmodel=medium

[libc] [clang] [openmp] [lld] [clang-tools-extra] [lldb] [libcxx] [compiler-rt] [mlir] [llvm] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-23 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/79222

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79314 Summary: A relocatable link through `clang -r` can go through the clang-linker-wrapper if offloading is enabled. This will have the effect of linking the device code and creating the wrapper module. It will then b

[lldb] [pstl] [llvm] [mlir] [libc] [compiler-rt] [libcxx] [openmp] [clang-tools-extra] [clang] [lld] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via cfe-commits

@@ -0,0 +1,5 @@ +/// Some target-specific options are ignored for GPU, so %clang exits with code 0. +// DEFINE: %{check} = %clang -### -c -mcmodel=medium jhuber6 wrote: Probably depends on the option we're testing. We could do both. https://github.com/llvm/llvm

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79314 >From 0f8d9bb329b6d50493286e117ea0fe45e0a49247 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 09:41:15 -0600 Subject: [PATCH 1/2] [LinkerWrapper] Do not link device code under a relocatable l

[clang] [llvm] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-24 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > Do we need two different linkages or could the COFF setting be used in both? > Can we have a test to show the merging works as expected? Doing a merge intentionally will be difficult until I add another flag to do this on purpose as an extra feature. This patch just changes it

[clang] [LinkerWrapper] Do not link device code under a relocatable link (PR #79314)

2024-01-24 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79314

[llvm] [clang] [Offload] Fix the offloading wrapper when merged multiple times. (PR #79231)

2024-01-24 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79231

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79373 Summary: We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX architecture from standard CPU. This patch simply uses the existing support for handling `--offload-arch=native` to also apply to the

[lldb] [clang] [openmp] [compiler-rt] [lld] [llvm] [libc] [libcxx] [clang-tools-extra] [mlir] [pstl] [Driver] Test ignored target-specific options for AMDGPU/NVPTX (PR #79222)

2024-01-24 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Maybe need to specify `--target=x86_64-unknown-linux-gnu` in the test? https://github.com/llvm/llvm-project/pull/79222

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-24 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Some interesting points, I'll try to clarify some things. > This option may not as well as one would hope. > > Problem #1 is that it will drastically slow down compilation for some users. > NVIDIA GPU drivers are loaded on demand, and the process takes a while > (O(second), dep

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > I think I'm with Art on this one. > > > > Problem #2 [...] The arch=native will create a working configuration, but > > > would build more than necessary. > > > > > > It will target the first GPU it finds. We could maybe change the behavior > > to detect the newest, but the

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 >From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/2] [NVPTX] Add support for -march=native in standalone NVPTX Sum

[clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 >From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX Sum

[compiler-rt] [flang] [libcxx] [clang] [llvm] [clang-tools-extra] [lldb] [lld] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79373 >From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Wed, 24 Jan 2024 15:34:00 -0600 Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX Sum

[lld] [lldb] [llvm] [compiler-rt] [clang-tools-extra] [libc] [clang] [flang] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > On the other hand, I'd be OK with providing --offload-arch=native translating > into "compile for all present GPU variants", with a possibility to further > adjust the selected set with the usual --no-offload-arch-foo, if the user > needs to. This will at least produce code th

[clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > User confusion is only part of the issue here. With any single GPU choice we > would still potentially produce a nonworking binary, if our GPU choice does > not match what the user wants. > > "all GPUs" has the advantage of always producing the binary that's guaranteed > to wo

[clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > I think the semantics of native on other architectures are clear enough > > here. > > I don't think we have the same idea about that. Let's spell it out, so > there's no confusion. > > [GCC > manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) > sa

[flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > This method of compilation is not like CUDA, so we can't target all the > > GPUs at the same time. > > I think this is the key fact I was missing. If the patch is only for a > standalone compilation which does not do multi-GPU compilation in principle, > then your approach

[lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > This method of compilation is not like CUDA, so we can't target all the > > GPUs at the same time. > > Can you clarify for me -- what are you compiling where it's impossible to > target multiple GPUs in the binary? I'm confused because Art is understanding > that it's not C

[lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > I...think I understand. > > Is the output of this compilation step a cubin, then? Yes, it will spit out a simple `cubin` instead of a fatbinary. The NVIDIA toolchain is much worse about this stuff than the AMD one, but in general it works. You can check with `-###` or whateve

[flang] [clang] [libc] [compiler-rt] [clang-tools-extra] [llvm] [lld] [lldb] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > Got it, okay, thanks. > > Since this change only applies to `--target=nvptx64-nvidia-cuda`, fine by me. > Thanks for putting up with our scrutiny. :) No problem, I probably should've been clearer in my commit messages. https://github.com/llvm/llvm-project/pull/79373

[clang-tools-extra] [llvm] [libc] [clang] [libcxx] [lldb] [lld] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79373

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79660 Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into ano

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79660 >From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 26 Jan 2024 16:25:30 -0600 Subject: [PATCH] [AMDGPU] Do not emit arch dependent macros with unspecified cpu

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79660 >From ba04b20709cbf76ef6f1490081aecc125bdafec7 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Fri, 26 Jan 2024 16:25:30 -0600 Subject: [PATCH 1/2] [AMDGPU] Do not emit arch dependent macros with unspecified c

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-26 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > LGTM. AFAIK only device libs compile OpenCL code without -mcpu. I don't think > it uses any of these predefined macros. That's what I figured from a cursory look at the ROCm-Device-Libs. The goal is to formalize this more to make more generic LLVM-IR. https://github.com/llvm/

[llvm] [clang] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79765 Summary: This patch adds support for `globaltimer` to match `clock` and `clock64`. See the PTX ISA reference for details. This patch does not implement the `hi` or `lo` variants for brevity as they can be obtained
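
The summary above is cut off, so the following is only an assumption about what it goes on to say: that the `hi` and `lo` halves can be recovered from the full 64-bit timer value. A minimal sketch of that split, using hypothetical helper names (`timer_lo`, `timer_hi`, `timer_join`) and taking the timer value as a plain argument rather than reading it through any builtin:

```cpp
#include <cstdint>

// Hypothetical helper: low 32 bits of a 64-bit timer reading.
static inline uint32_t timer_lo(uint64_t t) {
  return static_cast<uint32_t>(t & 0xFFFFFFFFull);
}

// Hypothetical helper: high 32 bits of a 64-bit timer reading.
static inline uint32_t timer_hi(uint64_t t) {
  return static_cast<uint32_t>(t >> 32);
}

// Hypothetical helper: rebuild the 64-bit value from its two halves.
static inline uint64_t timer_join(uint32_t hi, uint32_t lo) {
  return (static_cast<uint64_t>(hi) << 32) | lo;
}
```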

[clang] [llvm] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79765 >From 9a07e319274f4ec2f7b12a174b7664af118de4e9 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 08:12:35 -0600 Subject: [PATCH] [NVPTX} Add builtin support for 'globaltimer' Summary: This patch

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79768 Summary: This patch adds support for getting the 'activemask' instruction's value without needing to use inline assembly. See the relevant PTX reference for details. https://docs.nvidia.com/cuda/parallel-thread-e

[llvm] [clang] [NVPTX] Add builtin for 'exit' handling (PR #79777)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79777 Summary: The PTX ISA has always supported the 'exit' instruction to terminate individual threads. This patch adds a builtin to handle it. See the PTX documentation for further details. https://docs.nvidia.com/cuda

[clang] [llvm] [NVPTX} Add builtin support for 'globaltimer' (PR #79765)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79765 >From cb2503ee6c10a3d03548b6bd44d6800ed67b2753 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 08:12:35 -0600 Subject: [PATCH] [NVPTX} Add builtin support for 'globaltimer' Summary: This patch

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 >From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH] [NVPTX] Add 'activemask' builtin and intrinsic support Summary: T

[clang] [llvm] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/79765

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79660

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: This seems to have perturbed the HIP build. https://lab.llvm.org/staging/#/builders/22/builds/22 The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host compilation as well in a bunch of the wave function macros. I think that this is just poor programming, beca

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > This seems to have perturbed the HIP build. > > https://lab.llvm.org/staging/#/builders/22/builds/22 > > The problem is that we used to set `__AMDGCN_WAVEFRONTSIZE` for the host > > compilation as well in a bunch of the wave function macros. I think that > > this is just poo

[clang] 72d4fc1 - Revert "[AMDGPU] Do not emit arch dependent macros with unspecified cpu (#79660)"

2024-01-29 Thread Joseph Huber via cfe-commits

Author: Joseph Huber Date: 2024-01-29T11:11:25-06:00 New Revision: 72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d URL: https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d DIFF: https://github.com/llvm/llvm-project/commit/72d4fc1b4d5cfc4f7d50cc5cf1b315543c088f4d.diff

[clang] [AMDGPU] Do not emit arch dependent macros with unspecified cpu (PR #79660)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Reverted. I don't think there's a "proper" solution here since this seems to have leaked into the headers due to whoever set this up initially not properly setting these on the host. That seems to be endemic now, so the best we can do is just set it to some dummy values I think.

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79873 Summary: The NVPTX tools require an architecture to be used, however if we are creating generic LLVM-IR we should be able to leave it unspecified. This will result in the `target-cpu` attributes not being set on t

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > Unlike the other PRs, this one has a CUDA function, `__activemask()`. > Presumably we should make that one work by hacking our headers? That is currently defined here https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_intrinsics.h#L214. I was planni

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > I was planning on updating this to use the new intrinsic for the newer > > version. Alternatively we could make __activemask the builtin which expands > > to both versions, but I'm somewhat averse since we should target the > > instruction directly I feel. > > Yes, I agree

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > https://bugs.llvm.org/show_bug.cgi?id=35249 Yeah, there are constant issues with convergence analysis. I included one of the tests to try to show that it won't merge with the convergent attribute. Since this is a general issue for all of these things. In the past I usually add i

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 >From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH 1/2] [NVPTX] Add 'activemask' builtin and intrinsic support Summar

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Added side effects attribute, I believe this matches the current behavior of the inline asm better. https://github.com/llvm/llvm-project/pull/79768

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync : [IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback], "llvm.nvvm.vote.ballot.sync">, ClangBuiltin<"__nvvm_vote_ballot_sync">; +// +// ACTIVEMASK +// +def int_nvvm_activemask : + Intrinsic<[llvm_i32_ty]

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote: Ye

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote: Ok

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote: Ok

[llvm] [clang] [NVPTX] Add builtin support for 'nanosleep' PTX instrunction (PR #79888)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79888 Summary: This patch adds a builtin for the `nanosleep` PTX function. It takes either an immediate or a register and sleeps for [0, 2t] nanoseconds given t. More information at the documentation: https://docs.nvidi

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 >From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH 1/3] [NVPTX] Add 'activemask' builtin and intrinsic support Summar

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

@@ -65,7 +65,7 @@ def : Proc<"sm_61", [SM61, PTX50]>; def : Proc<"sm_62", [SM62, PTX50]>; def : Proc<"sm_70", [SM70, PTX60]>; def : Proc<"sm_72", [SM72, PTX61]>; -def : Proc<"sm_75", [SM75, PTX63]>; +def : Proc<"sm_75", [SM75, PTX62, PTX63]>; jhuber6 wrote: Sh

[llvm] [clang] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79768

[llvm] [clang] [NVPTX] Add builtin support for 'nanosleep' PTX instrunction (PR #79888)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79888

[llvm] [clang] [NVPTX] Add builtin for 'exit' handling (PR #79777)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79777 >From ea3b32593dd0f2035020313176c6e1a131ef8eb4 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 21:27:37 -0600 Subject: [PATCH] [NVPTX] Add builtin for 'exit' handling Summary: The PTX ISA has

[llvm] [clang] [NVPTX] Add builtin for 'exit' handling (PR #79777)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79777

[llvm] [clang] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79765 >From 5c4fc3dd207e91210f76c158e9c99e9591dccb96 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 08:12:35 -0600 Subject: [PATCH] [NVPTX} Add builtin support for 'globaltimer' Summary: This patch

[llvm] [clang] [NVPTX] Add builtin support for 'globaltimer' (PR #79765)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79765

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > Relying on something _not_ being defined is probably not the best way to > handle 'generic' target. For starters it makes it hard or impossible to > recreate the same compilation state by undoing already-specified option. It > also breaks established assumption that there _is_

[clang] [CUDA] Change 'activemask' to use 'nvvm_activemask()' (PR #79892)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79892 Summary: We recently added builtin support for this function. >From 5f316d30a179dd21cfadd50d232de622d394ccea Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 29 Jan 2024 14:28:35 -0600 Subject: [PATCH] [

[clang] [NVPTX] Allow compiling LLVM-IR without `-march` set (PR #79873)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > I think there's some precedent from both vendors to treat missing > > attributes as a more generic target. > > It sounds more like a bug than a feature to me. > > The major difference between "you get sm_xx by default" and this "you get > generic by default" is that With sp

[clang] [CUDA] Change 'activemask' to use 'nvvm_activemask()' (PR #79892)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: I've actually encountered some really strange behavior when trying to update `libc` to use the new intrinsic. The following returns a common 64-bit value to be compatible with AMDGPU's 64 lane wide mode. When I run this against the test suite, it fails on tests that specifically

[clang] [CUDA] Change 'activemask' to use 'nvvm_activemask()' (PR #79892)

2024-01-29 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Scratch that, I missed `Ui` in the builtin definition. I'll do a quick fix. https://github.com/llvm/llvm-project/pull/79892

[clang] 0a2b5b0 - [NVPTX][Fix] Ensure the return value of 'activemask' is unsigned

2024-01-29 Thread Joseph Huber via cfe-commits

Author: Joseph Huber Date: 2024-01-29T17:33:38-06:00 New Revision: 0a2b5b03c4084ac1fefd0e62db2ba49f5ac24ab9 URL: https://github.com/llvm/llvm-project/commit/0a2b5b03c4084ac1fefd0e62db2ba49f5ac24ab9 DIFF: https://github.com/llvm/llvm-project/commit/0a2b5b03c4084ac1fefd0e62db2ba49f5ac24ab9.diff
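
The thread does not spell out exactly how the signed return type caused the test failures, so the following is only one plausible illustration, not the author's diagnosis. When a 32-bit lane mask with the top lane set is widened to the common 64-bit value mentioned above, a signed source type sign-extends and sets the upper 32 bits. The helper names `widen_signed` and `widen_unsigned` are hypothetical:

```cpp
#include <cstdint>

// Hypothetical helper: widen a mask that was (incorrectly) typed as signed.
// A negative value sign-extends, so the upper 32 bits become all ones.
static inline uint64_t widen_signed(int32_t mask) {
  return static_cast<uint64_t>(mask);
}

// Hypothetical helper: widen a mask typed as unsigned.
// Zero-extension leaves the upper 32 bits clear.
static inline uint64_t widen_unsigned(uint32_t mask) {
  return static_cast<uint64_t>(mask);
}
```

With lanes 0 and 31 active (bit pattern `0x80000001`), `widen_unsigned` yields `0x0000000080000001`, while `widen_signed` on the same bits yields `0xFFFFFFFF80000001`, which would report 32 extra active lanes.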

[clang] [CUDA] Change 'activemask' to use 'nvvm_activemask()' (PR #79892)

2024-01-29 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/79892

[clang] [AMDGPU] add function attrbute amdgpu-lib-fun (PR #74737)

2024-01-12 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > > An AMDGPU library function is not internalized and can be used to > > > fullfill calls generated by LLVM passes or instruction selection. > > > > > > I am confused by the description of "internalized". Do you refer to LTO > > internalization? You can leverage `llvm.used`

[clang] [compiler-rt] [clang-tools-extra] [llvm] [AMDGPU] Avoid hitting AMDGPUAsmPrinter related asserts for local functions at O0 (PR #72129)

2024-01-12 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > As a somewhat naive question, what would it take to turn off requiring > codegen to be in SCC order? We seem to be the only target doing that. The > comments on that line say something about function calls and noinline I believe this is also the reason parallel codegen via `--

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 commented: Thanks, some comments. https://github.com/llvm/llvm-project/pull/78057

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -0,0 +1,62 @@ +//===- OffloadWrapper.h --r-*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apach

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M, GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module &M, ArrayRef<ArrayRef<char>> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module &

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/78057

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M, GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module &M, ArrayRef<ArrayRef<char>> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module &

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -0,0 +1,62 @@ +//===- OffloadWrapper.h --r-*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apach

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M, GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module &M, ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module &

[clang] [llvm] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 edited https://github.com/llvm/llvm-project/pull/78057

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-14 Thread Joseph Huber via cfe-commits

@@ -568,32 +590,45 @@ void createRegisterFatbinFunction(Module &M, GlobalVariable *FatbinDesc, } // namespace -Error wrapOpenMPBinaries(Module &M, ArrayRef> Images) { - GlobalVariable *Desc = createBinDesc(M, Images); +Error OffloadWrapper::wrapOpenMPBinaries( +Module &

[libc] [llvm] [clang] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits

jhuber6 wrote: LLVM changes look unrelated, it was originally copied from OpenBSD it seems. But it's not a major issue. https://github.com/llvm/llvm-project/pull/78061

[libc] [clang] [llvm] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits

jhuber6 wrote: > > LLVM changes look unrelated, it was originally copied from OpenBSD it > > seems. But it's not a major issue. > > FWIW I opened a few PRs in FreeBSD regarding this. Yeah, go ahead and move that portion there so the people who know more about LLVM's regex can look at it compa

[clang] [libc] [Libc] Give more functions restrict qualifiers (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 approved this pull request. Thanks. https://github.com/llvm/llvm-project/pull/78061

[llvm] [clang] [llvm][frontend][offloading] Move clang-linker-wrapper/OffloadWrapper.* to llvm/Frontend/Offloading (PR #78057)

2024-01-15 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 approved this pull request. Thanks. I'll probably make a patch after this to make the surface handling for CUDA default off because it seems to be unsupported. https://github.com/llvm/llvm-project/pull/78057

[libc] [clang] [libc] Give more functions restrict qualifiers (NFC) (PR #78061)

2024-01-15 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/78061

[clang] [Clang] Add a NULL check (PR #77131)

2024-01-16 Thread Joseph Huber via cfe-commits

@@ -21067,6 +21067,10 @@ Sema::ActOnOpenMPDependClause(const OMPDependClause::DependDataTy &Data, ExprTy = ATy->getElementType(); else ExprTy = BaseType->getPointeeType(); +// bug 69200 +if (ExprTy.isNull()) { +

[clang] [Clang] Add a NULL check (PR #77131)

2024-01-16 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Thanks for the patch, this one likely fell through the cracks because it has no assigned reviewers. We'll need a test based off of the original bug report. Put that in `clang/test/OpenMP/.c` and then look at other tests for what it should look like. LLVM uses `lit` to test, you

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-16 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/78359 Summary: The linker wrapper's job is to sort various embedded inputs into a list of files that participate in a single link job. So far, this has been completely 1-to-1, that is, each input file participates in ex
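
The 1-to-1 sorting described in this summary can be pictured with a small sketch. This is not the linker wrapper's actual code: `DeviceInput`, `LinkJobKey`, and `groupIntoLinkJobs` are hypothetical names, and keying a link job on the (triple, architecture) pair is an assumption made only for illustration. It shows the baseline behaviour in which every input lands in a single link job; it does not attempt the Target-ID handling the PR adds.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one embedded device input (not the real
// OffloadFile class from LLVM).
struct DeviceInput {
  std::string Triple;   // e.g. "amdgcn-amd-amdhsa"
  std::string Arch;     // e.g. "gfx90a"
  std::string Filename; // name of the extracted device object
};

// Assumed key for a link job: (triple, architecture).
using LinkJobKey = std::pair<std::string, std::string>;

// Sort the inputs into link jobs. Each input is appended to the file list of
// the one job whose key matches it, so the mapping is strictly 1-to-1.
std::map<LinkJobKey, std::vector<std::string>>
groupIntoLinkJobs(const std::vector<DeviceInput> &Inputs) {
  std::map<LinkJobKey, std::vector<std::string>> Jobs;
  for (const DeviceInput &In : Inputs)
    Jobs[LinkJobKey(In.Triple, In.Arch)].push_back(In.Filename);
  return Jobs;
}
```

With this scheme, two inputs built for `gfx90a` and one built for `gfx908` would yield two link jobs, and no file would appear in more than one of them.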

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-16 Thread Joseph Huber via cfe-commits

jhuber6 wrote: This is a redo of what was originally in https://github.com/llvm/llvm-project/pull/72442 https://github.com/llvm/llvm-project/pull/78359

[llvm] [clang] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits

@@ -162,6 +162,19 @@ class OffloadFile : public OwningBinary { std::unique_ptr Buffer) : OwningBinary(std::move(Binary), std::move(Buffer)) {} + /// Make a deep copy of this offloading file. + OffloadFile copy() const { +std::unique_ptr Buffer = Memor

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Looks like it still has that Windows failure. That's going to be impossible to debug on account of the fact that I have no clue how to run this thing on Windows. The precommit checking takes a whole day to run as well. The only error message is "invalid argument", so I really ha

[llvm] [clang] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH 1/2] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linki

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From d7c8a6e0cb2289af939a90e82afbc6e35b08010c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH 1/3] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linki

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-17 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/78359 From 2a460f6ff9e7bca938adca5487609df41616e8c1 Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Mon, 15 Jan 2024 15:42:06 -0600 Subject: [PATCH] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking

[clang] [openmp] [OpenMP][USM] Introduces -fopenmp-force-usm flag (PR #76571)

2024-01-18 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 approved this pull request. https://github.com/llvm/llvm-project/pull/76571

[clang] [llvm] [LinkerWrapper] Support device binaries in multiple link jobs (PR #72442)

2024-01-18 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/72442

[llvm] [clang] [LinkerWrapper] Support device binaries in multiple link jobs (PR #72442)

2024-01-18 Thread Joseph Huber via cfe-commits

jhuber6 wrote: Replaced by https://github.com/llvm/llvm-project/pull/78359 https://github.com/llvm/llvm-project/pull/72442

[clang] [llvm] [LinkerWrapper] Handle AMDGPU Target-IDs correctly when linking (PR #78359)

2024-01-18 Thread Joseph Huber via cfe-commits

https://github.com/jhuber6 closed https://github.com/llvm/llvm-project/pull/78359

[clang] 2b804f8 - [LinkerWrapper][Obvious] Fix move on temporary object

2024-01-18 Thread Joseph Huber via cfe-commits

Author: Joseph Huber Date: 2024-01-18T10:42:13-06:00 New Revision: 2b804f875579995b1588f1a079e265929163d0e4 URL: https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4 DIFF: https://github.com/llvm/llvm-project/commit/2b804f875579995b1588f1a079e265929163d0e4.diff
