[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
@@ -4599,6 +4599,14 @@ def int_nvvm_vote_ballot_sync :
   [IntrInaccessibleMemOnly, IntrConvergent, IntrNoCallback], "llvm.nvvm.vote.ballot.sync">,
   ClangBuiltin<"__nvvm_vote_ballot_sync">;
+//
+// ACTIVEMASK
+//
+def int_nvvm_activemask :
+  Intrinsic<[llvm_i32_ty]

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Artem Belevich via cfe-commits
Artem-B wrote: https://bugs.llvm.org/show_bug.cgi?id=35249 https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote:
> > I was planning on updating this to use the new intrinsic for the newer version. Alternatively we could make __activemask the builtin which expands to both versions, but I'm somewhat averse since we should target the instruction directly I feel.
> Yes, I agree

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
https://github.com/jlebar approved this pull request. https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
jlebar wrote:
> I was planning on updating this to use the new intrinsic for the newer version. Alternatively we could make __activemask the builtin which expands to both versions, but I'm somewhat averse since we should target the instruction directly I feel.
Yes, I agree that the bui

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Joseph Huber via cfe-commits
jhuber6 wrote:
> Unlike the other PRs, this one has a CUDA function, `__activemask()`. Presumably we should make that one work by hacking our headers?
That is currently defined here https://github.com/llvm/llvm-project/blob/main/clang/lib/Headers/__clang_cuda_intrinsics.h#L214. I was planni

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-29 Thread Justin Lebar via cfe-commits
jlebar wrote: Unlike the other PRs, this one has a CUDA function, `__activemask()`. Presumably we should make that one work by hacking our headers? https://github.com/llvm/llvm-project/pull/79768

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 updated https://github.com/llvm/llvm-project/pull/79768 From 2c7049defef3b62de7017640948cccfb07ff756c Mon Sep 17 00:00:00 2001 From: Joseph Huber Date: Sun, 28 Jan 2024 14:57:05 -0600 Subject: [PATCH] [NVPTX] Add 'activemask' builtin and intrinsic support Summary: T

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread via cfe-commits
llvmbot wrote: @llvm/pr-subscribers-llvm-ir Author: Joseph Huber (jhuber6) Changes Summary: This patch adds support for getting the 'activemask' instruction's value without needing to use inline assembly. See the relevant PTX reference for details. https://docs.nvidia.com/cuda/parallel-th

[clang] [llvm] [NVPTX] Add 'activemask' builtin and intrinsic support (PR #79768)

2024-01-28 Thread Joseph Huber via cfe-commits
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/79768 Summary: This patch adds support for getting the 'activemask' instruction's value without needing to use inline assembly. See the relevant PTX reference for details. https://docs.nvidia.com/cuda/parallel-thread-e
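
For readers following the thread, here is a rough sketch of what "without needing to use inline assembly" could look like in practice. It is not taken from the patch. The function names `model_activemask`, `activemask_via_asm`, and `activemask_via_builtin` are hypothetical. The builtin spelling `__nvvm_activemask` is an assumption inferred from the `int_nvvm_activemask` intrinsic name in the diff and the `__nvvm_vote_ballot_sync` naming next to it. The device-side part is guarded so it is only seen when compiling as CUDA with clang; the host-side function merely models the bit layout of the result (bit i set when lane i of the warp is active) and does not query any hardware.

```cpp
#include <cstdint>
#include <initializer_list>

// Host-side model of the result layout only: bit i of the mask is set when
// lane i (0..31) is listed as active. Lanes outside 0..31 are ignored.
// Hypothetical helper for illustration; real masks come from the GPU.
inline std::uint32_t model_activemask(std::initializer_list<unsigned> active_lanes) {
  std::uint32_t mask = 0;
  for (unsigned lane : active_lanes) {
    if (lane < 32) {
      mask |= std::uint32_t{1} << lane;
    }
  }
  return mask;
}

#if defined(__clang__) && defined(__CUDA__)
// Inline-assembly form: emits the PTX 'activemask.b32' instruction directly.
__device__ inline unsigned activemask_via_asm() {
  unsigned mask;
  asm volatile("activemask.b32 %0;" : "=r"(mask));
  return mask;
}

// ASSUMPTION: the builtin added by this PR is spelled '__nvvm_activemask'
// and returns a 32-bit unsigned mask. Check the merged patch before use.
__device__ inline unsigned activemask_via_builtin() {
  return __nvvm_activemask();
}
#endif
```

Under these assumptions, `model_activemask({0, 1, 31})` yields `0x80000003`, the value a warp would report if only lanes 0, 1, and 31 were active.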