[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134753 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] ccfb97b - Revert "clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (#134805)"

2025-04-13 Thread Matt Arsenault via cfe-commits
Author: Matt Arsenault Date: 2025-04-13T14:47:39+02:00 New Revision: ccfb97b42174eab118a4e4222c25e986db876563 URL: https://github.com/llvm/llvm-project/commit/ccfb97b42174eab118a4e4222c25e986db876563 DIFF: https://github.com/llvm/llvm-project/commit/ccfb97b42174eab118a4e4222c25e986db876563.diff

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > This should be device libs from ROCm 6.3.3. > We really need these to be part of the compiler distribution. It doesn't really work to have this as an imported 3rd party package that's a year old https://github.com/llvm/llvm-project/pull/134805

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Hi, in this bot it is not the flang test that is failing but some HIP blender > test. Also in https://lab.llvm.org/staging/#/builders/207/builds/1994 this > produces undefined_reference errors. Is this bot using an antique device libs? https://github.com/llvm/llvm-project/pu

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +// Test that the accelerator code selection pass only gets invoked after linking + +// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink. +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134753

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for OpenCL kernel arguments (PR #134892)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134892

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for aggregate OpenCL kernel arguments (PR #134892)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/134892

[clang] [Clang][OpenCL][AMDGPU] Use `byref` for OpenCL kernel arguments (PR #134892)

2025-04-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I might misunderstand it but based on your comment, it doesn't sound like the > issue is with using `byref` for aggregate arguments in OpenCL (which is what > this PR is trying to do), especially since OpenCL is currently the only > language not using it. We already use it for

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/134805

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Apr 13, 3:46 AM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/134805). https://github.com/llvm/llvm-project/pull/134805

[clang] Clang: Add elementwise minnum/maxnum builtin functions (PR #129207)

2025-04-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/129207

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @lalaniket8 @arsenm I don't have a strong opinion, but shouldn't this > transformation be done during lowering to the target? This is the lowering to the target. > Current version of the patch brings odd behavior for LLVM IR to SPIR-V > lowering for OpenCL kernels. SPIR-V d

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +// Test that the accelerator code selection pass only gets invoked after linking + +// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink. +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +// Test that the accelerator code selection pass only gets invoked after linking + +// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink. +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +// Test that the accelerator code selection pass only gets invoked after linking + +// Ensure Pass HipStdParAcceleratorCodeSelectionPass is not invoked in PreLink. +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -mllvm -amdgpu-enable-hipstdpar -flto -emit-llvm-bc

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +// Check that if we are compiling with fgpu-rdc amdgpu-enable-hipstdpar is not +// passed to CC1, to avoid eager, per TU, removal of potentially accessible +// functions. + +// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc \ +// RUN:

[clang] [llvm] [mlir] [AMDGPU] Generalize global.load.lds to buffer fat pointers (PR #134911)

2025-04-11 Thread Matt Arsenault via cfe-commits
arsenm wrote: > * Per your comments on the previous PR, a new intrinsic for p7 is also out - > or were you just objecting to the naming? Mostly this, but I also don't really understand why this doesn't fit into the existing raw_buffer_load_lds https://github.com/llvm/llvm-project/pull/134911

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-11 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/135027

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-10 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/134805

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-10 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/135027

[clang] [Clang] add option --offload-jobs=N (PR #135229)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -1233,6 +1233,10 @@ def offload_compression_level_EQ : Joined<["--"], "offload-compression-level=">, Flags<[HelpHidden]>, HelpText<"Compression level for offload device binaries (HIP only)">; +def offload_jobs_EQ : Joined<["--"], "offload-jobs=">, + HelpText<"Set the

[clang] [flang] [llvm] [clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU function to Triple (PR #126956)

2025-04-10 Thread Matt Arsenault via cfe-commits
arsenm wrote: > It's not a bikeshedding problem that it doesn't handle other SYCL targets > consistently though. In some places there are checks for SYCL, in other > places there are checks for SPIR-V. Usage context wrong is a different problem than whether SPIRV should count as "isGPU" http

[clang] [flang] [llvm] [clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU function to Triple (PR #126956)

2025-04-10 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > Seems reasonable so long as we know that sound reasonable to you? No. This is a bikeshedding problem on the isGPU name. SPIRV is functionally a "GPU" target. The abstract future physical device it may run on is unimportant https://github.com/llvm/llvm-project/pull/126956

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -62,7 +62,7 @@ Value *EmitAMDGPUWorkGroupSize(CodeGenFunction &CGF, unsigned Index) { auto Cov = CGF.getTarget().getTargetOpts().CodeObjectVersion; - if (Cov == CodeObjectVersionKind::COV_None) { + if (Cov == CodeObjectVersionKind::COV_None && !CGF.getLangOpts().OpenM

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-10 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: IIRC there were bugs when you try to use lambda in conjunction with the target attribute. Can you add a

[clang] [clang] Introduce elementwise clz/ctz builtins (PR #131995)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -831,6 +832,14 @@ of different sizes and signs is forbidden in binary and ternary builtins. semantics, see `LangRef

[clang] [clang] Introduce elementwise clz/ctz builtins (PR #131995)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -831,6 +832,14 @@ of different sizes and signs is forbidden in binary and ternary builtins. semantics, see `LangRef

[clang] [clang] Introduce elementwise clz/ctz builtins (PR #131995)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -831,6 +832,14 @@ of different sizes and signs is forbidden in binary and ternary builtins. semantics, see `LangRef

[clang] [clang] Introduce elementwise clz/ctz builtins (PR #131995)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -831,6 +832,14 @@ of different sizes and signs is forbidden in binary and ternary builtins. semantics, see `LangRef

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-10 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +// Check that if we are compiling with fgpu-rdc amdgpu-enable-hipstdpar is not +// passed to CC1, to avoid eager, per TU, removal of potentially accessible +// functions. + +// RUN: %clang -### --hipstdpar --offload-arch=gfx906 %s -nogpulib -nogpuinc \ +// RUN:

[clang] [llvm] [IR] Mark convergence intrins as has-side-effect (PR #134844)

2025-04-09 Thread Matt Arsenault via cfe-commits
arsenm wrote: Thinking we should remove the IR verification requirement. It's more of a lint type check and violations would be UB https://github.com/llvm/llvm-project/pull/134844

[clang] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries (PR #134805)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/134805 From 239a55f25f7afecef3513788e8c2428fb2a06c22 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 8 Apr 2025 15:03:22 +0700 Subject: [PATCH] clang/AMDGPU: Stop looking for oclc_daz_opt_* control libraries

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-09 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,24 @@ +// REQUIRES: amdgpu-registered-target arsenm wrote: -emit-llvm should work fine, real codegen will obviously fail https://github.com/llvm/llvm-project/pull/135027

[clang] [compiler-rt] [libc] [libcxx] [llvm] [AMDGPU] Fix code object version not being set to 'none' (PR #135036)

2025-04-09 Thread Matt Arsenault via cfe-commits
@@ -62,7 +62,7 @@ Value *EmitAMDGPUWorkGroupSize(CodeGenFunction &CGF, unsigned Index) { auto Cov = CGF.getTarget().getTargetOpts().CodeObjectVersion; - if (Cov == CodeObjectVersionKind::COV_None) { + if (Cov == CodeObjectVersionKind::COV_None && !CGF.getLangOpts().OpenM

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/134801

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-09 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Apr 9, 10:38 AM EDT**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/134801). https://github.com/llvm/llvm-project/pull/134801

[clang] [Clang][AMDGPU] Accept builtins in lambda declarations (PR #135027)

2025-04-09 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,24 @@ +// REQUIRES: amdgpu-registered-target arsenm wrote: Don't think it actually does https://github.com/llvm/llvm-project/pull/135027

[libclc] [libclc] Move shuffle/shuffle2 to the CLC library (PR #135000)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/135000

[clang] [llvm] Reapply "Inline: Propagate callsite nofpclass attribute" (PR #135018)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/135018 This reverts commit 3f38cd07d820248fd2043efb1341fabaac2d84a6. Fix case where inner callsite has nofpclass but callsite does not. From 14c2ecf4714f5e0c4e6928565678b8e98288fd89 Mon Sep 17 00:00:00 2001 From: Matt

[clang] [llvm] Reapply "Inline: Propagate callsite nofpclass attribute" (PR #135018)

2025-04-09 Thread Matt Arsenault via cfe-commits
arsenm wrote: * **#135018** https://app.graphite.dev/github/pr/llvm/llvm-project/135018?utm_source=stack-comment-icon 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/13501

[clang] [llvm] Reapply "Inline: Propagate callsite nofpclass attribute" (PR #135018)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/135018

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Needs test https://github.com/llvm/llvm-project/pull/134753

[clang] 3f38cd0 - Revert "Inline: Propagate callsite nofpclass attribute"

2025-04-08 Thread Matt Arsenault via cfe-commits
Author: Matt Arsenault Date: 2025-04-08T23:15:00+07:00 New Revision: 3f38cd07d820248fd2043efb1341fabaac2d84a6 URL: https://github.com/llvm/llvm-project/commit/3f38cd07d820248fd2043efb1341fabaac2d84a6 DIFF: https://github.com/llvm/llvm-project/commit/3f38cd07d820248fd2043efb1341fabaac2d84a6.diff

[clang] [llvm] [IR] Mark convergence intrins as has-side-effect (PR #134844)

2025-04-08 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > Turns out not really, I ran spec with this about 2 years ago and the only > > non-noise change was a mild improvement > > Looking at the PR you linked, seems like there was still not a clear > consensus on the default change no? Yes, there were people who are wrong and need

[clang] [llvm] [IR] Mark convergence intrins as has-side-effect (PR #134844)

2025-04-08 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I suspect the long-term change to change the default IR to assume convergent > will take some time as it will impact many subprojects. Turns out not really, I ran spec with this about 2 years ago and the only non-noise change was a mild improvement > Would you be OK with me

[clang] [llvm] [IR] Mark convergence intrins as has-side-effect (PR #134844)

2025-04-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. These should not have side effects. The problem you are experiencing is because the design of the convergent attribute is broken. It imposes a structural requirement on the IR, rather than relaxing an assumed restriction. Whateve

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-08 Thread Matt Arsenault via cfe-commits
arsenm wrote: * **#134801** https://app.graphite.dev/github/pr/llvm/llvm-project/134801?utm_source=stack-comment-icon 👈 https://app.graphite.dev/github/pr/llvm/llvm-project/13480

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/134801

[clang] clang/AMDGPU: Stop looking for hip.bc in device libs (PR #134801)

2025-04-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/134801 This has been an empty library since January 2023 From b46b307e034fed518437f8e28ce05704d1c20560 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 8 Apr 2025 14:00:34 +0700 Subject: [PATCH] clang/AMDGPU:

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134713

[clang] [llvm] [HIP][HIPSTDPAR][NFC] Re-order & adapt `hipstdpar` specific passes (PR #134753)

2025-04-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Needs test https://github.com/llvm/llvm-project/pull/134753

[clang] [Clang] Unify 'nvptx-arch' and 'amdgpu-arch' into 'offload-arch' (PR #134713)

2025-04-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,78 @@ +//===- OffloadArch.cpp - list available GPUs *- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/115821

[clang] [flang] [llvm] [AMDGPU] Use a target feature to enable __builtin_amdgcn_global_load_lds on gfx9/10 (PR #133055)

2025-04-05 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,45 @@ +; RUN: split-file %s %t arsenm wrote: It's super annoying when these tests break. We shoul

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-04-05 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1059 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt --passes=loop-vectorize --mtriple=riscv64 -mattr="+zvfh,+v" -S < %s | FileCheck %s --check-prefix=RV64 +; RUN: opt --passes=loop-vectorize --mtriple=aa

[clang] [llvm] [Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (PR #133741)

2025-04-05 Thread Matt Arsenault via cfe-commits
@@ -140,6 +140,7 @@ BUILTIN(__builtin_amdgcn_cvt_pknorm_u16, "E2Usff", "nc") BUILTIN(__builtin_amdgcn_cvt_pk_i16, "E2sii", "nc") BUILTIN(__builtin_amdgcn_cvt_pk_u16, "E2UsUiUi", "nc") BUILTIN(__builtin_amdgcn_cvt_pk_u8_f32, "UifUiUi", "nc") +BUILTIN(__builtin_amdgcn_cvt_off_f32

[libclc] [libclc] Move sinh, cosh & tanh to the CLC library (PR #134063)

2025-04-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134063

[clang] [flang] [llvm] [AMDGPU] Use a target feature to enable __builtin_amdgcn_global_load_lds on gfx9/10 (PR #133055)

2025-04-05 Thread Matt Arsenault via cfe-commits
@@ -260,7 +260,7 @@ AMDGPUTargetInfo::AMDGPUTargetInfo(const llvm::Triple &Triple, MaxAtomicPromoteWidth = MaxAtomicInli

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. lgtm with nits https://github.com/llvm/llvm-project/pull/115821

[libclc] [libclc] Fix unresolved reference to missing table (PR #133691)

2025-04-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/133691

[clang] [llvm] [Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (PR #133741)

2025-04-04 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,15 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 \ +// RUN: -emit-llvm -o - | FileCheck %s + +// CHECK-LABEL: @test_builtin_amdgcn_cvt_off_f32_i4( +// CHECK-NEXT:

[clang] [AArch64] Remove strict checks from init-aarch64.c (PR #134338)

2025-04-03 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. The other tests should be strengthened to always use -NEXT. The exact set of macros should be tested, don't want others hiding in the gaps https://github.com/llvm/llvm-project/pull/134338
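The difference between loose and strict checks in the review comment above can be sketched with a toy model. This is only an illustration under stated assumptions, not FileCheck itself: the `Directive` struct and `checksPass` function are hypothetical names, patterns are plain substrings, and a plain CHECK may skip lines while a CHECK-NEXT must match the line directly after the previous match.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of a check directive (not FileCheck's API).
struct Directive {
  std::string Pattern; // substring that must appear in the matched line
  bool Next;           // true models CHECK-NEXT, false models plain CHECK
};

// Returns true if every directive matches the output in order.
bool checksPass(const std::vector<std::string> &Output,
                const std::vector<Directive> &Checks) {
  std::size_t Line = 0; // first output line not yet consumed
  for (const Directive &D : Checks) {
    if (D.Next) {
      // Strict form: the very next line must match, so nothing can hide
      // between two consecutive matches.
      if (Line >= Output.size() ||
          Output[Line].find(D.Pattern) == std::string::npos)
        return false;
      ++Line;
      continue;
    }
    // Loose form: skip forward over any number of unmatched lines.
    while (Line < Output.size() &&
           Output[Line].find(D.Pattern) == std::string::npos)
      ++Line;
    if (Line == Output.size())
      return false;
    ++Line;
  }
  return true;
}
```

With an output that contains an unexpected macro between two expected ones, the loose sequence passes and the strict sequence fails, which is the "hiding in the gaps" concern raised in the review.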

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
@@ -6138,6 +6150,7 @@ void CodeGenModule::EmitGlobalFunctionDefinition(GlobalDecl GD, CodeGenFunction(*this).GenerateCode(GD, Fn, FI); setNonAliasAttributes(GD, Fn); + arsenm wrote: ```suggestion ``` https://github.com/llvm/llvm-project/pull/115821

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
@@ -1583,6 +1583,26 @@ void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn, // Implicit copy-assignment gets the same special treatment as implicit // copy-constructors. emitImplicitAssignmentOperatorBody(Args); + } else if (FD->hasAttr() && +

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/115821

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
@@ -2497,7 +2502,12 @@ void CodeGenModule::ConstructAttributeList(StringRef Name, NumElemsParam); } -if (TargetDecl->hasAttr()) { +if (TargetDecl->hasAttr() && +CallingConv != CallingConv::CC_C && +CallingConv != +

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
@@ -299,11 +196,243 @@ Mat64X64 __attribute__((noinline)) foo_large(Mat32X32 in) { // X86-NEXT:call void @llvm.memcpy.p1.p0.i32(ptr addrspace(1) align 4 [[ARRAYIDX]], ptr align 4 [[TMP]], i32 16384, i1 false) // X86-NEXT:ret void // +// +// X86-LABEL: define void @Fun

[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

2025-04-03 Thread Matt Arsenault via cfe-commits
@@ -3892,6 +3895,10 @@ void CodeGenModule::EmitGlobal(GlobalDecl GD) { // Ignore declarations, they will be emitted on their first use. if (const auto *FD = dyn_cast(Global)) { + arsenm wrote: ```suggestion ``` https://github.com/llvm/llvm-project/pull/1

[clang] [AMDGPU] Remove detection of hip runtime for Spack (PR #133263)

2025-04-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/133263

[libclc] [libclc] Move native_(exp10|powr|tan) to CLC library (PR #134080)

2025-04-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134080

[clang] [llvm] [EquivalenceClasses] Use SmallVector for deterministic iteration order. (PR #134075)

2025-04-02 Thread Matt Arsenault via cfe-commits
@@ -138,6 +139,9 @@ class EquivalenceClasses { /// ECValues, it just keeps the key as part of the value. std::set TheMapping; + /// List of all members, used to provide a determinstic iteration order. + SmallVector Members; arsenm wrote: Can you combine
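The pattern under review here, a set for lookup plus a side list for iteration, can be sketched generically. This is a minimal sketch and not the `EquivalenceClasses` implementation: `InsertionOrderedSet` is a hypothetical name, and `std::set` and `std::vector` stand in for the LLVM containers in the quoted diff.

```cpp
#include <set>
#include <vector>

// Hypothetical container (not LLVM's EquivalenceClasses): the set answers
// membership queries, the vector records insertion order.
template <typename T> class InsertionOrderedSet {
  std::set<T> Lookup;     // ordered by key; for pointers, by address
  std::vector<T> Members; // insertion order, stable across runs

public:
  // Inserts Value if it is new; returns true when it was added.
  bool insert(const T &Value) {
    if (!Lookup.insert(Value).second)
      return false;
    Members.push_back(Value);
    return true;
  }

  bool contains(const T &Value) const { return Lookup.count(Value) != 0; }

  // Iterating this list does not depend on pointer values.
  const std::vector<T> &members() const { return Members; }
};
```

When the keys are pointers, iterating the set directly visits elements in address order, which can change from run to run; iterating `members()` always follows insertion order. The cost is storing each element twice, which is presumably what the request to combine the two structures is about.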

[libclc] [libclc] Move lgamma, lgamma_r & tgamma to CLC library (PR #134053)

2025-04-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: I'd leave any content changes for a separate pr https://github.com/llvm/llvm-project/pull/134053

[libclc] [libclc] Move sinh, cosh & tanh to the CLC library (PR #134063)

2025-04-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,201 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[libclc] [libclc] Move lgamma, lgamma_r & tgamma to CLC library (PR #134053)

2025-04-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/134053

[libclc] [libclc] Move lgamma, lgamma_r & tgamma to CLC library (PR #134053)

2025-04-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,73 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apac

[clang] [llvm] [Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (PR #133741)

2025-04-02 Thread Matt Arsenault via cfe-commits

[clang] [llvm] [Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (PR #133741)

2025-04-02 Thread Matt Arsenault via cfe-commits

[clang] [llvm] [Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (PR #133741)

2025-04-02 Thread Matt Arsenault via cfe-commits

[libclc] [libclc] Move cbrt to the CLC library; vectorize (PR #133940)

2025-04-01 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,135 @@ +//===--===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[libclc] [libclc] Move cbrt to the CLC library; vectorize (PR #133940)

2025-04-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/133940

[libclc] [libclc] Move rootn to the CLC library; optimize (PR #133735)

2025-03-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/133735

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm Any recommendations for appeasing the [undef > deprecator](https://github.com/llvm/llvm-project/pull/133301#issuecomment-2759145421)? I don't think you did anything other than update existing tests, I would ignore it for the purposes of this change https://github.com/

[libclc] [libclc][amdgpu] Implement native_exp2 via AMD builtin (PR #133696)

2025-03-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/133696

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-03-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1059 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt --passes=loop-vectorize --mtriple=riscv64 -mattr="+zvfh,+v" -S < %s | FileCheck %s --check-prefix=RV64 +; RUN: opt --passes=loop-vectorize --mtriple=aa

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-03-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: You can probably write a smaller test using SLPVectorizer, e.g. /llvm/test/Transforms/SLPVectorizer/AMDGPU/round.ll https://github.com/llvm/llvm-project/pull/131781

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-03-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1059 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 +; RUN: opt --passes=loop-vectorize --mtriple=riscv64 -mattr="+zvfh,+v" -S < %s | FileCheck %s --check-prefix=RV64 +; RUN: opt --passes=loop-vectorize --mtriple=aa

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-03-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/131781

[clang] [llvm] Vectorize: Support fminimumnum and fmaximumnum (PR #131781)

2025-03-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,407 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 arsenm wrote: You don't have a backend vectorization test. This PR contains no clang changes. You should delete the clang test and replace it

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast<SelectInst>(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast<SelectInst>(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast<SelectInst>(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,124 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt < %s -passes='sroa' -S | FileCheck %s +target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-n8:16:3

[clang] [llvm] [SROA] Vector promote some memsets (PR #133301)

2025-03-30 Thread Matt Arsenault via cfe-commits
@@ -1011,6 +1011,31 @@ static Value *foldPHINodeOrSelectInst(Instruction &I) { return foldSelectInst(cast<SelectInst>(I)); } +/// Returns a fixed vector type equivalent to the memory set by II or nullptr if +/// unable to do so. +static FixedVectorType *getVectorTypeFor(const MemSetIns

[clang] [flang] [libcxx] [llvm] [llvm-reduce]: print short form, actionable names in the log (PR #132813)

2025-03-29 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/132813

[clang] [flang] [libcxx] [llvm] [llvm-reduce]: print short form, actionable names in the log (PR #132813)

2025-03-29 Thread Matt Arsenault via cfe-commits
arsenm wrote: Replaced with new PR https://github.com/llvm/llvm-project/pull/132813

[clang] [flang] [libcxx] [llvm] [llvm-reduce]: print short form, actionable names in the log (PR #132813)

2025-03-28 Thread Matt Arsenault via cfe-commits
arsenm wrote: Yes, that's about right. I think the entry points should be renamed for consistency though. So keep the original reduceIFuncsDeltaPass instead of taking the old implementation detail name (like extractIFuncsFromModule) https://github.com/llvm/llvm-project/pull/132813

[libclc] [libclc] Pass -fapprox-func when compiling 'native' builtins (PR #133119)

2025-03-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/133119 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
