[clang] [clang] Introduce target-specific `Sema` components (PR #93179)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should update the GitHub autolabeler paths for the targets https://github.com/llvm/llvm-project/pull/93179 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe

[clang] [clang] Introduce target-specific `Sema` components (PR #93179)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/93179 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,13 @@ +// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple amdgcn-unknown-unknown -target-cpu gfx940 -S -verify -o - %s +// REQUIRES: amdgpu-registered-target + +typedef unsigned int u32; + +void test_global_load_lds_unsupported_size(global u32* src, local u32 *dst, u32 size

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -12385,4 +12385,8 @@ def err_acc_reduction_composite_type def err_acc_reduction_composite_member_type :Error< "OpenACC 'reduction' composite variable must not have non-scalar field">; def note_acc_reduction_composite_member_loc : Note<"invalid field is here">; + +// AMDG

[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should lose the [WIP] in the title https://github.com/llvm/llvm-project/pull/89217

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/93064

[clang] [llvm] AMDGPU: Handle remote/fine-grained memory in atomicrmw fmin/fmax lowering (PR #96759)

2024-08-01 Thread Matt Arsenault via cfe-commits
arsenm wrote: ping https://github.com/llvm/llvm-project/pull/96759

[clang] [llvm] AMDGPU: Handle remote/fine-grained memory in atomicrmw fmin/fmax lowering (PR #96759)

2024-08-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96759

[clang] [llvm] AMDGPU: Handle new atomicrmw metadata for fadd case (PR #96760)

2024-08-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96760

[clang] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_atomic_fadd_{f32|f64} (PR #96872)

2024-08-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96872 >From f4d7712ae6b81657572b7963b42d9909a4143452 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:58:44 +0200 Subject: [PATCH 1/2] clang/AMDGPU: Emit atomicrmw for __builtin_amdgcn_global_ato

[clang] [clang] Construct SmallVector with ArrayRef (NFC) (PR #101898)

2024-08-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/101898

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-08-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Interesting, the important bit is that this is done before the LTO link so we > know which functions to pull in. Unfortunately I don't think there's a good > location that actually states which math calls are supported, but I could > probably make a script that goes through eac

[clang] 33315ef - clang/AMDGPU: Don't set implicit arg attribute to default size

2022-01-14 Thread Matt Arsenault via cfe-commits
Author: Matt Arsenault Date: 2022-01-14T18:43:30-05:00 New Revision: 33315ef3216be6edcfb4a6577150682b80a18766 URL: https://github.com/llvm/llvm-project/commit/33315ef3216be6edcfb4a6577150682b80a18766 DIFF: https://github.com/llvm/llvm-project/commit/33315ef3216be6edcfb4a6577150682b80a18766.diff

[clang] [Clang] Remove 3-element vector load and store special handling (PR #104661)

2024-08-16 Thread Matt Arsenault via cfe-commits
@@ -45,7 +45,7 @@ void test3(packedfloat3 *p) { *p = (packedfloat3) { 3.2f, 2.3f, 0.1f }; } // CHECK: @test3( -// CHECK: store <4 x float> {{.*}}, align 4 +// CHECK: store <3 x float> {{.*}}, align 4 arsenm wrote: According to the langref the backend can wid

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -41,20 +44,19 @@ class MCResourceInfo { int32_t MaxAGPR = 0; int32_t MaxSGPR = 0; - MCContext &OutContext; - bool finalized; + bool Finalized = false; arsenm wrote: Document Finalized https://github.com/llvm/llvm-project/pull/102913

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -68,82 +71,84 @@ void MCResourceInfo::assignMaxRegs() { assignMaxRegSym(MaxSGPRSym, MaxSGPR); } -void MCResourceInfo::finalize() { - assert(!finalized && "Cannot finalize ResourceInfo again."); - finalized = true; - assignMaxRegs(); +void MCResourceInfo::finalize(MCCon

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -84,10 +87,13 @@ class MCResourceInfo { /// functions with indirect calls should be assigned the module level maximum. void gatherResourceInfo( const MachineFunction &MF, - const AMDGPUResourceUsageAnalysis::SIFunctionResourceInfo &FRI); + const AMDGPURe

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -75,10 +75,10 @@ bb.2: store volatile i32 0, ptr addrspace(1) undef ret void } -; DEFAULTSIZE: .amdhsa_private_segment_fixed_size 16 +; DEFAULTSIZE: .amdhsa_private_segment_fixed_size kernel_non_entry_block_static_alloca_uniformly_reached_align4.private_seg_size ; DEFA

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -78,18 +101,52 @@ ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const { return ABIArgInfo::getDirect(LTy, 0, nullptr, false); } -// Force copying aggregate type in kernel arguments by value when -// compiling CUDA targeting SPIR-V. This

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -64,6 +66,27 @@ void CommonSPIRABIInfo::setCCs() { RuntimeCC = llvm::CallingConv::SPIR_FUNC; } +ABIArgInfo SPIRVABIInfo::classifyReturnType(QualType RetTy) const { + if (getTarget().getTriple().getVendor() != llvm::Triple::AMD) +return DefaultABIInfo::classifyReturnT

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -78,18 +101,52 @@ ABIArgInfo SPIRVABIInfo::classifyKernelArgumentType(QualType Ty) const { return ABIArgInfo::getDirect(LTy, 0, nullptr, false); } -// Force copying aggregate type in kernel arguments by value when -// compiling CUDA targeting SPIR-V. This

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-19 Thread Matt Arsenault via cfe-commits
@@ -64,6 +66,27 @@ void CommonSPIRABIInfo::setCCs() { RuntimeCC = llvm::CallingConv::SPIR_FUNC; } +ABIArgInfo SPIRVABIInfo::classifyReturnType(QualType RetTy) const { + if (getTarget().getTriple().getVendor() != llvm::Triple::AMD) +return DefaultABIInfo::classifyReturnT

[clang] [clang][CodeGen][SPIR-V][AMDGPU] Tweak AMDGCNSPIRV ABI to allow for the correct handling of aggregates passed to kernels / functions. (PR #102776)

2024-08-19 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/102776

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-20 Thread Matt Arsenault via cfe-commits
@@ -68,82 +71,84 @@ void MCResourceInfo::assignMaxRegs() { assignMaxRegSym(MaxSGPRSym, MaxSGPR); } -void MCResourceInfo::finalize() { - assert(!finalized && "Cannot finalize ResourceInfo again."); - finalized = true; - assignMaxRegs(); +void MCResourceInfo::finalize(MCCon

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-20 Thread Matt Arsenault via cfe-commits
@@ -2,12 +2,12 @@ // RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx908 -Rpass-analysis=kernel-resource-usage -S -O0 -verify %s -o /dev/null // expected-remark@+10 {{Function Name: foo}} -// expected-remark@+9 {{SGPRs: 13}} -// expected-remark@+8 {{VGPRs: 10

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: ping https://github.com/llvm/llvm-project/pull/96873

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96873

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-20 Thread Matt Arsenault via cfe-commits
@@ -65,8 +65,8 @@ define amdgpu_kernel void @minimal_kernel_inputs() #0 { ; GCN-NEXT: .amdhsa_user_sgpr_dispatch_id 0 ; GCN-NEXT: .amdhsa_user_sgpr_private_segment_size 0 ; GCN-NEXT: .amdhsa_wavefront_size32 -; GCN-NEXT: .amdhsa_uses_dynamic_stack 0 -; GCN-NEXT: .amdhsa_enable_

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-20 Thread Matt Arsenault via cfe-commits
@@ -36,8 +36,8 @@ ; GCN-NEXT: .amdhsa_user_sgpr_dispatch_id 0 ; GCN-NEXT: .amdhsa_user_sgpr_private_segment_size 0 ; GCN-NEXT: .amdhsa_wavefront_size32 -; GCN-NEXT: .amdhsa_uses_dynamic_stack 0 -; GCN-NEXT: .amdhsa_enable_private_segment 0 +; GCN-NEXT: .amdhsa_uses_dynamic_stac

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-20 Thread Matt Arsenault via cfe-commits
@@ -3,6 +3,18 @@ declare i32 @llvm.amdgcn.workitem.id.x() +define <2 x i64> @f1() #0 { arsenm wrote: Unrelated function appeared? https://github.com/llvm/llvm-project/pull/102913

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Aug 20, 2:53 PM EDT**: @arsenm started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96873). https://github.com/llvm/llvm-project/pull/96873

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96873 >From 3bada576176af63ac7960380511b80a0c541c437 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:12:59 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f1

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96873 >From 484ad51d86ddb426eab70505953a06fe43782fc1 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:12:59 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f1

[clang] clang/AMDGPU: Emit atomicrmw from {global|flat}_atomic_fadd_v2f16 builtins (PR #96873)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96873

[clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96874

[clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96874 >From 98c982762710ff6da91b1c8acac34ed1665b5284 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:15:26 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins

[clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96874 >From 6bd03d98751b64b7c294cd90e66b5f1c49631623 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:15:26 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins

[clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96874 >From e03a9b6112507637bdc50e04586bdedd3a6769ec Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:15:26 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins

[clang] clang/AMDGPU: Emit atomicrmw from flat_atomic_{f32|f64} builtins (PR #96874)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96874

[clang] clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtins (PR #96875)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96875

[clang] clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtins (PR #96875)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96875 >From dd352ab3bf0428a9ffaae0383291ebac9ca03f59 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 19:34:43 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtin

[clang] clang/AMDGPU: Emit atomicrmw for global/flat fadd v2bf16 builtins (PR #96875)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96875

[clang] clang/AMDGPU: Emit atomicrmw for flat/global atomic min/max f64 builtins (PR #96876)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96876

[clang] clang/AMDGPU: Emit atomicrmw for flat/global atomic min/max f64 builtins (PR #96876)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96876 >From 87ce332ec79ca7ad66405bc8ba608967e1b1d05a Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Wed, 26 Jun 2024 23:18:32 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw for flat/global atomic min/max f64

[clang] clang/AMDGPU: Emit atomicrmw for flat/global atomic min/max f64 builtins (PR #96876)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96876

[clang] [llvm] AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (PR #97050)

2024-08-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/97050

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,533 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -enable-ipra=0 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s arsenm wrote: Can drop -verify-machineinstrs https://github.com/llvm/llvm-project/pull/102913

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,533 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -enable-ipra=0 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s + +; SGPR use may not seem equal to the sgpr use provided in comments as the latter includes extra sgprs (e.g., for vcc use). + +; Fun

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -1,8 +1,8 @@ ; REQUIRES: asserts -; RUN: not --crash llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -filetype=null %s 2>&1 | FileCheck %s -; RUN: not --crash llc -O0 -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -filetype=null %s 2>&1 | FileCheck %s +; RUN: llc -mtriple=amdgcn-amd-am

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,533 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -enable-ipra=0 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s + +; SGPR use may not seem equal to the sgpr use provided in comments as the latter includes extra sgprs (e.g., for vcc use).

[clang] [llvm] [AsmWriter] Print `nan`, `pinf`, and `ninf` when applicable (PR #105618)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -4426,6 +4425,32 @@ represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit format is represented by ``0xR`` followed by 4 hexadecimal digits. All hexadecimal formats are big-endian (sign bit at the left). +Some of the special floating point values can b

[clang] [llvm] [AsmWriter] Print `nan`, `pinf`, and `ninf` when applicable (PR #105618)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -4426,6 +4425,32 @@ represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit format is represented by ``0xR`` followed by 4 hexadecimal digits. All hexadecimal formats are big-endian (sign bit at the left). +Some of the special floating point values can b

[clang] [llvm] [AsmWriter] Print `nan`, `pinf`, and `ninf` when applicable (PR #105618)

2024-08-21 Thread Matt Arsenault via cfe-commits
@@ -4387,12 +4387,12 @@ Simple Constants zeros. So '``s0x0001``' of type '``i16``' will be -1, not 1. **Floating-point constants** Floating-point constants use standard decimal notation (e.g. -123.421), exponential notation (e.g. 1.23421e+2), or a more precise -

[clang] [flang] [llvm] [mlir] Make MMIWP not have ownership over MMI + Remove Move Constructor of MMI + Make MMI Only Use and Externally-Created MCContext (PR #105541)

2024-08-22 Thread Matt Arsenault via cfe-commits
arsenm wrote: > The TargetMachine interface functions addPassesToEmitFile and > addPassesToEmitMC now require a reference to an MMI; This IMO breaks the > abstraction of the TargetMachine, since an MMI requires a LLVMTargetMachine, > and if you have a TargetMachine you should do the dreaded ca

[clang] [Clang] Replace `emitXXXBuiltin` with a unified interface (PR #96313)

2024-06-21 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/96313

[clang] [libc] [llvm] [libc] Implement (v|f)printf on the GPU (PR #96369)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -1671,6 +1671,7 @@ int main(int Argc, char **Argv) { NewArgv.push_back(Arg->getValue()); for (const opt::Arg *Arg : Args.filtered(OPT_offload_opt_eq_minus)) NewArgv.push_back(Args.MakeArgString(StringRef("-") + Arg->getValue())); + llvm::errs() << "asdfasdf\n"; --

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-06-22 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Here, because the minimum alignment is 4, we will only increment the buffer by 4, It should be incrementing by the size? 4 byte aligned access of 8 byte type should work fine https://github.com/llvm/llvm-project/pull/96370

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-06-22 Thread Matt Arsenault via cfe-commits
arsenm wrote: Incrementing by align is just a bug, of course the size is the real value. Whether we want to continue wasting space is another not-correctness discussion https://github.com/llvm/llvm-project/pull/96370
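
The point made in the two comments above (advance the variadic argument buffer by the argument's size, not by its minimum alignment) can be illustrated with a small self-contained sketch. This is not the code from PR #96370; the helper names `alignTo`, `nextArgOffset`, and `nextArgOffsetBuggy` are hypothetical and exist only for this illustration, and the 4-byte minimum alignment is taken from the quoted comment.

```cpp
#include <cassert>
#include <cstddef>

// Round Offset up to the next multiple of Align.
// Assumption for this sketch: Align is a power of two.
constexpr std::size_t alignTo(std::size_t Offset, std::size_t Align) {
  return (Offset + Align - 1) & ~(Align - 1);
}

// Hypothetical helper: offset of the slot that follows an argument of
// Size bytes stored at minimum alignment Align. The buffer advances by
// the argument's size.
constexpr std::size_t nextArgOffset(std::size_t Offset, std::size_t Size,
                                    std::size_t Align) {
  return alignTo(Offset, Align) + Size;
}

// Hypothetical buggy variant: advances by the alignment instead of the
// size, so an 8-byte argument at 4-byte alignment only moves the
// buffer by 4 and the next argument overlaps its upper half.
constexpr std::size_t nextArgOffsetBuggy(std::size_t Offset,
                                         std::size_t /*Size*/,
                                         std::size_t Align) {
  return alignTo(Offset, Align) + Align;
}

// An 8-byte value at offset 0 with 4-byte minimum alignment:
static_assert(nextArgOffset(0, 8, 4) == 8, "next slot starts after the value");
static_assert(nextArgOffsetBuggy(0, 8, 4) == 4, "next slot overlaps the value");
```

With size-based increments, a 4-byte-aligned access to an 8-byte value still works and no padding is needed; whether to pad to natural alignment is the separate, non-correctness question raised in the thread.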

[clang] [llvm] AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (PR #95592)

2024-06-23 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Jun 23, 4:06 AM EDT**: @arsenm started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/95592). https://github.com/llvm/llvm-project/pull/95592

[clang] [llvm] AMDGPU: Start selecting flat/global atomicrmw fmin/fmax. (PR #95592)

2024-06-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95592

[clang] [llvm] AMDGPU: Start selecting buffer fat pointer atomicrmw fmin/fmax (PR #95593)

2024-06-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95593

[clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)

2024-06-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95396

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-24 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Kindly review only the top commit here If you're going to repost with a pre-commit, it would be better to have all the pieces squashed into one. Also you could look into using graphite or SPR for managing dependent pull requests https://github.com/llvm/llvm-project/pull/96473

[clang] [clang] Improve diagnostics for constraints of inline asm (NFC) (PR #96363)

2024-06-24 Thread Matt Arsenault via cfe-commits
@@ -2626,14 +2629,20 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) { SmallVector OutputConstraintInfos; SmallVector InputConstraintInfos; + const FunctionDecl *FD = dyn_cast_or_null(CurCodeDecl); arsenm wrote: Where do you get dyn_cast_or_null i

[clang] [clang] Improve diagnostics for constraints of inline asm (NFC) (PR #96363)

2024-06-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: It's really unfortunate to have to add all this asm handling to clang. Can't it rely on backend diagnostic remarks for this? https://github.com/llvm/llvm-project/pull/96363

[clang] [clang] Improve diagnostics for constraints of inline asm (NFC) (PR #96363)

2024-06-24 Thread Matt Arsenault via cfe-commits
@@ -2626,14 +2629,20 @@ void CodeGenFunction::EmitAsmStmt(const AsmStmt &S) { SmallVector OutputConstraintInfos; SmallVector InputConstraintInfos; + const FunctionDecl *FD = dyn_cast_or_null(CurCodeDecl); arsenm wrote: I think we should just get rid of d

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/89217

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.ptr.buffer.store` (PR #94576)

2024-06-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/94576

[clang] [llvm] [AMDGPU] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-06-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/92725

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-25 Thread Matt Arsenault via cfe-commits
@@ -311,10 +312,11 @@ void AMDGPUAtomicOptimizerImpl::visitIntrinsicInst(IntrinsicInst &I) { // If the value operand is divergent, each lane is contributing a different // value to the atomic calculation. We can only optimize divergent values if - // we have DPP availabl

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-25 Thread Matt Arsenault via cfe-commits
@@ -228,10 +228,11 @@ void AMDGPUAtomicOptimizerImpl::visitAtomicRMWInst(AtomicRMWInst &I) { // If the value operand is divergent, each lane is contributing a different // value to the atomic calculation. We can only optimize divergent values if - // we have DPP availabl

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-06-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > Incrementing by align is just a bug, of course the size is the real value. > > Whether we want to continue wasting space is another not-correctness > > discussion > > Struct padding is pretty universal, AMDGPU seems the odd one out here. I > wouldn't mind it so much if it di

[clang] [llvm] [LLVM] Fix incorrect alignment on AMDGPU variadics (PR #96370)

2024-06-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > > > Incrementing by align is just a bug, of course the size is the real > > > > value. Whether we want to continue wasting space is another > > > > not-correctness discussion > > > > > > > > > Struct padding is pretty universal, AMDGPU seems the odd one out here. I > > > wo

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-26 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/96738 None >From 0d9ab2bcbaa2b4b11832a8ac1848505cf73f4880 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:40:27 +0200 Subject: [PATCH] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins ---

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-26 Thread Matt Arsenault via cfe-commits
arsenm wrote: * **#96739** https://app.graphite.dev/github/pr/llvm/llvm-project/96739 * **#96738** https://app.graphite.dev/github/pr/llvm/llvm-proj

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-26 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/96738

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-26 Thread Matt Arsenault via cfe-commits
@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) { return Changed; } +static bool shouldOptimize(Type *Ty) { + switch (Ty->getTypeID()) { + case Type::FloatTyID: + case Type::DoubleTyID: +return true; + case Type::IntegerTyID: { +if (Ty->getI

[clang] [llvm] [AMDGPU] Enable atomic optimizer for 64 bit divergent values (PR #96473)

2024-06-26 Thread Matt Arsenault via cfe-commits
@@ -178,6 +178,20 @@ bool AMDGPUAtomicOptimizerImpl::run(Function &F) { return Changed; } +static bool shouldOptimize(Type *Ty) { arsenm wrote: Better name that expresses why this type is handleable. Also in a follow up, really should cover the i16/half/b

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96738 >From 5f614809ac4ffa5e29a01c7e9410d91eadcbe6f2 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:40:27 +0200 Subject: [PATCH 1/2] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins --- c

[clang] clang: Allow targets to set custom metadata on atomics (PR #96906)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/96906 Use this to replace the emission of the amdgpu-unsafe-fp-atomics attribute in favor of per-instruction metadata. In the future new fine grained controls should be introduced that also cover the integer cases. Add

[clang] clang: Allow targets to set custom metadata on atomics (PR #96906)

2024-06-27 Thread Matt Arsenault via cfe-commits
arsenm wrote: * **#96906** https://app.graphite.dev/github/pr/llvm/llvm-project/96906 👈 * `main` This stack of pull requests is managed by Graphite

[clang] clang: Allow targets to set custom metadata on atomics (PR #96906)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/96906 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-27 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Jun 27, 9:27 AM EDT**: @arsenm started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/96738). https://github.com/llvm/llvm-project/pull/96738

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96738 From d16cc8ec8b9ad4780fcaa14a035193ee930cd8fe Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Tue, 11 Jun 2024 10:40:27 +0200 Subject: [PATCH 1/2] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins --- c

[clang] clang/AMDGPU: Use atomicrmw for ds fmin/fmax builtins (PR #96738)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/96738

[clang] [llvm] AMDGPU: Remove ds_fmin/ds_fmax intrinsics (PR #96739)

2024-06-27 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96739

[clang] [CUDA][NFC] CudaArch to GpuArch rename (PR #97028)

2024-06-28 Thread Matt Arsenault via cfe-commits
@@ -52,7 +52,7 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); -enum class CudaArch { +enum class GpuArch { arsenm wrote: Probably should call this OffloadArch to match --offl

[clang] [CUDA][NFC] CudaArch to OffloadArch rename (PR #97028)

2024-06-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/97028

[clang] [libc] [llvm] AMDGPU: Add a subtarget feature for fine-grained remote memory support (PR #96442)

2024-06-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/96442 From 03be6a1847ff90955413d1d488e2619036ffbceb Mon Sep 17 00:00:00 2001 From: martinboehme Date: Wed, 26 Jun 2024 15:01:57 +0200 Subject: [PATCH 01/14] [clang][dataflow] Teach `AnalysisASTVisitor` that `typeid()`

[clang] [clang][CodeGen] Add query for a target's flat address space (PR #95728)

2024-06-28 Thread Matt Arsenault via cfe-commits
arsenm wrote: I still think we should not need this. DefaultIsPrivate is junk that needs to be deleted. Querying for LangAS::Default should always give the answer 0 for AMDGPU, which is what this is working around. This clang notion of address space has nothing to do with your troubles with l

[clang] [llvm] [clang][CodeGen][AMDGPU] Enable AMDGPU `printf` for `spirv64-amd-amdhsa` (PR #97132)

2024-06-28 Thread Matt Arsenault via cfe-commits
@@ -5888,12 +5888,16 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, case Builtin::BI__builtin_printf: case Builtin::BIprintf: if (getTarget().getTriple().isNVPTX() || -getTarget().getTriple().isAMDGCN()) { +getTarget

[clang] [llvm] [SPIRV][RFC] Rework / extend support for memory scopes (PR #106429)

2024-09-13 Thread Matt Arsenault via cfe-commits
@@ -251,6 +251,24 @@ SPIRV::MemorySemantics::MemorySemantics getMemSemantics(AtomicOrdering Ord) { llvm_unreachable(nullptr); } +SPIRV::Scope::Scope getMemScope(const LLVMContext &Ctx, SyncScope::ID ID) { + SmallVector SSNs; + Ctx.getSyncScopeNames(SSNs); + + StringRef M

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-13 Thread Matt Arsenault via cfe-commits
@@ -686,6 +686,20 @@ static Value *EmitSignBit(CodeGenFunction &CGF, Value *V) { return CGF.Builder.CreateICmpSLT(V, Zero); } +static bool hasPointerArgsOrPointerReturnType(const Value *V) { + if (const CallBase *CB = dyn_cast<CallBase>(V)) { +for (const Value *A : CB->args()) {

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-13 Thread Matt Arsenault via cfe-commits
@@ -699,9 +713,12 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, bool ConstWithoutErrnoAndExceptions = Context.BuiltinInfo.isConstWithoutErrnoAndExceptions(BuiltinID); // Restrict to target with errno, for example, MacOS doesn't

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-13 Thread Matt Arsenault via cfe-commits
@@ -686,6 +686,20 @@ static Value *EmitSignBit(CodeGenFunction &CGF, Value *V) { return CGF.Builder.CreateICmpSLT(V, Zero); } +static bool hasPointerArgsOrPointerReturnType(const Value *V) { + if (const CallBase *CB = dyn_cast<CallBase>(V)) { +for (const Value *A : CB->args()) {

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,31 @@ +// RUN: %clang_cc1 %s -O3 -fmath-errno -emit-llvm -triple x86_64-unknown-unknown -o - %s | FileCheck %s -check-prefixes=CHECK +// RUN: %clang_cc1 %s -O3 -fmath-errno -emit-llvm -triple x86_64-pc-win64 -o - %s | FileCheck %s -check-prefixes=CHECK +// RUN: %clan

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-15 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,225 @@ +//===- AMDGPUMCResourceInfo.cpp --- MC Resource Info --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-15 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,225 @@ +//===- AMDGPUMCResourceInfo.cpp --- MC Resource Info --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-15 Thread Matt Arsenault via cfe-commits
@@ -40,12 +42,19 @@ class AMDGPUAsmPrinter final : public AsmPrinter { AMDGPUResourceUsageAnalysis *ResourceUsage; + MCResourceInfo RI; + SIProgramInfo CurrentProgramInfo; std::unique_ptr HSAMetadataStream; MCCodeEmitter *DumpCodeInstEmitter = nullptr; + //

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-09-15 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/92809
