[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -2922,18 +2922,19 @@ static void emitUsed(CodeGenModule &CGM, StringRef Name, if (List.empty()) return; + llvm::Type *UsedPtrTy = llvm::PointerType::getUnqual(CGM.getLLVMContext()); + // Convert List to what ConstantArray needs. SmallVector UsedArray; UsedAr

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -2922,18 +2922,19 @@ static void emitUsed(CodeGenModule &CGM, StringRef Name, if (List.empty()) return; + llvm::Type *UsedPtrTy = llvm::PointerType::getUnqual(CGM.getLLVMContext()); arsenm wrote: Best to just use get(Ctx, 0) https://github.com/llv

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: Commit message also needs to be updated https://github.com/llvm/llvm-project/pull/93601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] Global constructors/destructors are globals (PR #93914)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Is this redundant with #93601? https://github.com/llvm/llvm-project/pull/93914 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] `used` globals are fake (PR #93601)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -8642,8 +8642,11 @@ The '``llvm.used``' Global Variable The ``@llvm.used`` global is an array which has :ref:`appending linkage `. This array contains a list of pointers to named global variables, functions and aliases which may optionally -have a pointer cast formed of bitc

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -47,6 +47,10 @@ static std::string convertToString(double d, unsigned Prec, unsigned Pad, return std::string(Buffer.data(), Buffer.size()); } +static bool hasNanOrInf(APFloat::Semantics S) { + return (S != APFloat::S_Float6E3M2FN) && (S != APFloat::S_Float6E2M3FN); +} -

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -1881,6 +1890,20 @@ TEST(APFloatTest, getSmallest) { EXPECT_TRUE(test.isFiniteNonZero()); EXPECT_TRUE(test.isDenormal()); EXPECT_TRUE(test.bitwiseIsEqual(expected)); + + test = APFloat::getSmallest(APFloat::Float6E3M2FN(), false); + expected = APFloat(APFloat::Float6

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -205,7 +205,7 @@ class ToolChain { /// Executes the given \p Executable and returns the stdout. llvm::Expected> - executeToolChainProgram(StringRef Executable) const; + executeToolChainProgram(StringRef Executable, unsigned Timeout = 0) const; arsenm

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/94751 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -878,6 +896,10 @@ void IEEEFloat::copySignificand(const IEEEFloat &rhs) { for the significand. If double or longer, this is a signalling NaN, which may not be ideal. If float, this is QNaN(0). */ void IEEEFloat::makeNaN(bool SNaN, bool Negative, const APInt *fill) {

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -68,6 +68,10 @@ enum class fltNonfiniteBehavior { // `fltNanEncoding` enum. We treat all NaNs as quiet, as the available // encodings do not distinguish between signalling and quiet NaN. NanOnly, + + // This behavior is present in Float6E3M2FN and Float6E2M3FN types.

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Actually, even ignoring address space 7, it feels like these builtins if you > could `raw.ptr.buffer.store` any type you liked, and then they could be > type-varying in Clang? We could either have a builtin for all the types that would work, or if we want to treat them more li

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > 1. For the swizzled case, that's `struct.ptr.buffer.*`, and yeah, those will > always need builtins because LLVM can't deal in 2D addressing schemes But the raw buffer intrinsics have both the soffset and voffset parameters though? Not just the struct https://github.com/llv

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > 2. What I mean is that "types that work" isn't the right framing: any type > can be legalized to one or more types that work. That is, down in the isel > legalizer, if I call for, for example >```llvm >%0 = call {i64, i64, i8} @llvm.amdgcn.raw.buffer.ptr.load(ptr addrspa

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > `voffset` and `soffset` are "offset that goes in VGPRs" and "offset that goes > in SGPRs", with the latter having some different bounds-checking semantics on > ... at least some of the gfx9's, IIRC. > Right, that's the problem. We need to know the parameters of the SRD in orde

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > "aggregates" here might even be unusual cases like `<4 x i8>` Vectors aren't aggregates and are more reasonable https://github.com/llvm/llvm-project/pull/94576 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -15874,6 +15874,96 @@ The returned value is completely identical to the input except for the sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and payload are perfectly preserved. +.. _i_fminmax_family: + +'``llvm.min.*``' Intrinsics Comparation

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ + +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -15874,6 +15874,96 @@ The returned value is completely identical to the input except for the sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and payload are perfectly preserved. +.. _i_fminmax_family: + +'``llvm.min.*``' Intrinsics Comparation

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ + +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ + +

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -1091,6 +1091,9 @@ enum PredefinedTypeIDs { // \brief WebAssembly reference types with auto numeration #define WASM_TYPE(Name, Id, SingletonId) PREDEF_TYPE_##Id##_ID, #include "clang/Basic/WebAssemblyReferenceTypes.def" +// \breif AMDGPU types with auto numeration --

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Need stacked PR that adds the make_buffer_rsrc builtin that shows its use https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailm

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -2200,6 +2206,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { Align = 8; \ break; #include "clang/Basic/WebAssemblyReferenceTypes.def" +case BuiltinType::AMDGPUBufferRsrc: + W

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -fclang-abi-compat=latest -triple amdgcn %s -emit-llvm -o - | FileCheck %s arsenm wrote: Why do you need -fclang-abi-compat=latest https://github.com/llvm/llvm-project/pull/94830 ___

[clang] [llvm] [Offload][CUDA] Add initial cuda_runtime.h overlay (PR #94821)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,30 @@ +// RUN: %clang++ -foffload-via-llvm --offload-arch=native %s -o %t +// RUN: %t | %fcheck-generic + +// UNSUPPORTED: aarch64-unknown-linux-gnu +// UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +// UNSUPPORTED: x86_64-pc-linux-gnu +// UNSUPPORTED: x86_64-pc-linux-gnu-

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-10 Thread Matt Arsenault via cfe-commits
@@ -2200,6 +2206,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { Align = 8; \ break; #include "clang/Basic/WebAssemblyReferenceTypes.def" +case BuiltinType::AMDGPUBufferRsrc: + W

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-10 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -fsyntax-only -verify -triple amdgcn -Wno-unused-value %s + +void foo() { + int n = 100; + __buffer_rsrc_t v = 0; // expected-error {{cannot initialize a variable of type '__buffer_rsrc_t' with an rvalu

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-10 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -fsyntax-only -verify -triple amdgcn -Wno-unused-value %s + +void foo() { + int n = 100; + __buffer_rsrc_t v = 0; // expected-error {{cannot initialize a variable of type '__buffer_rsrc_t' with an rvalu

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-10 Thread Matt Arsenault via cfe-commits
@@ -2201,6 +2207,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { Align = 8; \ break; #include "clang/Basic/WebAssemblyReferenceTypes.def" +case BuiltinType::AMDGPUBufferRsrc: + W

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,46 @@ +# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck %s arsenm wrote: You should not need to introduce any new machine verifier tests, they are not useful. The useful test would be the IR

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,46 @@ +# RUN: not --crash llc -mtriple=amdgcn -run-pass=none -verify-machineinstrs -o /dev/null %s 2>&1 | FileCheck %s arsenm wrote: I'd still test all 3, but yes an IR test https://github.com/llvm/llvm-project/pull/89217 ___

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-12 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Or drop the new nodes altogether and legelaize to intrinsics directly ? That's another option. The only real plus to the intermediate is it's slightly less annoying to write combines for. But there are limited combining opportunities for these https://github.com/llvm/llvm-p

[clang] [llvm] [clang][Driver] Add HIPAMD Driver support for AMDGCN flavoured SPIR-V (PR #95061)

2024-06-12 Thread Matt Arsenault via cfe-commits
@@ -128,12 +128,13 @@ enum class CudaArch { GFX12_GENERIC, GFX1200, GFX1201, + AMDGCNSPIRV, Generic, // A processor model named 'generic' if the target backend defines a // public one. LAST, CudaDefault = CudaArch::SM_52, - HIPDefault = CudaArch::

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,14 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple amdgcn-unknown

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,95 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu verde -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I understand the chance of conflict is low. It may be like the chance of > hitting by a meteor. However, if we prefix with `__amdgcn_`, there is no such > risk. And we have the benefit to clearly indicate it is a amdgcn > target-specific type. Should use amdgpu https://githu

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,95 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -cl-std=CL2.0 -target-cpu verde -emit-llvm -o - %s | FileCheck %s +// RUN: %clang_cc1 -triple

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Just a note - and maybe this was already discussed above - is there good > reason not to explicitly make this type a 128-bit scalar? The LLVM data > layout already does this I thought this was the 160 bit version? Can we have an opaque-but-sized type? The concern is exposing

[clang] [llvm] [Offload] Introduce the concept of "default streams" (PR #95371)

2024-06-13 Thread Matt Arsenault via cfe-commits
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA, CmdArgs.push_back("__clang_openmp_device_functions.h"); } + if (Args.hasArg(options::OPT_foffload_via_llvm)) { +// Add llvm_wrappers/* to our system include path. This

[clang] [llvm] [Offload] Introduce the concept of "default streams" (PR #95371)

2024-06-13 Thread Matt Arsenault via cfe-commits
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA, CmdArgs.push_back("__clang_openmp_device_functions.h"); } + if (Args.hasArg(options::OPT_foffload_via_llvm)) { +// Add llvm_wrappers/* to our system include path. This

[clang] [clang-tools-extra] [compiler-rt] [flang] [libc] [lld] [lldb] [llvm] [mlir] [openmp] [llvm-project] Fix typo "seperate" (PR #95373)

2024-06-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/95373 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,69 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + // REQUIRES: amdgpu-registered-target + // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s + // RUN: %clang_cc1 -triple amdgcn-unkn

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,65 @@ +; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s + +; CHECK-LABEL: name:basic_readfirstlane_i64 +; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -6129,13 +6150,55 @@ static SDValue lowerLaneOp(const SITargetLowering &TLI, SDNode *N, if (ValSize % 32 != 0) return SDValue(); + auto unrollLaneOp = [&DAG, &SL](SDNode *N) -> SDValue { +EVT VT = N->getValueType(0); +unsigned NE = VT.getVectorNumElements();

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,65 @@ +; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s + +; CHECK-LABEL: name:basic_readfirstlane_i64 +; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,84 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature + // REQUIRES: amdgpu-registered-target + // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s + // RUN

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,65 @@ +; RUN: llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx1100 -verify-machineinstrs -o - %s | FileCheck --check-prefixes=CHECK,ISEL %s + +; CHECK-LABEL: name:basic_readfirstlane_i64 +; CHECK:[[TOKEN:%[0-9]+]]{{[^ ]*}} = CONVERGENCECTRL

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-14 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] clang/AMDGPU: Emit atomicrmw from ds_fadd builtins (PR #95395)

2024-06-15 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95395 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,86 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm -o - | FileCheck --check-prefix=OPENCL12 %s +// RUN: %clang_cc1 %s -O0 -triple amdg

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,86 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm -o - | FileCheck --check-prefix=OPENCL12 %s +// RUN: %clang_cc1 %s -O0 -triple amdg

[clang] [clang][CodeGen] Add query for a target's flat address space (PR #95728)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo, return 0; } + /// \returns Target specific flat ptr address space; a flat ptr is a ptr that + /// can be casted to / from all other target address spaces. If the target + /// exposes no such add

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,17 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -fsyntax-only -verify -std=gnu++11 -triple amdgcn -Wno-unused-value %s + arsenm wrote: We probably want another similar sema test for OpenCL/HIP/OpenMP https://github.com/llvm/llvm-pro

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ + arsenm wrote: Extra blank line https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,84 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature + // REQUIRES: amdgpu-registered-target + // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s + // RUN

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,84 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature + // REQUIRES: amdgpu-registered-target + // RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s + // RUN

[clang] [Clang] [WIP] Added builtin_alloca support for OpenCL1.2 and below (PR #95750)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,86 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 %s -O0 -triple amdgcn-amd-amdhsa -cl-std=CL1.2 -emit-llvm -o - | FileCheck --check-prefix=OPENCL12 %s +// RUN: %clang_cc1 %s -O0 -triple amdg

[clang] [clang][CodeGen] Add query for a target's flat address space (PR #95728)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -1764,6 +1764,13 @@ class TargetInfo : public TransferrableTargetInfo, return 0; } + /// \returns Target specific flat ptr address space; a flat ptr is a ptr that + /// can be casted to / from all other target address spaces. If the target + /// exposes no such add

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-17 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. There are quite a few code quality regressions, and XFAILed tests. The description needs more elaboration on what the strategy is here https://github.com/llvm/llvm-project/pull/92809 _

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/92809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -15740,6 +15740,32 @@ void SITargetLowering::finalizeLowering(MachineFunction &MF) const { } } + // ISel inserts copy to regs for the successor PHIs + // at the BB end. We need to move the SI_WAVE_RECONVERGE right before the + // branch. + for (auto &MBB : MF) {

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -3172,8 +3172,8 @@ def int_amdgcn_loop : Intrinsic<[llvm_i1_ty], [llvm_anyint_ty], [IntrWillReturn, IntrNoCallback, IntrNoFree] >; -def int_amdgcn_end_cf : Intrinsic<[], [llvm_anyint_ty], - [IntrWillReturn, IntrNoCallback, IntrNoFree]>; +def int_amdgcn_wave_reconverge :

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -1,3 +1,4 @@ +; XFAIL: * arsenm wrote: can't just xfail tests https://github.com/llvm/llvm-project/pull/92809 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -2103,12 +2103,36 @@ bool SIInstrInfo::expandPostRAPseudo(MachineInstr &MI) const { MI.setDesc(get(AMDGPU::S_MOV_B64)); break; + case AMDGPU::S_CMOV_B64_term: +// This is only a terminator to get the correct spill code placement during +// register allocat

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -15740,6 +15740,32 @@ void SITargetLowering::finalizeLowering(MachineFunction &MF) const { } } + // ISel inserts copy to regs for the successor PHIs + // at the BB end. We need to move the SI_WAVE_RECONVERGE right before the arsenm wrote: Can you

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1 @@ +remark: :0:0: removing function 'needs_extimg': +extended-image-insts is not supported on the current target arsenm wrote: accidentally added file? https://github.com/llvm/llvm-project/pull/92809 ___ c

[clang] [llvm] [AMDGPU] Change CF intrinsics lowering to reconverge on predecessors. (PR #92809)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -305,43 +304,43 @@ bool SIAnnotateControlFlow::handleLoop(BranchInst *Term) { } /// Close the last opened control flow -bool SIAnnotateControlFlow::closeControlFlow(BasicBlock *BB) { - llvm::Loop *L = LI->getLoopFor(BB); +bool SIAnnotateControlFlow::tryWaveReconverge(Basic

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -verify -triple amdgcn-amd-amdhsa -Wno-unused-value %s arsenm wrote: set explicit -cl-std, and check 1.2 and 2.0? https://github.com/llvm/llvm-project/pull/94830 ___

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-17 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. LGTM but I'm not a frontend expert https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commit

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -19082,6 +19082,15 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::amdgcn_s_sendmsg_rtn, {ResultType}); return Builder.CreateCall(F, {Arg}); } + case AMDGPU::BI__builtin_amdgcn_make_buffer_rsrc: { +llvm::Va

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-17 Thread Matt Arsenault via cfe-commits
@@ -33,6 +33,7 @@ // q -> Scalable vector, followed by the number of elements and the base type. // Q -> target builtin type, followed by a character to distinguish the builtin type //Qa -> AArch64 svcount_t builtin type. +//Qb -> AMDGPU __amdgpu_buffer_rsrc_t builti

[clang] [llvm] clang/AMDGPU: Emit atomicrmw from ds_fadd builtins (PR #95395)

2024-06-18 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/95395 >From 35c741fe2563094bc20c179ee9f244620025405c Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 10 Jun 2024 19:40:59 +0200 Subject: [PATCH] clang/AMDGPU: Emit atomicrmw from ds_fadd builtins We should hav

[clang] [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (PR #95276)

2024-06-18 Thread Matt Arsenault via cfe-commits
@@ -33,6 +33,7 @@ // q -> Scalable vector, followed by the number of elements and the base type. // Q -> target builtin type, followed by a character to distinguish the builtin type //Qa -> AArch64 svcount_t builtin type. +//Qb -> AMDGPU __amdgpu_buffer_rsrc_t builti

[clang] [llvm] clang/AMDGPU: Emit atomicrmw from ds_fadd builtins (PR #95395)

2024-06-18 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/95395 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)

2024-06-18 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/95396 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] AMDGPU: Remove ds atomic fadd intrinsics (PR #95396)

2024-06-18 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/95396 >From f0f8e09caff2df5632d4252ca354b24c0c6f0e87 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 10 Jun 2024 19:48:13 +0200 Subject: [PATCH] AMDGPU: Remove ds atomic fadd intrinsics These have been replace

[clang] Enable ASAN in amdgpu toolchain for OpenCL (PR #96262)

2024-06-21 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/96262 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Enable ASAN in amdgpu toolchain for OpenCL (PR #96262)

2024-06-21 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/96262 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Enable ASAN in amdgpu toolchain for OpenCL (PR #96262)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -169,6 +180,11 @@ // COMMON-UNSAFE-MATH-SAME: "-mlink-builtin-bitcode" "{{.*}}/amdgcn/bitcode/oclc_finite_only_off.bc" // COMMON-UNSAFE-MATH-SAME: "-mlink-builtin-bitcode" "{{.*}}/amdgcn/bitcode/oclc_correctly_rounded_sqrt_off.bc" +// ASAN-SAME: "-fsanitize=address" + +//

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: I'm wondering if we should really have all the different typed variants, and if this should be the name. I guess https://github.com/llvm/llvm-project/pull/94576 ___ cfe-commits mailing list cfe-commits@lists.llvm.o

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc") BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc") +BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n") --

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94576 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -626,6 +626,18 @@ static Value *emitQuaternaryBuiltin(CodeGenFunction &CGF, const CallExpr *E, return CGF.Builder.CreateCall(F, {Src0, Src1, Src2, Src3}); } +static Value *emitQuinaryBuiltin(CodeGenFunction &CGF, const CallExpr *E, arsenm wrote: The nam

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc") BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc") +BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n") --

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -149,6 +149,19 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc") BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc") +BUILTIN(__builtin_amdgcn_raw_ptr_buffer_store_i8, "vcQbiiIi", "n") --

[clang] [Clang] Replace `emitXXXBuiltin` with a unified interface (PR #96313)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -581,49 +581,19 @@ static Value *emitCallMaybeConstrainedFPBuiltin(CodeGenFunction &CGF, return CGF.Builder.CreateCall(F, Args); } -// Emit a simple mangled intrinsic that has 1 argument and a return type -// matching the argument type. -static Value *emitUnaryBuiltin(

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -149,6 +149,12 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc") BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc") +BUILTIN(__builtin_amdgcn_raw_buffer_store_b8, "vcQbiiIi", "n") +BUILT

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-21 Thread Matt Arsenault via cfe-commits
@@ -149,6 +149,12 @@ BUILTIN(__builtin_amdgcn_mqsad_pk_u16_u8, "WUiWUiUiWUi", "nc") BUILTIN(__builtin_amdgcn_mqsad_u32_u8, "V4UiWUiUiV4Ui", "nc") BUILTIN(__builtin_amdgcn_make_buffer_rsrc, "Qbv*sii", "nc") +BUILTIN(__builtin_amdgcn_raw_buffer_store_b8, "vcQbiiIi", "n") +BUILT

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext &C, IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy = llvm::PointerType::get(LL

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext &C, IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy = llvm::PointerType::get(LL

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/88182 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/88182 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext &C, IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy = llvm::PointerType::get(LL

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -2047,9 +2047,9 @@ void CodeGenModule::EmitCtorList(CtorList &Fns, const char *GlobalName) { llvm::Type *CtorPFTy = llvm::PointerType::get(CtorFTy, TheModule.getDataLayout().getProgramAddressSpace()); - // Get the type of a ctor entry, { i32, void ()*, i8* }.

<    5   6   7   8   9   10   11   12   13   14   >