[libcxx] [llvm] [clang-tools-extra] [flang] [lld] [compiler-rt] [lldb] [libc] [clang] [Legalizer] Expand fmaximum and fminimum (PR #67301)

2024-01-08 Thread Matt Arsenault via cfe-commits
@@ -8310,6 +8310,64 @@ SDValue TargetLowering::expandFMINNUM_FMAXNUM(SDNode *Node, return SDValue(); } +SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N, +SelectionDAG &DAG) const { + SDLoc DL(N); + SDValue LHS = N-

[lldb] [clang] [libcxx] [lld] [compiler-rt] [libc] [clang-tools-extra] [llvm] [flang] [Legalizer] Expand fmaximum and fminimum (PR #67301)

2024-01-08 Thread Matt Arsenault via cfe-commits
@@ -8262,6 +8262,64 @@ SDValue TargetLowering::expandFMINNUM_FMAXNUM(SDNode *Node, return SDValue(); } +SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N, +SelectionDAG &DAG) const { + SDLoc DL(N); + SDValue LHS = N-

[llvm] [clang-tools-extra] [clang] [libcxx] [libc] [lldb] [compiler-rt] [flang] [lld] [Legalizer] Expand fmaximum and fminimum (PR #67301)

2024-01-08 Thread Matt Arsenault via cfe-commits
@@ -8310,6 +8310,64 @@ SDValue TargetLowering::expandFMINNUM_FMAXNUM(SDNode *Node, return SDValue(); } +SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N, +SelectionDAG &DAG) const { + SDLoc DL(N); + SDValue LHS = N-
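
The `fmaximum`/`fminimum` operations named in the entries above differ from `fmaxnum`/`fminnum` in that they return NaN when either operand is NaN and order -0.0 below +0.0. The following is a minimal C++ sketch of those reference semantics only; it is not the expansion code from PR #67301, and the helper names `fmaximum_ref` and `fminimum_ref` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical reference helpers (assumed names, not taken from the PR).
// maximum: NaN if either input is NaN; +0.0 is treated as greater than -0.0.
inline double fmaximum_ref(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  if (a == 0.0 && b == 0.0)          // both zeros, possibly of different sign
    return std::signbit(a) ? b : a;  // prefer the non-negative zero
  return a > b ? a : b;
}

// minimum: NaN if either input is NaN; -0.0 is treated as less than +0.0.
inline double fminimum_ref(double a, double b) {
  if (std::isnan(a) || std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? a : b;  // prefer the negative zero
  return a < b ? a : b;
}
```

A plain `a > b ? a : b` select gets neither the NaN case nor the signed-zero case right, which is presumably why a legalizer expansion needs extra steps beyond a compare and select.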

[clang] [Driver][LTO] Copy fix empty stats filename to AMDGPU, HIPAMD, MinGW (PR #74178)

2024-01-09 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. Last comments requested changes https://github.com/llvm/llvm-project/pull/74178 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/c

[llvm] [clang] [AMDGPU][GFX12] Add 16 bit atomic fadd instructions (PR #75917)

2024-01-09 Thread Matt Arsenault via cfe-commits
@@ -1368,6 +1391,28 @@ def int_amdgcn_struct_ptr_buffer_atomic_cmpswap : Intrinsic< // gfx908 intrinsic def int_amdgcn_struct_buffer_atomic_fadd : AMDGPUStructBufferAtomic; def int_amdgcn_struct_ptr_buffer_atomic_fadd : AMDGPUStructPtrBufferAtomic; +// gfx12 intrinsic +def i

[clang] [CLANG] Add warning when INF or NAN are used in a binary operation or as function argument in fast math mode. (PR #76873)

2024-01-09 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,245 @@ +// RUN: %clang_cc1 -x c++ -verify -triple powerpc64le-unknown-unknown %s \ +// RUN: -menable-no-infs -menable-no-nans -DFAST=1 + +// RUN: %clang_cc1 -x c++ -verify -triple powerpc64le-unknown-unknown %s \ +// RUN: -DNOFAST=1 + +// RUN: %clang_cc1 -x c++ -verify

[libcxx] [clang] [libclc] [libcxxabi] [lld] [flang] [clang-tools-extra] [libunwind] [llvm] [lldb] [libc] [compiler-rt] [Legalizer] Soften EXTRACT_ELEMENT on ppcf128 (PR #77412)

2024-01-09 Thread Matt Arsenault via cfe-commits
@@ -68,8 +68,18 @@ define dso_local zeroext i32 @func(double noundef %0, double noundef %1) #0 { ; CHECK-LABEL: __adddf3 } +; To check ppc_fp128 soften without crash +define zeroext i1 @ppcf128_soften(ppc_fp128 %a) #0 { +entry: + %fpclass = tail call i1 @llvm.is.fpcla

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #75647)

2024-01-09 Thread Matt Arsenault via cfe-commits
arsenm wrote: ping @krzysz00 https://github.com/llvm/llvm-project/pull/75647 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[libcxxabi] [libunwind] [flang] [lldb] [clang-tools-extra] [libcxx] [libc] [llvm] [lld] [clang] [compiler-rt] [libclc] [Legalizer] Soften EXTRACT_ELEMENT on ppcf128 (PR #77412)

2024-01-10 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/77412

[lldb] [llvm] [compiler-rt] [libcxx] [libclc] [clang-tools-extra] [libc] [clang] [libunwind] [flang] [libcxxabi] [lld] [Legalizer] Soften EXTRACT_ELEMENT on ppcf128 (PR #77412)

2024-01-10 Thread Matt Arsenault via cfe-commits
@@ -1,58 +1,305 @@ -; RUN: llc -verify-machineinstrs -mtriple=powerpc-unknown-linux-gnu -O0 < %s | FileCheck %s -; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu -O0 < %s | FileCheck %s -; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-

[llvm] [clang] [flang] [clang-tools-extra] [libcxxabi] [libclc] [lld] [libunwind] [compiler-rt] [libcxx] [libc] [lldb] [Legalizer] Soften EXTRACT_ELEMENT on ppcf128 (PR #77412)

2024-01-10 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/77412

[openmp] [llvm] [libc] [lldb] [clang] [flang] [libcxx] [lld] [compiler-rt] [clang-tools-extra] [PGO][OpenMP] Instrumentation for GPU devices (PR #76587)

2024-01-10 Thread Matt Arsenault via cfe-commits
@@ -959,8 +959,12 @@ void CodeGenPGO::emitCounterIncrement(CGBuilderTy &Builder, const Stmt *S, unsigned Counter = (*RegionCounterMap)[S]; - llvm::Value *Args[] = {FuncNameVar, - Builder.getInt64(FunctionHash), + // Make sure that pointer to globa

[clang] c5b36ab - AMDGPU/clang: Remove dead code

2022-08-04 Thread Matt Arsenault via cfe-commits
Author: Matt Arsenault Date: 2022-08-04T19:02:56-04:00 New Revision: c5b36ab1d6a667554bf369c34e51d02add039d16 URL: https://github.com/llvm/llvm-project/commit/c5b36ab1d6a667554bf369c34e51d02add039d16 DIFF: https://github.com/llvm/llvm-project/commit/c5b36ab1d6a667554bf369c34e51d02add039d16.diff

[clang] a1303b2 - clang/AMDGPU: Define macro for -munsafe-fp-atomics

2022-04-14 Thread Matt Arsenault via cfe-commits
Author: Matt Arsenault Date: 2022-04-14T22:04:59-04:00 New Revision: a1303b23c9de6ef6d667aa923ec266ca4a0334e7 URL: https://github.com/llvm/llvm-project/commit/a1303b23c9de6ef6d667aa923ec266ca4a0334e7 DIFF: https://github.com/llvm/llvm-project/commit/a1303b23c9de6ef6d667aa923ec266ca4a0334e7.diff

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-14 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. Outstanding comments https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-03-15 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm I agree that the default should be assuming fine-grained is possible. > My thinking behind the original naming and direction was not wanting to > introduce an unexpected performance regression by default. I'm happy for both > to be changed, and this patch being rebased

[clang] [llvm] [clang][HLSL][SPRI-V] Add convergence intrinsics (PR #80680)

2024-03-15 Thread Matt Arsenault via cfe-commits
@@ -1295,11 +1295,13 @@ double4 trunc(double4); /// true, across all active lanes in the current wave. _HLSL_AVAILABIL

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-03-15 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Will the metadata for unsafe-fp-atomics also be controlled by the pragma that > controls the no-fine-grained and no-remote metadata? e.g. something like > > ``` > #pragma clang atomics begin no-fine-grained(on) no-remote(on) unsafe-fp(on) > ``` Yes, I would expect something lik

[clang] [llvm] [AMDGPU] Enable OpenCL hostcall printf (WIP) (PR #72556)

2024-03-15 Thread Matt Arsenault via cfe-commits
@@ -2550,6 +2550,11 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, &getTarget().getLongDoubleFormat() == &llvm::APFloat::IEEEquad()) BuiltinID = mutateLongDoubleBuiltin(BuiltinID); + // Mutate the printf builtin ID so that we us

[clang] [llvm] [AMDGPU] Enable OpenCL hostcall printf (WIP) (PR #72556)

2024-03-15 Thread Matt Arsenault via cfe-commits
@@ -3616,6 +3617,12 @@ unsigned FunctionDecl::getBuiltinID(bool ConsiderWrapperFunctions) const { if (!ConsiderWrapperFunctions && getStorageClass() == SC_Static) return 0; + // AMDGCN implementation supports printf as a builtin + // for OpenCL + if (Context.getTarge

[clang] [llvm] [AMDGPU] Enable OpenCL hostcall printf (WIP) (PR #72556)

2024-03-15 Thread Matt Arsenault via cfe-commits
@@ -3616,6 +3617,12 @@ unsigned FunctionDecl::getBuiltinID(bool ConsiderWrapperFunctions) const { if (!ConsiderWrapperFunctions && getStorageClass() == SC_Static) return 0; + // AMDGCN implementation supports printf as a builtin + // for OpenCL + if (Context.getTarge

[clang] [llvm] [AMDGPU] Add an option to disable unsafe uses of atomic xor (PR #69229)

2024-03-15 Thread Matt Arsenault via cfe-commits
arsenm wrote: > we might document that those are not supported for now. if users really need > them, introducing more controls to support them Sure. I think it's easiest to make progress if we fix the integer cases as a first step https://github.com/llvm/llvm-project/pull/69229

[clang] [llvm] [AMDGPU] Enable OpenCL hostcall printf (WIP) (PR #72556)

2024-03-18 Thread Matt Arsenault via cfe-commits
@@ -3616,6 +3617,12 @@ unsigned FunctionDecl::getBuiltinID(bool ConsiderWrapperFunctions) const { if (!ConsiderWrapperFunctions && getStorageClass() == SC_Static) return 0; + // AMDGCN implementation supports printf as a builtin + // for OpenCL + if (Context.getTarge

[clang] [llvm] [AMDGPU] Enable OpenCL hostcall printf (WIP) (PR #72556)

2024-03-18 Thread Matt Arsenault via cfe-commits
@@ -3616,6 +3617,12 @@ unsigned FunctionDecl::getBuiltinID(bool ConsiderWrapperFunctions) const { if (!ConsiderWrapperFunctions && getStorageClass() == SC_Static) return 0; + // AMDGCN implementation supports printf as a builtin + // for OpenCL + if (Context.getTarge

[clang] [llvm] [CodeGen][LLVM] Make the `va_list` related intrinsics generic. (PR #85460)

2024-03-20 Thread Matt Arsenault via cfe-commits
@@ -700,10 +700,13 @@ class MSBuiltin { //===--- Variable Argument Handling Intrinsics ===// // -def int_vastart : DefaultAttrsIntrinsic<[], [llvm_ptr_ty], [], "llvm.va_start">; -def int_vacopy : DefaultAttrsIntrinsic<[], [llvm_ptr_ty, llvm_ptr_t

[clang] [llvm] [clang][HLSL][SPRI-V] Add convergence intrinsics (PR #80680)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -1295,11 +1295,13 @@ double4 trunc(double4); /// true, across all active lanes in the cur

[clang] [llvm] [clang][HLSL][SPRI-V] Add convergence intrinsics (PR #80680)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -1295,11 +1295,13 @@ double4 trunc(double4); /// true, across all active lanes in the cur

[clang] [llvm] [CodeGen][LLVM] Make the `va_list` related intrinsics generic. (PR #85460)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -700,10 +700,13 @@ class MSBuiltin { //===--- Variable Argument Handling Intrinsics ===// // -def int_vastart : DefaultAttrsIntrinsic<[], [llvm_ptr_ty], [], "llvm.va_start">; -def int_vacopy : DefaultAttrsIntrinsic<[], [llvm_ptr_ty, llvm_ptr_t

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] [compiler-rt] [llvm] Add numerical sanitizer (PR #85916)

2024-03-21 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,2261 @@ +//===-- NumericalStabilitySanitizer.cpp ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: A

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-22 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > I don't think intrinsics are meant for users. Builtins are the user-facing > > front. :-) > > Depending on who you consider an user. Are folks writing MLIR generators > users? They're consumers of an unstable API, changing intrinsics is fine https://github.com/llvm/llvm-pr

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-22 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/86202

[clang] AMDGPU: Rename and add bf16 support for global_load_tr builtins (PR #86202)

2024-03-22 Thread Matt Arsenault via cfe-commits
@@ -432,13 +432,15 @@ TARGET_BUILTIN(__builtin_amdgcn_s_wakeup_barrier, "vi", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_barrier_leave, "b", "n", "gfx12-insts") TARGET_BUILTIN(__builtin_amdgcn_s_get_barrier_state, "Uii", "n", "gfx12-insts") -TARGET_BUILTIN(__builti

[clang] [llvm] [CodeGen][LLVM] Make the `va_list` related intrinsics generic. (PR #85460)

2024-03-22 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/85460

[clang] [llvm] [clang][HLSL][SPRI-V] Add convergence intrinsics (PR #80680)

2024-03-26 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm would you be fine with those codegen changes as-is? Given

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-27 Thread Matt Arsenault via cfe-commits
@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr, LVal.getBaseInfo(), TBAAAccessInfo())); } +static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) { arsenm wrote: The answer should just be false. I see no reason

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-27 Thread Matt Arsenault via cfe-commits
arsenm wrote: > The non-ieee FP types left out as it seems easier if someone working with > that target does this part including test updates, which should be simple > enough by now. Just add the tests https://github.com/llvm/llvm-project/pull/86691

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-27 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I think this case isn't that simple as it is an 80 bit value. Currently that > is loaded atomically first with i128, then stored as a temporary and then > loaded as an fp80. If I remove that casting, the verifier complains "atomic > memory access' operand must have a power-of-t

[libclc] [libclc] Make CMake messages better fit into LLVM (PR #86945)

2024-03-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/86945

[libclc] [libclc] Track dependencies through dependency files (PR #86965)

2024-03-28 Thread Matt Arsenault via cfe-commits
arsenm wrote: The build here seems to be trying to define clc as a language, which then results in needing to rely on language support magic like this. I think it would be better if this did what rocm-device-libs does and treat these as custom targets. I don't think it's particularly helpful t

[libclc] [libclc] Track dependencies through dependency files (PR #86965)

2024-03-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. In the context of what the build is already doing, this should be fine https://github.com/llvm/llvm-project/pull/86965

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-28 Thread Matt Arsenault via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF, } #endif +if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) { + AMDGPU::Waitcnt Wait; + if (ST->hasExtendedWaitCounts()) +Wait = AMDGPU::Waitcnt(0, 0, 0,

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/79236

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-28 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,618 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr, LVal.getBaseInfo(), TBAAAccessInfo())); } +static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) { arsenm wrote: I don't understand the CmpXchg parameter. Also ne

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -1414,13 +1425,11 @@ RValue AtomicInfo::ConvertToValueOrAtomic(llvm::Value *Val, auto *ValTy = AsValue ? CGF.ConvertTypeForMem(ValueTy) : getAtomicAddress().getElementType(); -if (ValTy->isIntegerTy() || (!CastFP && ValTy-

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -1399,13 +1401,22 @@ RValue AtomicInfo::convertAtomicTempToRValue(Address addr, LVal.getBaseInfo(), TBAAAccessInfo())); } +static bool shouldCastToInt(llvm::Type *ValTy, bool CmpXchg) { + bool KeepType = + (ValTy->isIntegerTy() || ValTy->isPointerTy() || +

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -134,14 +134,11 @@ static _Atomic float glob_flt = 0.0f; void force_global_uses(void) { (void)glob_pointer; - // CHECK: %[[LOCAL_INT:.+]] = load atomic i32, ptr @[[GLOB_POINTER]] seq_cst - // CHECK-NEXT: inttoptr i32 %[[LOCAL_INT]] to ptr + // CHECK: load atomic ptr, p

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1413 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-29 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1413 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci

[clang] [ClangFE] Improve handling of casting of atomic memory operations. (PR #86691)

2024-04-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/86691

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. amdgpu parts lgtm (which could be split to a separate change from the ptx change) https://github.com/llvm/llvm-project/pull/78759

[llvm] [clang-tools-extra] [clang] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: ping, I want to get this in and move to remove the flag https://github.com/llvm/llvm-project/pull/74056

[llvm] [clang] Revert "InstCombine: Fold is.fpclass(x, fcInf) to fabs+fcmp" (PR #76338)

2024-02-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: @dtcxzyw are you planning on a codegen patch to improve the backend handling? https://github.com/llvm/llvm-project/pull/76338

[clang] [AMDGPU] Treat printf as builtin for OpenCL (PR #72554)

2024-02-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. Is this redundant with #68515? Do we just need to add OpenCL test coverage? https://github.com/llvm/llvm-project/pull/72554

[llvm] [clang] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm Are you suggesting that these should instead be a range of > minimum/maximum number of workitems globally? That's how all of the other attributes we already have do this. amdgpu-waves-per-eu is a single min, max pair. Same with amdgpu-flat-work-group-size. Although thi

[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > So, alternatively...we could just go with the simplest solution, and use > > "ieee" as the default even under -ffast-math. > +1. There hasn't been a performance reason to use FTZ/DAZ since ~2011. Maybe there's still a power benefit? But in that case you could still explicitl

[clang] Disable FTZ/DAZ when compiling shared libraries by default. (PR #80475)

2024-02-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Do you only set the register for kernel entries? Yes, it's the pre-initialized state. Non-kernels can't be arbitrarily invoked from the host. > Is the attribute ignored for other functions? No, it's an informative attribute about what the mode is. The compiler isn't trying t

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV, bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const { return false; } + +llvm::Constant * +NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM, +

[clang] [NVPTX][AMDGPU][CodeGen] Fix `local_space nullptr` handling for NVPTX and local/private `nullptr` value for AMDGPU. (PR #78759)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -285,6 +289,20 @@ void NVPTXTargetCodeGenInfo::addNVVMMetadata(llvm::GlobalValue *GV, bool NVPTXTargetCodeGenInfo::shouldEmitStaticExternCAliases() const { return false; } + +llvm::Constant * +NVPTXTargetCodeGenInfo::getNullPointer(const CodeGen::CodeGenModule &CGM, +

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/74056 >From 9be777d5b39852cf3c0b2538fd5f712922672caa Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 1 Dec 2023 18:00:13 +0900 Subject: [PATCH 1/2] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass""

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm Can you rebase this patch first? It was already fresh, I just re-merged again with no conflicts https://github.com/llvm/llvm-project/pull/74056

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I don't know why it fails: > > ``` > error: patch failed: llvm/lib/Transforms/InstCombine/InstCombineInternal.h:551 > error: llvm/lib/Transforms/InstCombine/InstCombineInternal.h: patch does not > apply > error: patch failed: > llvm/lib/Transforms/InstCombine/InstCombineSimplif

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V, return MadeChange ? I : nullptr; } + +/// For floating-point classes that resolve to a single bit pattern, return that +/// value. +static Constant *getFPClassConstant(Type *Ty, FPClassTe
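
The doc comment in the diff above says that floating-point classes which resolve to a single bit pattern can be replaced by that value. A minimal standalone C++ sketch of the idea follows; the enum `FPClassSketch`, its bit values, and the function `singleValueForClass` are illustrative assumptions, not LLVM's `FPClassTest` or the PR's `getFPClassConstant`.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical stand-in for a floating-point class mask. The names and bit
// values are assumptions for illustration, not LLVM's FPClassTest.
enum FPClassSketch : unsigned {
  kNegInf  = 1u << 0,
  kNegZero = 1u << 1,
  kPosZero = 1u << 2,
  kPosInf  = 1u << 3,
  kNormal  = 1u << 4, // covers many bit patterns
  kNan     = 1u << 5, // covers many bit patterns (payload, sign)
};

// Return the unique value when the mask names exactly one class that has a
// single bit pattern; return std::nullopt when more than one value is possible.
inline std::optional<double> singleValueForClass(unsigned mask) {
  const double inf = std::numeric_limits<double>::infinity();
  switch (mask) {
  case kPosZero: return 0.0;
  case kNegZero: return -0.0;
  case kPosInf:  return inf;
  case kNegInf:  return -inf;
  default:       return std::nullopt; // unions, normals, NaNs: no unique value
  }
}
```

Only the two zeros and the two infinities qualify here: any union of classes, and any class such as normals or NaNs, admits more than one bit pattern, so the sketch declines to pick a constant for them.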

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V, return MadeChange ? I : nullptr; } + +/// For floating-point classes that resolve to a single bit pattern, return that +/// value. +static Constant *getFPClassConstant(Type *Ty, FPClassTe

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V, return MadeChange ? I : nullptr; } + +/// For floating-point classes that resolve to a single bit pattern, return that +/// value. +static Constant *getFPClassConstant(Type *Ty, FPClassTe

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V, return MadeChange ? I : nullptr; } + +/// For floating-point classes that resolve to a single bit pattern, return that +/// value. +static Constant *getFPClassConstant(Type *Ty, FPClassTe

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -1877,3 +1877,139 @@ Value *InstCombinerImpl::SimplifyDemandedVectorElts(Value *V, return MadeChange ? I : nullptr; } + +/// For floating-point classes that resolve to a single bit pattern, return that +/// value. +static Constant *getFPClassConstant(Type *Ty, FPClassTe

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,589 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: -p --function-signature +; RUN: opt -S --passes=expand-variadics < %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,589 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: -p --function-signature +; RUN: opt -S --passes=expand-variadics < %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,698 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,589 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: -p --function-signature +; RUN: opt -S --passes=expand-variadics < %s | FileCheck %s +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,117 @@ +// RUN: %clang_cc1 -triple i386-unknown-linux-gnu -Wno-varargs -O1 -disable-llvm-passes -emit-llvm -o - %s | opt --passes=instcombine | opt -passes="expand-variadics,default" -S | FileCheck %s --check-prefixes=CHECK,X86Linux arsenm wrote: ca

[clang] [llvm] [transforms] Inline simple variadic functions (PR #81058)

2024-02-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,273 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: x86-registered-target +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-cpu x86-64-v4 -std=c23 -O1 -ffreestanding -emit-llvm -o - %s | FileCheck %s + +// Th

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/74056 From 9be777d5b39852cf3c0b2538fd5f712922672caa Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 1 Dec 2023 18:00:13 +0900 Subject: [PATCH 1/4] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass""

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/74056

[clang] [llvm] InstCombine: Enable SimplifyDemandedUseFPClass and remove flag (PR #81108)

2024-02-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/81108 This completes the unrevert of ef388334ee5a3584255b9ef5b3fefdb244fa3fd7. From 7b5b50597e13c647ec70beab35dcc9b643bff42f Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Thu, 8 Feb 2024 14:15:33 +0530 Subject:

[clang] [llvm] Reapply "InstCombine: Introduce SimplifyDemandedUseFPClass"" (PR #74056)

2024-02-08 Thread Matt Arsenault via cfe-commits
arsenm wrote: Next piece in #81108 https://github.com/llvm/llvm-project/pull/74056

[clang] [llvm] [RFC][WIP][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -1562,8 +1562,9 @@ bool IRTranslator::translateBitCast(const User &U, bool IRTranslator::translateCast(unsigned Opcode, const User &U, MachineIRBuilder &MIRBuilder) { - if (U.getType()->getScalarType()->isBFloatTy() || - U.getOperand(0

[clang] [llvm] [RFC][WIP][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,8 @@ +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1100 -show-encoding %s | FileCheck %s +// RUN: llvm-mc -arch=amdgcn -mcpu=gfx1200 -show-encoding %s | FileCheck %s + +v_dot2_bf16_bf16 v5, v1, v2, 100.0 arsenm wrote: does this help with #79369 at all? http

[clang] [llvm] [RFC][WIP][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -1562,8 +1562,9 @@ bool IRTranslator::translateBitCast(const User &U, bool IRTranslator::translateCast(unsigned Opcode, const User &U, MachineIRBuilder &MIRBuilder) { - if (U.getType()->getScalarType()->isBFloatTy() || - U.getOperand(0

[clang] [llvm] [RFC][WIP][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -2835,8 +2835,8 @@ def int_amdgcn_fdot2_f32_bf16 : DefaultAttrsIntrinsic< [llvm_float_ty], // %r [ - llvm_v2i16_ty, // %a - llvm_v2i16_ty, // %b + llvm_v2bf16_ty, // %a + llvm_v2bf16_ty, // %b arsenm wrote: For potential revert

[clang] [llvm] [RFC][WIP][AMDGPU] Use `bf16` instead of `i16` for bfloat (PR #80908)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -2819,11 +2819,11 @@ def int_amdgcn_fdot2_f16_f16 : def int_amdgcn_fdot2_bf16_bf16 : ClangBuiltin<"__builtin_amdgcn_fdot2_bf16_bf16">, DefaultAttrsIntrinsic< -[llvm_i16_ty], // %r +[llvm_bfloat_ty], // %r arsenm wrote: Changing the clang bui

[clang] [llvm] [AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (PR #76955)

2024-02-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: I think this needs codegen tests for the gfx900 vs. gfx906 mad_mix/fma_mix issue https://github.com/llvm/llvm-project/pull/76955

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-08 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,199 @@ +; Testing the -amdgpu-precise-memory-op option +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op -verify-machineinstrs < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op -v
