[clang] [llvm] [llvm][opt][Transforms][SPIR-V] Enable `InferAddressSpaces` for SPIR-V (PR #110897)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -91,6 +97,88 @@ SPIRVTargetMachine::SPIRVTargetMachine(const Target &T, const Triple &TT, setRequiresStructuredCFG(false); } +enum AddressSpace { + Function = storageClassToAddressSpace(SPIRV::StorageClass::Function), + CrossWorkgroup = + storageClassToAddressSpac

[clang] [llvm] [llvm][opt][Transforms][SPIR-V] Enable `InferAddressSpaces` for SPIR-V (PR #110897)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,29 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py arsenm wrote: You don't need to duplicate all of these tests. You just need some basic samples that the target is implemented, the full set is testing pass mechanics whi

[clang] [Clang] Automatically enable `-fconvergent-functions` on GPU targets (PR #111076)

2024-10-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/111076 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [HIP] Replace use of `llvm-mc` with `clang` (PR #112041)

2024-10-11 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/112041

[clang] [HIP] Replace use of `llvm-mc` with `clang` (PR #112041)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -463,10 +463,11 @@ void HIP::constructGenerateObjFileFromHIPFatBinary( Objf << ObjBuffer; - ArgStringList McArgs{"-triple", Args.MakeArgString(HostTriple.normalize()), + ArgStringList McArgs{"-target", Args.MakeArgString(HostTriple.normalize()),

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -372,6 +372,31 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) { // HAS_MAYTRAP: declare float @llvm.experimental.constrained.minnum.f32( // HAS_MAYTRAP: declare x86_fp80 @llvm.experimental.constrained.minnum.f80( + fmaximum_num(*d,*d);

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -15314,6 +15314,32 @@ bool FloatExprEvaluator::VisitCallExpr(const CallExpr *E) { Result = RHS; arsenm wrote: Unrelated, but why is this up here reproducing logic that's already in APFloat? https://github.com/llvm/llvm-project/pull/96281

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -15314,6 +15314,32 @@ bool FloatExprEvaluator::VisitCallExpr(const CallExpr *E) { Result = RHS; return true; } + + case Builtin::BI__builtin_fmaximum_num: + case Builtin::BI__builtin_fmaximum_numf: + case Builtin::BI__builtin_fmaximum_numl: + case Builtin::B
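
For readers unfamiliar with the semantics being constant-folded in this hunk, the following is a minimal sketch of IEEE 754-2019 `maximumNumber` behaviour on plain `double` values. The helper name `maximum_num_sketch` is hypothetical, and this is neither the code in the PR nor APFloat's implementation; it only illustrates the rules (a single NaN operand is ignored, and +0.0 orders above -0.0).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not an LLVM or Clang API): maximumNumber-style
// selection on doubles, written out case by case.
double maximum_num_sketch(double a, double b) {
  // Both operands NaN: the result is a quiet NaN.
  if (std::isnan(a) && std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  // Exactly one operand NaN: return the other (numeric) operand.
  if (std::isnan(a))
    return b;
  if (std::isnan(b))
    return a;
  // Signed zeros compare equal, so order them explicitly: +0.0 > -0.0.
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? b : a;
  return a < b ? b : a;
}

// Small self-check of the cases described above.
void maximum_num_examples() {
  assert(maximum_num_sketch(std::nan(""), 1.0) == 1.0);
  assert(!std::signbit(maximum_num_sketch(-0.0, 0.0)));
}
```

This differs from `fmax` mainly in that the zero ordering is specified rather than left to the implementation.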

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -475,6 +475,12 @@ SYMBOL(fmaxl, None, ) SYMBOL(fmin, None, ) SYMBOL(fminf, None, ) SYMBOL(fminl, None, ) +SYMBOL(fmaximum_num, None, ) arsenm wrote: Not sure what this is for, but this isn't tested? https://github.com/llvm/llvm-project/pull/96281

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -372,6 +372,31 @@ void foo(double *d, float f, float *fp, long double *l, int *i, const char *c) { // HAS_MAYTRAP: declare float @llvm.experimental.constrained.minnum.f32( // HAS_MAYTRAP: declare x86_fp80 @llvm.experimental.constrained.minnum.f80( + fmaximum_num(f,f);

[clang] Clang: Support minimumnum and maximumnum intrinsics (PR #96281)

2024-10-11 Thread Matt Arsenault via cfe-commits
@@ -1295,6 +1295,24 @@ SYMBOL(fminf, None, ) SYMBOL(fminl, std::, ) SYMBOL(fminl, None, ) SYMBOL(fminl, None, ) +SYMBOL(fmaximum_num, std::, ) arsenm wrote: Not sure what this is for, but this isn't tested? https://github.com/llvm/llvm-project/pull/96281

[clang] [HIP] Replace use of `llvm-mc` with `clang` (PR #112041)

2024-10-14 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > I fixed an apparent missing test dependency in > > [1ac6ef5](https://github.com/llvm/llvm-project/commit/1ac6ef5af28b72e534496a9833a2b75a2aba66cc) > > and this commit removed the llvm-mc dependency > > I probably should've remembered to include removing that in this patch. I

[clang] [llvm] [llvm][opt][Transforms][SPIR-V] Enable `InferAddressSpaces` for SPIR-V (PR #110897)

2024-10-14 Thread Matt Arsenault via cfe-commits
@@ -91,6 +97,100 @@ SPIRVTargetMachine::SPIRVTargetMachine(const Target &T, const Triple &TT, setRequiresStructuredCFG(false); } +enum AddressSpace { + Function = storageClassToAddressSpace(SPIRV::StorageClass::Function), + CrossWorkgroup = + storageClassToAddressSpa

[clang] [HIP] Replace use of `llvm-mc` with `clang` (PR #112041)

2024-10-14 Thread Matt Arsenault via cfe-commits
arsenm wrote: I fixed an apparent missing test dependency in 1ac6ef5af28b72e534496a9833a2b75a2aba66cc and this commit removed the llvm-mc dependency https://github.com/llvm/llvm-project/pull/112041

[clang] clang: Remove requires system-linux from some driver tests (PR #111976)

2024-10-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111976 >From 91c2f46274f83603552b12317c2fb87a8633ccc3 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Oct 2024 14:33:32 +0400 Subject: [PATCH 1/7] clang: Remove requires system-linux from some driver tests

[clang] [clang][CodeGen][OpenCL] Fix `alloca` handling & `sret` when compiling for (PR #113930)

2024-10-28 Thread Matt Arsenault via cfe-commits
@@ -108,11 +108,15 @@ RawAddress CodeGenFunction::CreateTempAlloca(llvm::Type *Ty, CharUnits Align, if (AllocaAddr) *AllocaAddr = Alloca; llvm::Value *V = Alloca.getPointer(); + assert((!getLangOpts().OpenCL || + CGM.getTarget().getTargetAddressSpace(getASTAl

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: I agree that in theory a target could want to do something else, but that would be an ABI lowering decision. It doesn't naturally come from a source level type. Supporting such a target would require more work, but given the current state of the world just

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits
@@ -5390,11 +5391,19 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, V->getType()->isIntegerTy()) V = Builder.CreateZExt(V, ArgInfo.getCoerceToType()); -// If the argument doesn't match, perform a bitcast to coerce it. This -

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/114062

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-10-29 Thread Matt Arsenault via cfe-commits
@@ -23,8 +25,10 @@ X Test() // sret argument. // CHECK-CXX98: call void @_ZN1XC1ERKS_( // CHECK-CXX11: call void @_ZN1XC1EOS_( + // CHECK-CXX11-NONZEROALLOCAAS: call void @_ZN1XC1EOS_( arsenm wrote: Can you add more context checks here? https://github.

[clang] need to pass executable name as arg[0] (PR #114067)

2024-10-29 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should fix commit description https://github.com/llvm/llvm-project/pull/114067

[clang] [clang][CodeGen][OpenCL] Fix `alloca` handling & `sret` when compiling for (PR #113930)

2024-10-28 Thread Matt Arsenault via cfe-commits
@@ -108,11 +108,15 @@ RawAddress CodeGenFunction::CreateTempAlloca(llvm::Type *Ty, CharUnits Align, if (AllocaAddr) *AllocaAddr = Alloca; llvm::Value *V = Alloca.getPointer(); + assert((!getLangOpts().OpenCL || + CGM.getTarget().getTargetAddressSpace(getASTAl

[libclc] [libclc] Create an internal 'clc' builtins library (PR #109985)

2024-10-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/109985

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/113610

[clang] [llvm] [TLI] Add support for reallocarray (PR #114818)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -318,6 +318,7 @@ TEST_F(TargetLibraryInfoTest, ValidProto) { "declare void @qsort(i8*, i64, i64, i32 (i8*, i8*)*)\n" "declare i64 @readlink(i8*, i8*, i64)\n" "declare i8* @realloc(i8*, i64)\n" + "declare i8* @reallocarray(i8*, i64, i64)\n"

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,19 @@ +//===--- AtomicOptions.def - Atomic Options database -*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -569,19 +569,21 @@ void AMDGPUTargetCodeGenInfo::setTargetAtomicMetadata( AtomicInst.setMetadata(llvm::LLVMContext::MD_noalias_addrspace, ASRange); } - if (!RMW || !CGF.getTarget().allowAMDGPUUnsafeFPAtomics()) + if (!RMW) return; - // TODO: Introduce new, m

[clang] Add clang atomic control options and attribute (PR #114841)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,30 @@ +// RUN: %clang_cc1 -fsyntax-only -verify %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s +// RUN: %clang_cc1 -fsyntax-only -verify -fcuda-is-device %s \ +// RUN: -fatomic=no_fine_grained_memory:off,no_remote_memory:on,ignore_denormal_mode:on +

[libclc] [libclc] Move clcmacro.h to CLC library. NFC (PR #114845)

2024-11-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/114845

[libclc] [libclc] Move abs/abs_diff to CLC library (PR #114960)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/114960

[clang] [llvm] [LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (PR #112548)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/112548

[clang] [llvm] [Transforms][Utils][PromoteMem2Reg] Propagate nnan flag on par with the nsz flag (PR #114271)

2024-11-05 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I do think this change still makes sense, especially from a consistency point > of view. If SROA sets one of the value-based FMF flags (nsz) then it stands > to reason that it should also set the other two (nnan and ninf). Unless there > is some reason why nsz would be more pro

[clang] [llvm] AMDGPU: Treat uint32_max as the default value for amdgpu-max-num-workgroups (PR #113751)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/113751 >From 6981d5ad80130130d373b8c879a88b7d727b0115 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Sat, 19 Oct 2024 02:39:06 +0400 Subject: [PATCH 1/4] clang/AMDGPU: Emit grid size builtins with range metadata T

[clang] clang/AMDGPU: Emit grid size builtins with range metadata (PR #113038)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/113038

[clang] clang/AMDGPU: Emit grid size builtins with range metadata (PR #113038)

2024-11-05 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I'm not sure what's special about your usage of Graphite, but I've seen other > people use it without triggering these notifications to everyone. This was specifically a reorder. I've seen this in some reorder cases, but not others > I think you need to do it the other way ar

[clang] [llvm] [TLI] Add support for reallocarray (PR #114818)

2024-11-05 Thread Matt Arsenault via cfe-commits
@@ -852,6 +852,7 @@ static void initializeLibCalls(TargetLibraryInfoImpl &TLI, const Triple &T, TLI.setUnavailable(LibFunc_memrchr); TLI.setUnavailable(LibFunc_ntohl); TLI.setUnavailable(LibFunc_ntohs); +TLI.setUnavailable(LibFunc_reallocarray); ---

[clang] [llvm] [PassBuilder] Add `ThinOrFullLTOPhase` to early simplication EP call backs (PR #114547)

2024-11-03 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/114547

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -1024,6 +1024,15 @@ GCNTTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const { } break; } + case Intrinsic::amdgcn_wavefrontsize: { +// TODO: this is a workaround for the pseudo-generic target one gets with no +// specified mcpu, which

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -6,48 +7,78 @@ ; RUN: opt -O3 -S < %s | FileCheck -check-prefix=OPT %s ; RUN: opt -mtriple=amdgcn-- -O3 -S < %s | FileCheck -check-prefix=OPT %s -; RUN: opt -mtriple=amdgcn-- -O3 -mattr=+wavefrontsize32 -S < %s | FileCheck -check-prefix=OPT %s -; RUN: opt -mtriple=amdgcn--

[clang] [llvm] [TLI] Add support for reallocarray (PR #114818)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -318,6 +318,7 @@ TEST_F(TargetLibraryInfoTest, ValidProto) { "declare void @qsort(i8*, i64, i64, i32 (i8*, i8*)*)\n" "declare i64 @readlink(i8*, i8*, i64)\n" "declare i8* @realloc(i8*, i64)\n" + "declare i8* @reallocarray(i8*, i64, i64)\n"

[clang] [llvm] [llvm][AMDGPU] Fold `llvm.amdgcn.wavefrontsize` early (PR #114481)

2024-11-04 Thread Matt Arsenault via cfe-commits
@@ -1,3 +1,4 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5 arsenm wrote: The tests for this should go in test/InstCombine/AMDGPU https://github.com/llvm/llvm-project/pull/114481

[clang] clang: Remove requires system-linux from some driver tests (PR #111976)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111976 >From a53a138fca1de49afb1814b538fcb3f97bb66264 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Oct 2024 14:33:32 +0400 Subject: [PATCH 1/8] clang: Remove requires system-linux from some driver tests

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-11-05 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] clang: Remove requires system-linux from some driver tests (PR #111976)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111976 >From c9b2950be07d7272ece89b8bca6e3e982391a5c3 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Oct 2024 14:33:32 +0400 Subject: [PATCH 1/8] clang: Remove requires system-linux from some driver tests

[libclc] [libclc] Move ceil/fabs/floor/rint/trunc to CLC library (PR #114774)

2024-11-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/114774

[clang] clang/AMDGPU: Emit grid size builtins with range metadata (PR #113038)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/113038

[clang] [llvm] AMDGPU: Treat uint32_max as the default value for amdgpu-max-num-workgroups (PR #113751)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/113751

[clang] [clang][docs] Revise documentation for `__builtin_reduce_(max|min)`. (PR #114637)

2024-11-05 Thread Matt Arsenault via cfe-commits
@@ -745,12 +745,12 @@ Let ``VT`` be a vector type and ``ET`` the element type of ``VT``. === == == Name

[clang] clang: Remove requires system-linux from some driver tests (PR #111976)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111976 >From 947c0732cb8ebff4495a64d9fe7aa79ab3827926 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Oct 2024 14:33:32 +0400 Subject: [PATCH 1/9] clang: Remove requires system-linux from some driver tests

[clang] [clang][docs] Revise documentation for `__builtin_reduce_(max|min)`. (PR #114637)

2024-11-05 Thread Matt Arsenault via cfe-commits
@@ -745,12 +745,8 @@ Let ``VT`` be a vector type and ``ET`` the element type of ``VT``. === == == NameO

[clang] [clang][docs] Revise documentation for `__builtin_reduce_(max|min)`. (PR #114637)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/114637

[clang] clang/AMDGPU: Emit grid size builtins with range metadata (PR #113038)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/113038 >From 2e3964fef9cef4b374a8451367a01c850b70e7e8 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Sat, 19 Oct 2024 02:39:06 +0400 Subject: [PATCH] clang/AMDGPU: Emit grid size builtins with range metadata These

[clang] clang/AMDGPU: Emit grid size builtins with range metadata (PR #113038)

2024-11-05 Thread Matt Arsenault via cfe-commits
arsenm wrote: ### Merge activity * **Nov 5, 3:43 PM EST**: A user started a stack merge that includes this pull request via [Graphite](https://app.graphite.dev/github/pr/llvm/llvm-project/113038). https://github.com/llvm/llvm-project/pull/113038

[clang] [llvm] AMDGPU: Propagate amdgpu-max-num-workgroups attribute (PR #113018)

2024-11-05 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/113018

[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)

2024-10-30 Thread Matt Arsenault via cfe-commits
@@ -722,6 +722,38 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E, return CGF.Builder.CreateExtractValue(Call, 0); } +static void emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID Intrinsi

[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)

2024-10-30 Thread Matt Arsenault via cfe-commits
@@ -722,6 +722,36 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E, return CGF.Builder.CreateExtractValue(Call, 0); } +static void emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID Intrinsi

[clang] [NFC] [clang] Use std::string instead of StringRef to reduce stack usage (PR #114285)

2024-10-30 Thread Matt Arsenault via cfe-commits
arsenm wrote: That sounds like MSVC's problem to solve. Why does the amount of stack size matter? https://github.com/llvm/llvm-project/pull/114285

[clang] [NFC] [clang] Use std::string instead of StringRef to reduce stack usage (PR #114285)

2024-10-30 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: If you really want to work around MSVC's bug, you could change tablegen to emit the string literals as constant globals / StringLiteral https://github.com/llvm/llvm-project/pull/114285
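
The suggestion above can be illustrated with a rough sketch, under the assumption that `std::string_view` is an acceptable stand-in for LLVM's `StringLiteral`. The table name `kFeatureNames`, its contents, and `feature_name` are hypothetical and are not what tablegen emits. The point is only that the strings live in a constant global table, so a function that needs one looks it up instead of building string objects in its own stack frame.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical generated table: constant globals with static storage.
inline constexpr std::array<std::string_view, 3> kFeatureNames = {
    "alpha", "beta", "gamma"};

// Look up a name by index; returns an empty view when out of range.
constexpr std::string_view feature_name(std::size_t idx) {
  return idx < kFeatureNames.size() ? kFeatureNames[idx]
                                    : std::string_view{};
}

// Small self-check of the lookup.
void feature_name_examples() {
  assert(feature_name(0) == "alpha");
  assert(feature_name(3).empty());
}
```

Whether this actually avoids the stack growth discussed in the thread depends on the compiler; the sketch only shows the shape of the alternative.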

[clang] [llvm] [AMDGPU] modify named barrier builtins and intrinsics (PR #114550)

2024-11-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Modify how? There seem to be too many things going on here. Description should say how https://github.com/llvm/llvm-project/pull/114550

[clang] [llvm] [AMDGPU] modify named barrier builtins and intrinsics (PR #114550)

2024-11-01 Thread Matt Arsenault via cfe-commits
@@ -920,6 +920,124 @@ class AMDGPULowerModuleLDS { return KernelToCreatedDynamicLDS; } + static GlobalVariable *uniquifyGVPerKernel(Module &M, GlobalVariable *GV, arsenm wrote: This looks like an unrelated, separate change https://github.com/llvm/llv

[clang] [llvm] [AMDGPU] modify named barrier builtins and intrinsics (PR #114550)

2024-11-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/114550

[clang] [llvm] [AMDGPU] modify named barrier builtins and intrinsics (PR #114550)

2024-11-01 Thread Matt Arsenault via cfe-commits
@@ -920,6 +920,124 @@ class AMDGPULowerModuleLDS { return KernelToCreatedDynamicLDS; } + static GlobalVariable *uniquifyGVPerKernel(Module &M, GlobalVariable *GV, arsenm wrote: This still should be a separate PR https://github.com/llvm/llvm-project/p

[clang] [clang][CodeGen][OpenCL] Fix `alloca` handling & `sret` when compiling for (PR #113930)

2024-10-30 Thread Matt Arsenault via cfe-commits
@@ -108,11 +108,15 @@ RawAddress CodeGenFunction::CreateTempAlloca(llvm::Type *Ty, CharUnits Align, if (AllocaAddr) *AllocaAddr = Alloca; llvm::Value *V = Alloca.getPointer(); + assert((!getLangOpts().OpenCL || + CGM.getTarget().getTargetAddressSpace(getASTAl

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-01 Thread Matt Arsenault via cfe-commits
@@ -1672,10 +1672,11 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) { // Add type for sret argument. if (IRFunctionArgs.hasSRetArg()) { -QualType Ret = FI.getReturnType(); -unsigned AddressSpace = CGM.getTypes().getTargetAddressSpace(Ret); +auto Ad

[clang] [clang][CodeGen][OpenCL] Fix `alloca` handling & `sret` when compiling for (PR #113930)

2024-10-28 Thread Matt Arsenault via cfe-commits
@@ -1648,6 +1648,8 @@ CodeGenTypes::GetFunctionType(const CGFunctionInfo &FI) { // Add type for sret argument. if (IRFunctionArgs.hasSRetArg()) { QualType Ret = FI.getReturnType(); +if (CGM.getLangOpts().OpenCL) + Ret = getContext().getAddrSpaceQualType(Ret, La

[clang] [clang][CodeGen][OpenCL] Fix `alloca` handling & `sret` when compiling for (PR #113930)

2024-10-28 Thread Matt Arsenault via cfe-commits
@@ -108,11 +108,15 @@ RawAddress CodeGenFunction::CreateTempAlloca(llvm::Type *Ty, CharUnits Align, if (AllocaAddr) *AllocaAddr = Alloca; llvm::Value *V = Alloca.getPointer(); + assert((!getLangOpts().OpenCL || + CGM.getTarget().getTargetAddressSpace(getASTAl

[clang] [Clang] Add `-fdefault-generic-addrspace` flag for targeting GPUs (PR #115777)

2024-11-11 Thread Matt Arsenault via cfe-commits
@@ -1579,7 +1579,7 @@ NamedDecl *Sema::getCurFunctionOrMethodDecl() const { } LangAS Sema::getDefaultCXXMethodAddrSpace() const { - if (getLangOpts().OpenCL) + if (getLangOpts().OpenCL || getLangOpts().OpenCLGenericAddressSpace) arsenm wrote: I think this w

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-11 Thread Matt Arsenault via cfe-commits
@@ -5133,6 +5133,132 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, Builder.SetInsertPoint(ContBB); return RValue::get(nullptr); } + case Builtin::BI__scoped_atomic_thread_fence: { +auto ScopeModel = AtomicScopeModel::create(

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -5133,6 +5133,135 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, Builder.SetInsertPoint(ContBB); return RValue::get(nullptr); } + case Builtin::BI__scoped_atomic_thread_fence: { +auto ScopeModel = AtomicScopeModel::create(

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -5133,6 +5133,135 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, Builder.SetInsertPoint(ContBB); return RValue::get(nullptr); } + case Builtin::BI__scoped_atomic_thread_fence: { +auto ScopeModel = AtomicScopeModel::create(

[clang] [Clang] Add support for scoped atomic thread fence (PR #115545)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,179 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 %s -emit-llvm -o - -triple=amdgcn-amd-amdhsa -ffreestanding \ +// RUN: -fvisibility=hidden | FileCheck --check-prefix=AMDGCN %s +//: %clan

[clang] [llvm] [TLI] Add support for reallocarray (PR #114818)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -3224,6 +3224,13 @@ def AllocA : GNULibBuiltin<"stdlib.h"> { let AddBuiltinPrefixedAlias = 1; } +// Available in glibc by default since since 2.29 and in GNU mode before. +def ReallocArray : GNULibBuiltin<"stdlib.h"> { arsenm wrote: The clang builtin cha
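
As background for the builtin being added here, the sketch below shows what `reallocarray(ptr, nmemb, size)` is documented to do: behave like `realloc(ptr, nmemb * size)`, but fail with `ENOMEM` instead of wrapping around when the multiplication overflows. The function name `reallocarray_sketch` is hypothetical, and this is a portable illustration rather than glibc's implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for reallocarray: overflow-checked realloc.
void *reallocarray_sketch(void *ptr, std::size_t nmemb, std::size_t size) {
  // nmemb * size would exceed SIZE_MAX: report failure and leave ptr as is.
  if (size != 0 && nmemb > SIZE_MAX / size) {
    errno = ENOMEM;
    return nullptr;
  }
  return std::realloc(ptr, nmemb * size);
}

// Small self-check: an overflowing request fails and sets errno.
void reallocarray_examples() {
  errno = 0;
  assert(reallocarray_sketch(nullptr, SIZE_MAX, 2) == nullptr);
  assert(errno == ENOMEM);
}
```

On the overflow path the original block is not freed, so the caller still owns `ptr`.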

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -121,6 +121,7 @@ static const OffloadArchToStringMap arch_names[] = { GFX(909), // gfx909 GFX(90a), // gfx90a GFX(90c), // gfx90c +{OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"}, arsenm wrote: Why "9-4" and not "9.4"?

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -121,6 +121,7 @@ static const OffloadArchToStringMap arch_names[] = { GFX(909), // gfx909 GFX(90a), // gfx90a GFX(90c), // gfx90c +{OffloadArch::GFX9_4_GENERIC, "gfx9-4-generic", "compute_amdgcn"}, arsenm wrote: I guess this precedent is

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-08 Thread Matt Arsenault via cfe-commits
@@ -85,6 +87,7 @@ // GFX940: "target-features"="+16-bit-insts,+atomic-buffer-global-pk-add-f16-insts,+atomic-ds-pk-add-16-insts,+atomic-fadd-rtn-insts,+atomic-flat-pk-add-16-insts,+atomic-global-pk-add-bf16-inst,+ci-insts,+dl-insts,+dot1-insts,+dot10-insts,+dot2-insts,+dot3-inst

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/115190 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-08 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Test that the unsupported clang builtins are errors? https://github.com/llvm/llvm-project/pull/115190

[libclc] [libclc] Move sign to the CLC builtins library (PR #115699)

2024-11-11 Thread Matt Arsenault via cfe-commits
@@ -322,22 +322,26 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} ) if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 ) set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV ) - set( opt_flags ) + set( clc_opt_flags ) + # Inline CLC functions into OpenCL

[libclc] [libclc] Move sign to the CLC builtins library (PR #115699)

2024-11-11 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,38 @@ +#include +#include +#include + +#define CLC_SIGN(TYPE, F) \ + _CLC_DEF _CLC_OVERLOAD TYPE __clc_sign(TYPE x) { \ +if (__clc_isnan(x)) {

[libclc] [libclc] Move sign to the CLC builtins library (PR #115699)

2024-11-11 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/115699

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -152,6 +115,44 @@ bool SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall(unsigned BuiltinID, return false; } +bool SemaAMDGPU::CheckMovDPPFunctionCall(CallExpr *TheCall, unsigned NumArgs, arsenm wrote: Start with lowercase https://github.com/llvm/llvm-projec

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-26 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [llvm] [IR] Allow fast math flags on fptrunc and fpext (PR #115894)

2024-11-12 Thread Matt Arsenault via cfe-commits
@@ -42,6 +42,14 @@ entry: %f = fneg float %x ; CHECK: %f_vec = fneg <3 x float> %vec %f_vec = fneg <3 x float> %vec +; CHECK: %g = fpext float %x to double arsenm wrote: Needs bitcode compatibility test https://github.com/llvm/llvm-project/pull/115894

[clang] [llvm] [IR] Allow fast math flags on fptrunc and fpext (PR #115894)

2024-11-12 Thread Matt Arsenault via cfe-commits
@@ -148,6 +172,14 @@ entry: %e = frem nnan float %x, %y ; CHECK: %e_vec = frem nnan ninf <3 x float> %vec, %vec %e_vec = frem ninf nnan <3 x float> %vec, %vec +; CHECK: %f = fpext nnan ninf float %x to double + %f = fpext ninf nnan float %x to double +; CHECK: %f_vec = fp

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-12 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/115190

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-12 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,39 @@ +// REQUIRES: amdgpu-registered-target + +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -target-cpu gfx9-4-generic -verify -emit-llvm -o - %s + +typedef unsigned int uint; + +typedef float v2f __attribute__((ext_vector_type(2))); +typedef float v4f __attribu

[clang] [CUDA][HIP] Fix host/device context in concept (PR #67721)

2024-11-12 Thread Matt Arsenault via cfe-commits
arsenm wrote: Is this still relevant? https://github.com/llvm/llvm-project/pull/67721

[clang] clang: Remove requires system-linux from some driver tests (PR #111976)

2024-11-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm updated https://github.com/llvm/llvm-project/pull/111976 From 3912e952fd6d9ae3f4c3dd9dc6ff8b72eee794db Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Fri, 11 Oct 2024 14:33:32 +0400 Subject: [PATCH 1/9] clang: Remove requires system-linux from some driver tests

[clang] [clang][docs] Revise documentation for `__builtin_reduce_(max|min)`. (PR #114637)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -745,12 +745,10 @@ Let ``VT`` be a vector type and ``ET`` the element type of ``VT``. === == == Name
