[clang] [clang][docs] Revise documentation for `__builtin_reduce_(max|min)`. (PR #114637)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -745,12 +745,10 @@ Let ``VT`` be a vector type and ``ET`` the element type of ``VT``. === == == Name

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/113628 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ +// REQUIRES: system-windows arsenm wrote: This doesn't require windows, use an explicit triple https://github.com/llvm/llvm-project/pull/113628

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -425,6 +425,12 @@ MSVCToolChain::MSVCToolChain(const Driver &D, const llvm::Triple &Triple, const ArgList &Args) : ToolChain(D, Triple, Args), CudaInstallation(D, Triple, Args), RocmInstallation(D, Triple, Args) { + + // Tell the ROCm

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ +// REQUIRES: system-linux arsenm wrote: This doesn't require linux, use an explicit triple https://github.com/llvm/llvm-project/pull/113628

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ +// REQUIRES: system-windows arsenm wrote: Also add a clang-cl test? https://github.com/llvm/llvm-project/pull/113628

[clang] Remove Linux search paths on Windows (PR #113628)

2024-11-13 Thread Matt Arsenault via cfe-commits
arsenm wrote: Commit title should be more specific and remove the Gerrit id https://github.com/llvm/llvm-project/pull/113628

[libclc] [libclc] Move min/max/clamp into the CLC builtins library (PR #114386)

2024-10-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/114386

[libclc] [libclc] Move min/max/clamp into the CLC builtins library (PR #114386)

2024-10-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,11 @@ +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_max(__CLC_GENTYPE a, + __CLC_GENTYPE b) { + return (a > b ? a : b); +} + +#ifndef __CLC_SCALAR +_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_max(__CLC_GENTYPE a, +

[libclc] [libclc] Move min/max/clamp into the CLC builtins library (PR #114386)

2024-10-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. Moving them seems fine but these should probably just be deleted. __clc_min is equivalent to __builtin_elementwise_min etc. I see there's a __builtin_hlsl_elementwise_clamp, but given the language prefix it may be more abusive

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-10-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: Mechanically, this pass can be replaced with trivial handling of the intrinsic in AMDGPUInstCombineIntrinsic; we don't need a new module pass. As inserted into the pipeline here, this does not have any advantage over handling it directly in instcombine. > We could just turn this

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-10-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. We do not want or need a new pass to handle this. This is not a fix to the structural issue of wavesize. The problem is there is no such thing as a "no wavesize" IR. There is only wave32 or wave64. Querying the target gives the

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-10-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,49 @@ +//===- AMDGPUExpandPseudoIntrinsics.cpp - Pseudo Intrinsic Expander Pass --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-10-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,49 @@ +//===- AMDGPUExpandPseudoIntrinsics.cpp - Pseudo Intrinsic Expander Pass --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] [llvm] [opt][AMDGPU] Add pass to handle AMDGCN pseudo-intrinsics target specific info), start with `llvm.amdgcn.wavefrontsize` (PR #114481)

2024-10-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: Just adding this to the pass pipeline where it is is no better than just doing it in instcombine, which is the natural place to do this. This patch, like instcombine, still has the problem that we don't know if we're producing the final code. https://github.com/llvm/llvm-project

[libclc] [libclc] Create aliases with custom_command (PR #115885)

2024-11-12 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/115885

[libclc] [libclc] Move sign to the CLC builtins library (PR #115699)

2024-11-12 Thread Matt Arsenault via cfe-commits
@@ -322,22 +322,26 @@ foreach( t ${LIBCLC_TARGETS_TO_BUILD} ) if ( ARCH STREQUAL spirv OR ARCH STREQUAL spirv64 ) set( build_flags -O0 -finline-hint-functions -DCLC_SPIRV ) - set( opt_flags ) + set( clc_opt_flags ) + # Inline CLC functions into OpenCL

[libclc] [libclc] Use builtin_convertvector to convert between vector types (PR #115865)

2024-11-12 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/115865

[clang] [flang] [llvm] [mlir] Make MMIWP not have ownership over MMI + Make MMI Only Use an External MCContext (PR #105541)

2024-09-22 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @aeubanks @arsenm after looking into this in more detail, I realized that the > `getContext` method of `MMI` is heavily used in the `AsmPrinter` to create > symbols. Also not having it makes it harder for the `MMI` to create machine > functions using `getOrCreateMachineFunction

[clang] [llvm] [Support] Add scaling support in `indent` (PR #109478)

2024-09-20 Thread Matt Arsenault via cfe-commits
@@ -774,18 +774,27 @@ class buffer_unique_ostream : public raw_svector_ostream { // you can use // OS << indent(6) << "more stuff"; // which has better ergonomics (and clang-formats better as well). +// +// If indentation is always in increments of a fixed value, you can use Sc
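The comment in this hunk describes streaming an indentation helper, with a variant for indentation that always grows in fixed increments. A minimal standalone sketch of that idea follows. It uses `std::ostream` instead of LLVM's `raw_ostream`, and the names `Indent`, `levels`, `scale`, and `render` are hypothetical; this is an illustration under those assumptions, not the code from the PR.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an indent helper: prints levels * scale spaces.
struct Indent {
  unsigned levels;
  unsigned scale;
  explicit Indent(unsigned levels, unsigned scale = 1)
      : levels(levels), scale(scale) {}
  // Going n levels deeper keeps the same scale.
  Indent operator+(unsigned n) const { return Indent(levels + n, scale); }
};

// Streaming an Indent emits the scaled number of spaces.
inline std::ostream &operator<<(std::ostream &os, const Indent &in) {
  return os << std::string(static_cast<std::size_t>(in.levels) * in.scale, ' ');
}

// Renders "<indent>text" into a string, for demonstration only.
inline std::string render(const Indent &in, const std::string &text) {
  std::ostringstream os;
  os << in << text;
  return os.str();
}
```

With a scale of 4, `render(Indent(2, 4), "x")` yields eight spaces followed by `x`, so callers count nesting levels rather than columns.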

[clang] [llvm] [Support] Add scaling support in `indent` (PR #109478)

2024-09-20 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/109478

[libclc] [libclc] use default paths with find_program when possible (PR #105969)

2024-09-23 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Nixpkgs has no intention of moving away from standalone builds. I encourage you to acquire that intention. IMO libclc should not support the standalone build, and this should be version locked to the exact compiler commit. It's compiler data, not a real library https://github

[libclc] [libclc] use default paths with find_program when possible (PR #105969)

2024-09-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/105969

[libclc] [libclc] use default paths with find_program when possible (PR #105969)

2024-09-23 Thread Matt Arsenault via cfe-commits
@@ -55,7 +55,7 @@ if( LIBCLC_STANDALONE_BUILD OR CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DI # Import required tools if( NOT EXISTS ${LIBCLC_CUSTOM_LLVM_TOOLS_BINARY_DIR} ) foreach( tool IN ITEMS clang llvm-as llvm-link opt ) - find_program( LLVM_TOOL_${tool

[libclc] [libclc] use default paths with find_program when possible (PR #105969)

2024-09-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: The nix build should probably migrate to using the non-standalone build https://github.com/llvm/llvm-project/pull/105969

[libclc] [libclc] use default paths with find_program when possible (PR #105969)

2024-09-23 Thread Matt Arsenault via cfe-commits
arsenm wrote: > So it should be built along with the core of LLVM? Also, we package LLVM per > version per subproject. Yes, it should be built along with the core (but doesn't need to ship in the same package as the core). https://github.com/llvm/llvm-project/pull/105969

[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)

2024-09-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/108853

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm closed https://github.com/llvm/llvm-project/pull/107598

[clang] Don't emit int TBAA metadata on more complex FP math libcalls. (PR #107598)

2024-09-24 Thread Matt Arsenault via cfe-commits
arsenm wrote: Superseded by #108853 https://github.com/llvm/llvm-project/pull/107598

[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)

2024-09-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. LGTM, but like I mentioned on #107598, it would be good if there was a test that requires the argument check, and the return check isn't sufficient https://github.com/llvm/llvm-project/pull/108853

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-24 Thread Matt Arsenault via cfe-commits
arsenm wrote: > If we already have per-function metadata, I'm wondering how difficult it > would be to put this handling in the linker. AFAIK there's already handling > for `call-graph-profile` which can inform the linker of the call-graph, so we > could potentially just walk that graph, find

[clang] [clang] Use std::optional::value_or (NFC) (PR #109894)

2024-09-24 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/109894
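For readers unfamiliar with the refactoring named in this PR title, the following is a small sketch of the general pattern: replacing an explicit "test, then dereference or fall back" with `std::optional::value_or`. The function names and the fallback value 2 are hypothetical and are not taken from the PR.

```cpp
#include <optional>
#include <string>

// Hypothetical lookup: accepts a single decimal digit, otherwise no value.
inline std::optional<int> parseLevel(const std::string &s) {
  if (s.size() != 1 || s[0] < '0' || s[0] > '9')
    return std::nullopt;
  return s[0] - '0';
}

// Verbose form: test the optional, then dereference or fall back.
inline int levelVerbose(const std::string &s) {
  std::optional<int> v = parseLevel(s);
  return v ? *v : 2;
}

// Equivalent form using value_or.
inline int levelConcise(const std::string &s) {
  return parseLevel(s).value_or(2);
}
```

Note that `value_or` evaluates its argument eagerly, so the two forms are interchangeable only when computing the fallback is cheap and has no side effects.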

[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-09-26 Thread Matt Arsenault via cfe-commits
@@ -706,6 +706,12 @@ Unless specified otherwise operation(±0) = ±0 and operation(±infinity) = ±in representable values for the signed/unsigned integer type. T __builtin_elementwise_sub_sat(T x, T y) return the difference of x and
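The documentation hunk above mentions saturating subtraction clamped to the representable range of the integer type. A scalar sketch of that behavior in portable C++ follows. The names `subSatI32` and `subSatU32` are hypothetical; the Clang builtin itself operates on vector and scalar types and is not used here.

```cpp
#include <cstdint>
#include <limits>

// Signed 32-bit saturating subtraction: the result is clamped to
// [INT32_MIN, INT32_MAX] instead of wrapping or overflowing.
inline std::int32_t subSatI32(std::int32_t x, std::int32_t y) {
  // Compute in 64 bits so the intermediate difference cannot overflow.
  const std::int64_t wide =
      static_cast<std::int64_t>(x) - static_cast<std::int64_t>(y);
  const std::int64_t lo = std::numeric_limits<std::int32_t>::min();
  const std::int64_t hi = std::numeric_limits<std::int32_t>::max();
  if (wide < lo)
    return static_cast<std::int32_t>(lo);
  if (wide > hi)
    return static_cast<std::int32_t>(hi);
  return static_cast<std::int32_t>(wide);
}

// Unsigned 32-bit saturating subtraction: clamps at zero.
inline std::uint32_t subSatU32(std::uint32_t x, std::uint32_t y) {
  return x > y ? x - y : 0u;
}
```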

[clang] [cuda][HIP] `__constant__` should imply constant (PR #110182)

2024-09-26 Thread Matt Arsenault via cfe-commits
arsenm wrote: If it's not legal for it to be marked as constant, it's also not legal to use constant address space https://github.com/llvm/llvm-project/pull/110182

[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)

2024-09-18 Thread Matt Arsenault via cfe-commits
@@ -690,23 +690,46 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, const CallExpr *E, llvm::Constant *calleeValue) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); CGCallee callee = CGCallee::forDirect(call

[clang] [lldb] [llvm] [mlir] [APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (PR #80309)

2024-09-20 Thread Matt Arsenault via cfe-commits
@@ -4377,7 +4377,7 @@ AMDGPUInstructionSelector::selectGlobalSAddr(MachineOperand &Root) const { // instructions to perform VALU adds with immediates or inline literals. unsigned NumLiterals = !TII.isInlineConstant(APInt(32, ConstOffset & 0xfff

[clang] [lldb] [llvm] [mlir] [APInt] Fix APInt constructions where value does not fit bitwidth (NFCI) (PR #80309)

2024-09-20 Thread Matt Arsenault via cfe-commits
@@ -1806,7 +1806,7 @@ bool AMDGPUDAGToDAGISel::SelectGlobalSAddr(SDNode *N, // instructions to perform VALU adds with immediates or inline literals. unsigned NumLiterals = !TII->isInlineConstant(APInt(32, COffsetVal & 0xffffffff)) + - !TII->isInli
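Both hunks in this thread appear to mask a 64-bit offset down to its low 32 bits before building a 32-bit-wide value, which matches the PR's theme of passing only values that fit the stated bit width. The underlying arithmetic can be sketched in plain C++ as below. `Halves`, `splitOffset`, and `fitsUnsigned32` are hypothetical names, and `llvm::APInt` is deliberately not used.

```cpp
#include <cstdint>

// The two 32-bit halves of a 64-bit offset.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Reinterpret the offset as unsigned bits, then mask and shift so that
// each half is guaranteed to fit in 32 bits.
inline Halves splitOffset(std::int64_t offset) {
  const auto bits = static_cast<std::uint64_t>(offset);
  return {static_cast<std::uint32_t>(bits & 0xffffffffu),
          static_cast<std::uint32_t>(bits >> 32)};
}

// True when an unsigned value is representable in 32 bits.
inline bool fitsUnsigned32(std::uint64_t v) { return v <= 0xffffffffu; }
```

Masking before construction makes the truncation explicit at the call site rather than relying on the constructor to drop the high bits silently.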

[clang] [flang] [llvm] [mlir] Make MMIWP not have ownership over MMI + Make MMI Only Use an External MCContext (PR #105541)

2024-09-20 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @aeubanks It's not impossible to separate them completely. `MCContext` is > needed during initialization and finalization of the > `MachineModuleInfoWrapperPass` (and its new pass manager variant) to set the > diagnostics handler. > > In theory, you can just pass the context t

[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)

2024-09-30 Thread Matt Arsenault via cfe-commits
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x < ret void } +declare { float, float } @fmf_struct_f32() +declare { double, double } @fmf_struct_f64() +declare { <4 x double>, <4 x double> } @fmf_struct_v4f64() + +; CHEC

[clang] [flang] [llvm] [mlir] Make Ownership of MachineModuleInfo in Its Wrapper Pass External (PR #110443)

2024-09-30 Thread Matt Arsenault via cfe-commits
arsenm wrote: > * Move the MC emission functions in `TargetMachine` to `LLVMTargetMachine`. > With the changes in this PR, we explicitly assume in both > `addPassesToEmitFile` and `addPassesToEmitMC` that the `TargetMachine` is an > `LLVMTargetMachine`; Hence it does not make sense for these f

[clang] [llvm] Implement operand bundles for floating-point operations (PR #109798)

2024-09-30 Thread Matt Arsenault via cfe-commits
arsenm wrote: > With the constrained intrinsics the default is safe because optimizations > don't recognize the constrained intrinsic and thus don't know how to optimize > it. If we instead rely on the strictfp attribute then we'll need possibly > thousands of checks for this attribute, we'll

[clang] [llvm] [mlir] [LLVM][TableGen] Change SearchableTableEmitter to use const RecordKeeper (PR #110032)

2024-09-30 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/110032

[clang] [llvm] [mlir] [LLVM][TableGen] Change SearchableTableEmitter to use const RecordKeeper (PR #110032)

2024-09-30 Thread Matt Arsenault via cfe-commits
@@ -1556,7 +1557,7 @@ class RecordVal { bool IsUsed = false; /// Reference locations to this record value. - SmallVector ReferenceLocs; + mutable SmallVector ReferenceLocs; arsenm wrote: Is this removed in later patches? https://github.com/llvm/llvm-p

[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-09-30 Thread Matt Arsenault via cfe-commits
@@ -273,6 +273,74 @@ void test_builtin_elementwise_min(int i, short s, double d, float4 v, int3 iv, u // expected-error@-1 {{1st argument must be a vector, integer or floating point type (was '_Complex float')}} } +void test_builtin_elementwise_maximum(int i, short s, floa
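The tests above exercise the new `maximum` builtin. Assuming it follows IEEE 754-2019 `maximum` semantics, as the name suggests, the per-element behavior differs from `std::fmax` in two ways: a NaN operand propagates, and +0.0 is treated as greater than -0.0. A scalar sketch under that assumption follows; `maximumSketch` is a hypothetical name and not part of Clang.

```cpp
#include <cmath>

// Scalar sketch of IEEE 754-2019 "maximum" for float.
inline float maximumSketch(float a, float b) {
  // Unlike std::fmax, a NaN in either operand is returned as the result.
  if (std::isnan(a))
    return a;
  if (std::isnan(b))
    return b;
  // Equal operands include the +0.0 / -0.0 pair, which compare equal;
  // prefer the operand whose sign bit is clear.
  if (a == b)
    return std::signbit(a) ? b : a;
  return a > b ? a : b;
}
```

By contrast, `std::fmax(NAN, 1.0f)` returns `1.0f`, which is why a separate operation is useful when NaNs must not be silently dropped.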

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -1,56 +0,0 @@ -; This test aims to check ability to support "Arithmetic with Overflow" intrinsics arsenm wrote: This one is testing codegenprepare as part of the normal codegen pipeline, so this one is fine. The other case was a full optimization pipeline +

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -54,14 +54,14 @@ static std::string computeDataLayout(const Triple &TT) { // memory model used for graphics: PhysicalStorageBuffer64. But it shouldn't // mean anything. if (Arch == Triple::spirv32) -return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-" -

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -1,56 +0,0 @@ -; This test aims to check ability to support "Arithmetic with Overflow" intrinsics arsenm wrote: Not sure what the problem is with this test, but it's already covered by another? https://github.com/llvm/llvm-project/pull/110695

[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)

2024-10-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/110506

[clang] [Clang] Add __builtin_(elementwise|reduce)_(max|min)imum (PR #110198)

2024-10-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/110198

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -1,56 +0,0 @@ -; This test aims to check ability to support "Arithmetic with Overflow" intrinsics arsenm wrote: That is not the nature of this kind of test https://github.com/llvm/llvm-project/pull/110695

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -54,14 +54,14 @@ static std::string computeDataLayout(const Triple &TT) { // memory model used for graphics: PhysicalStorageBuffer64. But it shouldn't // mean anything. if (Arch == Triple::spirv32) -return "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-" -

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-10-01 Thread Matt Arsenault via cfe-commits
arsenm wrote: > > I would like to avoid adding additional special properties to AS0, or > > defining the flat concept. > > How can we add a new specification w/o defining it? By not defining it in terms of flat addressing. Just make it the undesirable address space https://github.com/llvm/ll

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -66,12 +66,12 @@ NVPTXTargetInfo::NVPTXTargetInfo(const llvm::Triple &Triple, HasFloat16 = true; if (TargetPointerWidth == 32) -resetDataLayout("e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32:64"); +resetDataLayout("e-p:32:32-i64:64-i128:128-v16:16-v32:32-n16:32

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > There are targets that use a different integer to denote flat (e.g. see SPIR > & SPIR-V). Whilst I know that there are objections to that, the fact remains > that they had historical reason (wanted to make legacy OCL convention that > the default is private work, and given that

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Just to clarify, does this mean any two non-flat address space pointers > _cannot_ alias? This should change nothing about aliasing. The IR assumption is any address space may alias any other https://github.com/llvm/llvm-project/pull/108786

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -579,7 +579,7 @@ static StringRef computeDataLayout(const Triple &TT) { "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-" "v32:32-v48:64-v96:" "128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-" - "G

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Both in InferAddressSpaces, and in Attributor, you don't really care about > whether a flat address-space exists. Right, this is more of an undesirable address space. Optimizations don't need to know anything about its behavior beyond that. > In reply to your question above

[clang] [llvm] Implement operand bundles for floating-point operations (PR #109798)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -357,6 +357,9 @@ class IRBuilderBase { void setConstrainedFPCallAttr(CallBase *I) { I->addFnAttr(Attribute::StrictFP); +MemoryEffects ME = MemoryEffects::inaccessibleMemOnly(); arsenm wrote: It shouldn't be necessary to touch the attributes. The

[clang] [llvm] Implement operand bundles for floating-point operations (PR #109798)

2024-09-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: Also it's silly that we need to do bitcode autoupgrade of "experimental" intrinsics, but x86 started shipping with strictfp enabled in production before they graduated. We might as well drop the experimental bit then https://github.com/llvm/llvm-project/pull/109798

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -78,15 +78,15 @@ void MCResourceInfo::finalize(MCContext &OutContext) { } MCSymbol *MCResourceInfo::getMaxVGPRSymbol(MCContext &OutContext) { - return OutContext.getOrCreateSymbol("max_num_vgpr"); + return OutContext.getOrCreateSymbol("amdgcn.max_num_vgpr"); -

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) && arsenm wrote: 5->3 is an illegal address space cast, bu

[clang] [llvm] Implement operand bundles for floating-point operations (PR #109798)

2024-09-25 Thread Matt Arsenault via cfe-commits
arsenm wrote: > If we can't keep the constrained semantics and near-100% guarantee that no > new exceptions will be introduced then operand bundles are not a replacement > for the constrained intrinsics. We would still need a call / function attribute to indicate strictfp calls, and such call

[clang] [llvm] [AMDGPU] Convert AMDGPUResourceUsageAnalysis pass from Module to MF pass (PR #102913)

2024-09-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. I think we need more thought about how the ABI for this will work, but we need to start somewhere https://github.com/llvm/llvm-project/pull/102913

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-25 Thread Matt Arsenault via cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) && arsenm wrote: Simple example, where the cast is still d

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94647

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/110695

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
arsenm wrote: > 1. Usually (or at least AFAIK) optimization passes won't consider datalayout > automatically, The datalayout is a widely used global constant. There's no option of "not considering it" > Do you plan to go over LLVM passes adding this check? There's nothing new to do here. T

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-01 Thread Matt Arsenault via cfe-commits
@@ -1,56 +0,0 @@ -; This test aims to check ability to support "Arithmetic with Overflow" intrinsics arsenm wrote: > Right but it's relying on a non-guaranteed maybe-optimisation firing, as far > as I can tell. The point is to test the optimization does work.

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,187 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [mlir] [TableGen] Change `DefInit::Def` to a const Record pointer (PR #110747)

2024-10-02 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/110747

[clang] [llvm] [mlir] [TableGen] Change `DefInit::Def` to a const Record pointer (PR #110747)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -1660,7 +1660,7 @@ class Record { // this record. SmallVector Locs; SmallVector ForwardDeclarationLocs; - SmallVector ReferenceLocs; + mutable SmallVector ReferenceLocs; arsenm wrote: You have the const_cast on the addition, so this is unnecessary?

[clang] [flang] [llvm] [mlir] Make Ownership of MachineModuleInfo in Its Wrapper Pass External (PR #110443)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,102 @@ +//===-- LLVMTargetMachineC.cpp ===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [clang][llvm][SPIR-V] Explicitly encode native integer widths for SPIR-V (PR #110695)

2024-10-02 Thread Matt Arsenault via cfe-commits
@@ -1,56 +0,0 @@ -; This test aims to check ability to support "Arithmetic with Overflow" intrinsics arsenm wrote: The codegen prepare behavior is still backend code to be tested. You can just run codegenprepare as a standalone pass too (usually would have sepa

[clang] clang/AMDGPU: Restore O3 checks in default-attributes.hip (PR #115238)

2024-11-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm created https://github.com/llvm/llvm-project/pull/115238 These were dropped in b1bcb7ca460fcd317bbc8309e14c8761bf8394e0 to avoid some bot failures. From 3a5d957b5fe0d36df2273693c7c865c39715d192 Mon Sep 17 00:00:00 2001 From: Matt Arsenault Date: Mon, 15 Jul 2024 11:4

[clang] clang/AMDGPU: Restore O3 checks in default-attributes.hip (PR #115238)

2024-11-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: * **#115238** 👈 * `main` This stack of pull requests is managed by Graphi

[clang] clang/AMDGPU: Restore O3 checks in default-attributes.hip (PR #115238)

2024-11-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm ready_for_review https://github.com/llvm/llvm-project/pull/115238

[clang] [AMDGPU] Make `__GCC_DESTRUCTIVE_SIZE` 128 on AMDGPU (PR #115241)

2024-11-06 Thread Matt Arsenault via cfe-commits
@@ -462,6 +462,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUTargetInfo final : public TargetInfo { } bool hasHIPImageSupport() const override { return HasImage; } + + std::pair<unsigned, unsigned> hardwareInterferenceSizes() const override { +return std::make_pair(128, 128); --
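The PR title refers to `__GCC_DESTRUCTIVE_SIZE`, a macro that recent GCC and Clang predefine to describe the destructive interference (false sharing) granularity of the target. A sketch of how user code might consume it is shown below. The fallback of 64 and the names `kDestructiveSize` and `PaddedCounter` are assumptions made for illustration.

```cpp
#include <cstddef>

// Use the compiler-provided value when available; 64 is only an assumed
// fallback for compilers that do not predefine the macro.
#ifdef __GCC_DESTRUCTIVE_SIZE
constexpr std::size_t kDestructiveSize = __GCC_DESTRUCTIVE_SIZE;
#else
constexpr std::size_t kDestructiveSize = 64;
#endif

// Each counter occupies its own interference-sized block, so writers on
// different threads do not contend for the same cache line.
struct alignas(kDestructiveSize) PaddedCounter {
  unsigned long value = 0;
};
```

On a target where the macro is 128, as the hunk above sets for AMDGPU, `sizeof(PaddedCounter)` would be 128 as well, since the size of a type is a multiple of its alignment.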

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,76 @@ +//===-- gpuintrin.h - Generic GPU intrinsic functions -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apa

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -31,16 +44,118 @@ typedef hipError_t (*hipGetDeviceCount_t)(int *); typedef hipError_t (*hipDeviceGet_t)(int *, int); typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); -int printGPUsByHIP() { +extern cl::opt Verbose; + #ifdef _WIN32 - constexpr const

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -31,16 +44,118 @@ typedef hipError_t (*hipGetDeviceCount_t)(int *); typedef hipError_t (*hipDeviceGet_t)(int *, int); typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); -int printGPUsByHIP() { +extern cl::opt Verbose; + #ifdef _WIN32 - constexpr const

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -31,16 +44,118 @@ typedef hipError_t (*hipGetDeviceCount_t)(int *); typedef hipError_t (*hipDeviceGet_t)(int *, int); typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); -int printGPUsByHIP() { +extern cl::opt Verbose; + #ifdef _WIN32 - constexpr const

[clang] [Clang] Add a flag to include GPU startup files (PR #112025)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -648,6 +648,17 @@ void amdgpu::Linker::ConstructJob(Compilation &C, const JobAction &JA, Args.MakeArgString("-plugin-opt=-mattr=" + llvm::join(Features, ","))); } + if (Args.hasArg(options::OPT_stdlib)) +CmdArgs.append({"-lc", "-lm"}); + if (Args.hasArg(opt

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -31,16 +44,118 @@ typedef hipError_t (*hipGetDeviceCount_t)(int *); typedef hipError_t (*hipDeviceGet_t)(int *, int); typedef hipError_t (*hipGetDeviceProperties_t)(hipDeviceProp_t *, int); -int printGPUsByHIP() { +extern cl::opt Verbose; + #ifdef _WIN32 - constexpr const

[clang] [AMDGPU] Allow overload of __builtin_amdgcn_mov_dpp8 (PR #113610)

2024-10-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/113610

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [Clang] Add a flag to include GPU startup files (PR #112025)

2024-10-25 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/112025

[clang] [libc] [Clang] Implement resource directory headers for common GPU intrinsics (PR #110179)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,154 @@ +//===-- amdgpuintrin.h - AMDPGU intrinsic functions ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [llvm] [BPF] Add load-acquire and store-release instructions under -mcpu=v4 (PR #108636)

2024-10-25 Thread Matt Arsenault via cfe-commits
@@ -703,6 +715,39 @@ SDValue BPFTargetLowering::LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const { return DAG.getNode(BPFISD::SELECT_CC, DL, VTs, Ops); } +SDValue BPFTargetLowering::LowerATOMIC_LOAD(SDValue Op, +SelectionDAG &D

[clang] [OpenCL] Replace a CreatePointerCast call; NFC (PR #112676)

2024-10-17 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/112676

[clang] [llvm] [mlir] [LLVM][TableGen] Change all `Init` pointers to const (PR #112705)

2024-10-17 Thread Matt Arsenault via cfe-commits
arsenm wrote: I think const should always be used in all situations https://github.com/llvm/llvm-project/pull/112705

[clang] [Clang] Improve EmitClangAttrSpellingListIndex (PR #114899)

2024-11-07 Thread Matt Arsenault via cfe-commits
@@ -153,12 +155,33 @@ std::string AttributeCommonInfo::getNormalizedFullName() const { normalizeName(getAttrName(), getScopeName(), getSyntax())); } +const llvm::StringMap ScopeMap = { arsenm wrote: Avoid static constructor? Just make this a sorted lis

[clang] [Clang] Improve EmitClangAttrSpellingListIndex (PR #114899)

2024-11-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/114899

[clang] [clang][CodeGen] `sret` args should always point to the `alloca` AS, so use that (PR #114062)

2024-11-06 Thread Matt Arsenault via cfe-commits
@@ -1780,6 +1780,14 @@ class TargetInfo : public TransferrableTargetInfo, return 0; } + /// \returns Target specific address space for indirect (e.g. sret) arguments. + /// If such an address space exists, it must be convertible to and from the + /// alloca address s

[clang] [llvm] [AMDGPU] Introduce a new generic target `gfx9-4-generic` (PR #115190)

2024-11-06 Thread Matt Arsenault via cfe-commits
@@ -156,6 +157,8 @@ StringRef llvm::AMDGPU::getArchFamilyNameAMDGCN(GPUKind AK) { switch (AK) { case AMDGPU::GK_GFX9_GENERIC: return "gfx9"; + case AMDGPU::GK_GFX9_4_GENERIC: +return "gfx9"; arsenm wrote: I guess it would still be gfx9 (not that
