[GCC 13 PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-07-19 Thread Andrew Carlotti via Gcc-patches
Updated patch to fix the fp16 intrinsic pragmas, and pushed to master. OK to backport to GCC 13? Many intrinsics currently depend on both an architecture version and a feature, despite the corresponding instructions being available within GCC at lower architecture versions. LLVM has already remo

Re: [GCC 13 PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-07-19 Thread Andrew Carlotti via Gcc-patches
On Wed, Jul 19, 2023 at 07:35:26PM +0100, Ramana Radhakrishnan wrote: > On Wed, Jul 19, 2023 at 5:44 PM Andrew Carlotti via Gcc-patches > wrote: > > > > Updated patch to fix the fp16 intrinsic pragmas, and pushed to master. > > OK to backport to GCC 13? > > >

Re: [GCC 13 PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-07-20 Thread Andrew Carlotti via Gcc-patches
On Thu, Jul 20, 2023 at 09:37:14AM +0200, Richard Biener wrote: > On Thu, Jul 20, 2023 at 8:49 AM Richard Sandiford via Gcc-patches > wrote: > > > > Andrew Carlotti writes: > > > Updated patch to fix the fp16 intrinsic pragmas, and pushed to master. > > > OK to backport to GCC 13? > > > > OK, tha

[PATCH] aarch64: Fix pure/const function attributes for intrinsics

2022-06-30 Thread Andrew Carlotti via Gcc-patches
No testcase for this, since I haven't found a way to turn the incorrect attribute into incorrect codegen. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ * config/aarch64/aarch64-builtins.c (aarch64_get_attributes): Fix choice of pure/const attributes. --- diff --git a/

Re: [PATCH] aarch64: Fix pure/const function attributes for intrinsics

2022-07-01 Thread Andrew Carlotti via Gcc-patches
On Fri, Jul 01, 2022 at 08:42:15AM +0200, Richard Biener wrote: > On Thu, Jun 30, 2022 at 6:04 PM Andrew Carlotti via Gcc-patches > wrote: > > diff --git a/gcc/config/aarch64/aarch64-builtins.cc > > b/gcc/config/aarch64/aarch64-builtins.cc > > index > > e0a741ac66

[PATCH v2 1/2] aarch64: Don't return invalid GIMPLE assign statements

2022-07-12 Thread Andrew Carlotti via Gcc-patches
aarch64_general_gimple_fold_builtin doesn't check whether the LHS of a function call is null before converting it to an assign statement. To avoid returning an invalid GIMPLE statement in this case, we instead assign the expression result to a new (unused) variable. This change only affects code t

[PATCH v2 2/2] aarch64: Lower vcombine to GIMPLE

2022-07-12 Thread Andrew Carlotti via Gcc-patches
This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables better optimisation during GIMPLE passes. gcc/ * config/aarch64/aarch64-builtins.c (aarch64_general_gimple_fold_builtin): Add combine. gcc/testsuite/ * gcc.target/aarch64/advsimd-intrinsics/com
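
As a rough source-level analogy for the lowering described above (not the GIMPLE that GCC actually emits), the sketch below builds a four-lane vector from two two-lane halves with a single vector constructor, using the GCC/Clang vector extension instead of arm_neon.h so that it is not tied to an AArch64 host. The names v2si, v4si and combine_model are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Illustrative stand-ins for int32x2_t and int32x4_t (hypothetical names). */
typedef int32_t v2si __attribute__((vector_size(8)));
typedef int32_t v4si __attribute__((vector_size(16)));

/* Models a vcombine-style operation: the low half supplies lanes 0-1 and
   the high half supplies lanes 2-3, expressed as one vector constructor. */
v4si combine_model(v2si low, v2si high)
{
    return (v4si){ low[0], low[1], high[0], high[1] };
}
```

Because the whole operation is a plain constructor, generic optimisation passes can see through it without any target-specific knowledge of the builtin, which is the benefit the patch description points to.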

Re: [PATCH v2 1/2] aarch64: Don't return invalid GIMPLE assign statements

2022-07-13 Thread Andrew Carlotti via Gcc-patches
On Wed, Jul 13, 2022 at 09:10:25AM +0100, Richard Sandiford wrote: > Richard Biener via Gcc-patches writes: > > On Tue, Jul 12, 2022 at 4:38 PM Andrew Carlotti > > wrote: > >> > >> aarch64_general_gimple_fold_builtin doesn't check whether the LHS of a > >> function call is null before converting

[PATCH v2 1/4] aarch64: Add V1DI mode

2022-07-13 Thread Andrew Carlotti via Gcc-patches
We already have a V1DF mode, so this makes the vector modes more consistent. Additionally, this allows us to recognise uint64x1_t and int64x1_t types given only the mode and type qualifiers (e.g. in aarch64_lookup_simd_builtin_type). gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc

[PATCH v2 3/4] aarch64: Consolidate simd type lookup functions

2022-07-13 Thread Andrew Carlotti via Gcc-patches
There were several similarly-named functions, which each built or looked up a type using a different subset of valid modes or qualifiers. This change combines these all into a single function, which can additionally handle const and pointer qualifiers. gcc/ChangeLog: * config/aarch64/aar

[PATCH v2 2/4] aarch64: Remove qualifier_internal

2022-07-13 Thread Andrew Carlotti via Gcc-patches
This has been unused since 2014, so there's no reason to retain it. gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (enum aarch64_type_qualifiers): Remove qualifier_internal. (aarch64_init_simd_builtin_functions): Remove qualifier_internal check. --- diff --git a/gcc

[PATCH v2 4/4] aarch64: Move vreinterpret definitions into the compiler

2022-07-13 Thread Andrew Carlotti via Gcc-patches
This removes a significant number of intrinsic definitions from the arm_neon.h header file, and reduces the amount of code duplication. The new macros and data structures are intended to also facilitate moving other intrinsic definitions out of the header file in future. There is a slight change

[committed] MAINTAINERS: Add myself to Write After Approval

2022-07-15 Thread Andrew Carlotti via Gcc-patches
ChangeLog: * MAINTAINERS: Add myself to Write After Approval. diff --git a/MAINTAINERS b/MAINTAINERS index 7d9aab76dd9676c806bd08abc7542553fcf81928..7a7ad42ced3027f1f7970916b355fd5fc7b0088c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -352,6 +352,7 @@ Kevin Buettner

Re: [PATCH v2 1/2] aarch64: Don't return invalid GIMPLE assign statements

2022-07-15 Thread Andrew Carlotti via Gcc-patches
On Wed, Jul 13, 2022 at 02:32:16PM +0200, Richard Biener wrote: > On Wed, Jul 13, 2022 at 12:50 PM Andrew Carlotti > wrote: > > I specifically wanted to avoid not folding the call, because always > > folding means that the builtin doesn't need to be implemented anywhere > > else (which isn't relev

[PATCH v2.1 3/4] aarch64: Consolidate simd type lookup functions

2022-07-19 Thread Andrew Carlotti via Gcc-patches
On Wed, Jul 13, 2022 at 05:36:04PM +0100, Richard Sandiford wrote: > I like the part about getting rid of: > > static tree > aarch64_simd_builtin_type (machine_mode mode, > bool unsigned_p, bool poly_p) > > and the flow of the new function. However, I think it's still >

[committed] docs: Fix outdated reference to LOOPS_HAVE_MARKED_SINGLE_EXITS

2022-07-27 Thread Andrew Carlotti via Gcc-patches
This reference has been wrong since 2007; committed as an obvious fix. gcc/ChangeLog: * doc/loop.texi: Refer to LOOPS_HAVE_RECORDED_EXITS instead. diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi index d7b71a24dbfed284b13da702bd5037691a515535..6e8657a074d2447db7ae9b75cbfbb71282b84287 100644

[PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-06-26 Thread Andrew Carlotti via Gcc-patches
Many intrinsics currently depend on both an architecture version and a feature, despite the corresponding instructions being available within GCC at lower architecture versions. LLVM has already removed these explicit architecture version dependences; this patch does the same for GCC, as well as r

[committed] docs: Fix typo

2023-06-26 Thread Andrew Carlotti via Gcc-patches
gcc/ChangeLog: * doc/optinfo.texi: Fix "steam" -> "stream". diff --git a/gcc/doc/optinfo.texi b/gcc/doc/optinfo.texi index b91bba7bd10470b17ca5190688beee06ad3b87ab..5e8c97ef118786e68b7e46f3c802154cb9b57b83 100644 --- a/gcc/doc/optinfo.texi +++ b/gcc/doc/optinfo.texi @@ -100,7 +100,7 @@ that o

Re: [PATCH] aarch64: Remove architecture dependencies from intrinsics

2023-06-29 Thread Andrew Carlotti via Gcc-patches
On Tue, Jun 27, 2023 at 07:23:32AM +0100, Richard Sandiford wrote: > Andrew Carlotti via Gcc-patches writes: > > Many intrinsics currently depend on both an architecture version and a > > feature, despite the corresponding instructions being available within > > GCC at lower

Re: [PATCH 1/2]middle-end: Fix wrong overmatching of div-bitmask by using new optabs [PR108583]

2023-03-01 Thread Andrew Carlotti via Gcc-patches
On Thu, Feb 23, 2023 at 11:39:51AM -0500, Andrew MacLeod via Gcc-patches wrote: > > > Inheriting from operator_mult is also going to be hazardous because it also > has an op1_range and op2_range... you should at least define those and > return VARYING to avoid other issues. Same thing appli

[committed] Improve comment for tree_niter_desc.{control,bound,cmp}

2022-08-12 Thread Andrew Carlotti via Gcc-patches
Fix typos and explain ERROR_MARK usage. gcc/ChangeLog: * tree-ssa-loop.h: Improve comment --- diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h index 415f461c37e4cd7df0b49f6104f796c49cc830fa..6c70f795d171f22b3ed75873fec4920fea75255b 100644 --- a/gcc/tree-ssa-loop.h +++ b/gcc/tree

[committed] docs: Link to correct section for constraint modifiers

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Committed as obvious. gcc/ChangeLog: * doc/md.texi: Fix incorrect pxref. --- diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index cc28f868fc85b5148450548a54d69a39ecc4f03a..c1d3ae2060d800bbaa9751fcf841d7417af1e37d 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -9321,6 +9321,11 @

[committed] docs: Fix inconsistent example predicate name

2022-12-22 Thread Andrew Carlotti via Gcc-patches
It is unclear why the example C function was renamed to `commutative_integer_operator` as part of ec8e098d in 2004, while the text and the example md were both left as `commutative_operator`. The latter name appears to be more accurate, so revert the 2004 change. Committed as obvious. gcc/ChangeL

[committed] docs: Fix peephole paragraph ordering

2022-12-22 Thread Andrew Carlotti via Gcc-patches
The documentation for the DONE and FAIL macros was incorrectly inserted between example code, and a remark attached to that example. Committed as obvious. gcc/ChangeLog: * doc/md.texi: Move example code remark next to its code block. --- diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi

Re: [committed] docs: Link to correct section for constraint modifiers

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Patches attached to the wrong emails - this patch was actually: On Thu, Dec 22, 2022 at 05:05:51PM +0000, Andrew Carlotti via Gcc-patches wrote: > Committed as obvious. > > gcc/ChangeLog: > > * doc/md.texi: Fix incorrect pxref. > > --- diff --git a/gcc/doc/md.t

Re: [committed] docs: Fix peephole paragraph ordering

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Patches attached to the wrong email - this patch was actually: On Thu, Dec 22, 2022 at 05:06:13PM +0000, Andrew Carlotti via Gcc-patches wrote: > The documentation for the DONE and FAIL macros was incorrectly inserted > between example code, and a remark attached to that example. > &g

[PATCH 5/8 v2] middle-end: Add cltz_complement idiom recognition

2022-12-22 Thread Andrew Carlotti via Gcc-patches
On Thu, Nov 24, 2022 at 11:41:31AM +0100, Richard Biener wrote: > Note we do have CTZ and CLZ > optabs and internal functions - in case there's a HImode CLZ this > could be elided. More general you can avoid using the __builtin_ > functions with their fixed types in favor of using IFN_C[TL]Z which

[PATCH 6/8 v2] docs: Add popcount, clz and ctz target attributes

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Updated to reflect Sphinx revert; I'll commit this once the cltz_complement patch is merged. gcc/ChangeLog: * doc/sourcebuild.texi: Add missing target attributes. --- diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index ffe69d6fcb9c46cf97ba570e85b56e586a0c9b99..1036b1

[PATCH 9/8] middle-end: Allow build_popcount_expr to use an IFN

2022-12-22 Thread Andrew Carlotti via Gcc-patches
Bootstrapped and regression tested on aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu - ok to merge? gcc/ChangeLog: * tree-ssa-loop-niter.cc (build_popcount_expr): Add IFN support. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr86544.C: Add .POPCOUNT to tree scan regex.

Re: [PATCH 9/8] middle-end: Allow build_popcount_expr to use an IFN

2023-01-16 Thread Andrew Carlotti via Gcc-patches
ingly). I might have incorporated it into an earlier patch in this series, if I hadn't already pushed that earlier patch. Is this OK to leave in master now? Thanks, Andrew On Thu, Dec 22, 2022 at 05:43:21PM +0000, Andrew Carlotti via Gcc-patches wrote: > Bootstrapped and regression test

Re: [PATCH 9/8] middle-end: Allow build_popcount_expr to use an IFN

2023-01-16 Thread Andrew Carlotti via Gcc-patches
and approved for build_cltz_expr (and adjusts testcases > accordingly). I might have incorporated it into an earlier patch in this > series, if I hadn't already pushed that earlier patch. > > Is this OK to leave in master now? > > Thanks, > Andrew > > On Thu,

[PATCH 0/8] middle-end: Popcount and clz/ctz idiom recognition improvements

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This is a series of patches to improve recognition of popcount and clz/ctz idioms, along with some related fixes. - Patches 1 and 8 are independent fixes or improvements. - Patch 4 is a dependency of patch 5, as it improves the robustness of a test that would otherwise begin failing. - Patches 2

[PATCH 1/8] middle-end: Ensure at_stmt is defined before an early exit

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This prevents a null dereference error when outputting debug information following an early exit from number_of_iterations_exit_assumptions. gcc/ChangeLog: * tree-ssa-loop-niter.cc (number_of_iterations_exit_assumptions): Move at_stmt assignment. -- diff --git a/gcc/tree-ssa-lo

[PATCH 2/8] middle-end: Remove prototype for number_of_iterations_popcount

2022-11-11 Thread Andrew Carlotti via Gcc-patches
gcc/ChangeLog: * tree-ssa-loop-niter.c (ssa_defined_by_minus_one_stmt_p): Move (number_of_iterations_popcount): Move, and remove separate prototype. -- diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc index cdbb924216243ebcabe6c695698a4aee71882c49..c23643fd

[PATCH 3/8] middle-end: Refactor number_of_iterations_popcount

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This includes various changes to improve clarity, and to enable the code to be more similar to the clz and ctz idiom recognition added in subsequent patches. We create a new number_of_iterations_bitcount function, which will be used to call the other bit-counting recognition functions added in subse
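
For readers unfamiliar with the popcount idiom, here is a minimal sketch of the kind of loop such recognition targets. It assumes the usual "clear the lowest set bit" form; the snippet above does not show the exact loop shape the pass matches. The function names are hypothetical, and __builtin_popcount is the GCC/Clang builtin used here only as a reference result.

```c
/* Counts set bits by clearing the lowest set bit once per iteration, so
   the iteration count equals the population count of b. */
int popcount_loop(unsigned int b)
{
    int c = 0;
    while (b) {
        b &= b - 1;   /* drops the lowest set bit */
        c++;
    }
    return c;
}

/* Closed form that a compiler could substitute for the loop above. */
int popcount_builtin(unsigned int b)
{
    return __builtin_popcount(b);
}
```

Once the trip count is known to be a popcount, the loop no longer has to be executed to compute it.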

[PATCH 4/8] Modify test, to prevent the next patch breaking it

2022-11-11 Thread Andrew Carlotti via Gcc-patches
The upcoming c[lt]z idiom recognition patch eliminates the need for a brute force computation of the iteration count of these loops. The test is intended to verify that ivcanon can determine the loop count when the condition is given by a chain of constant computations. We replace the constant ope

[PATCH 5/8] middle-end: Add cltz_complement idiom recognition

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This recognises patterns of the form: while (n) { n >>= 1 } This patch results in improved (but still suboptimal) codegen: foo (unsigned int b) { int c = 0; while (b) { b >>= 1; c++; } return c; } foo: .LFB11: .cfi_startproc cbz w0, .L3
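
The relationship behind the name "cltz_complement" can be sketched as follows: the number of right shifts needed to clear a value equals the type's bit width minus its count of leading zeros. This is an illustrative sketch, not code from the patch; shift_count_loop and shift_count_clz are hypothetical names, and the zero check is needed because __builtin_clz is undefined for a zero argument.

```c
#include <limits.h>

/* The loop form quoted in the patch description: counts the right shifts
   needed before b becomes zero. */
int shift_count_loop(unsigned int b)
{
    int c = 0;
    while (b) {
        b >>= 1;
        c++;
    }
    return c;
}

/* Closed form: bit width minus the number of leading zeros.
   __builtin_clz(0) is undefined, so zero is handled separately. */
int shift_count_clz(unsigned int b)
{
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);
    return b ? width - __builtin_clz(b) : 0;
}
```

The separate zero case in the closed form corresponds to a branch on zero before the count is computed.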

[PATCH 6/8] docs: Add popcount, clz and ctz target attributes

2022-11-11 Thread Andrew Carlotti via Gcc-patches
gcc/ChangeLog: * doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst: Add missing target attributes. -- diff --git a/gcc/doc/gccint/testsuites/directives-used-within-dejagnu-tests/keywords-describing-target-attributes.rst b/g

[PATCH 7/8] middle-end: Add c[lt]z idiom recognition

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This recognises the patterns of the form: while (n & 1) { n >>= 1 } Unfortunately there are currently two issues relating to this patch. Firstly, simplify_using_initial_conditions does not recognise that (n != 0) and ((n & 1) == 0) implies that ((n >> 1) != 0). These preconditions arise
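
The implication mentioned above can be checked directly. If n is nonzero and its lowest bit is clear, its lowest set bit sits at position 1 or higher, so it survives a right shift by one. The helper below is a hypothetical illustration of that reasoning, not part of the patch.

```c
/* Returns 1 when "(n != 0) and ((n & 1) == 0) implies ((n >> 1) != 0)"
   holds for this particular n, and 0 otherwise. */
int implication_holds(unsigned int n)
{
    int premise = (n != 0) && ((n & 1) == 0);
    int conclusion = (n >> 1) != 0;
    return !premise || conclusion;
}
```

The conclusion alone can fail: for n == 1 the shifted value is zero. There the premise is false as well, so the implication as a whole still holds.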

[PATCH 8/8] middle-end: Expand comment for tree_niter_desc.max

2022-11-11 Thread Andrew Carlotti via Gcc-patches
This requirement is enforced by a gcc_checking_assert in record_estimate. gcc/ChangeLog: * tree-ssa-loop.h (tree_niter_desc): Update comment. -- diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h index 6c70f795d171f22b3ed75873fec4920fea75255b..c24215be8822c31a05eaedcf4d3a26db0fea

[PATCH] aarch64: Lower vcombine to GIMPLE

2022-06-07 Thread Andrew Carlotti via Gcc-patches
Hi all, This lowers vcombine intrinsics to a GIMPLE vector constructor, which enables better optimisation during GIMPLE passes. Bootstrapped and tested on aarch64-none-linux-gnu, and tested for aarch64_be-none-linux-gnu via cross-compilation. gcc/ * config/aarch64/aarch64-builtins.c

[PATCH] aarch64: Move vreinterpret definitions into the compiler

2022-06-29 Thread Andrew Carlotti via Gcc-patches
Hi, This removes a significant number of intrinsic definitions from the arm_neon.h header file, and reduces the amount of code duplication. The new macros and data structures are intended to also facilitate moving other intrinsic definitions out of the header file in future. There is a slight c

Re: [PATCH 5/8] middle-end: Add cltz_complement idiom recognition

2022-11-21 Thread Andrew Carlotti via Gcc-patches
On Mon, Nov 14, 2022 at 04:10:22PM +0100, Richard Biener wrote: > On Fri, Nov 11, 2022 at 7:53 PM Andrew Carlotti via Gcc-patches > wrote: > > > > This recognises patterns of the form: > > while (n) { n >>= 1 } > > > > This patch results in improved (