Re: [11a/n] Avoid retrying with the same vector modes

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Wed, Nov 6, 2019 at 11:21 AM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Tue, Nov 5, 2019 at 9:25 PM Richard Sandiford >> > wrote: >> >> >> >> Patch 12/n makes the AArch64 port add f

Re: [3/4] Don't vectorise single-iteration epilogues

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Mon, Nov 4, 2019 at 4:30 PM Richard Sandiford > wrote: >> >> With a later patch I saw a case in which we peeled a single iteration >> for gaps but didn't need to peel further iterations to make up a full >> vector. We then trie

Re: [11/n] Support vectorisation with mixed vector sizes

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Fri, Oct 25, 2019 at 2:43 PM Richard Sandiford > wrote: >> >> After previous patches, it's now possible to make the vectoriser >> support multiple vector sizes in the same vector region, using >> related_vector_mode to pick the right

Re: [14/n] Vectorise conversions between differently-sized integer vectors

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Fri, Oct 25, 2019 at 2:51 PM Richard Sandiford > wrote: >> >> This patch adds AArch64 patterns for converting between 64-bit and >> 128-bit integer vectors, and makes the vectoriser and expand pass >> use them. > > So on GIM

Re: [11a/n] Avoid retrying with the same vector modes

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Wed, Nov 6, 2019 at 12:02 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Nov 6, 2019 at 11:21 AM Richard Sandiford >> > wrote: >> >> >> >> Richard Biener writes: >>

Re: [4/6] Optionally pick the cheapest loop_vec_info

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Tue, Nov 5, 2019 at 3:29 PM Richard Sandiford > wrote: >> >> This patch adds a mode in which the vectoriser tries each available >> base vector mode and picks the one with the lowest cost. For now >> the behaviour is behind a default-

Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford > wrote: >> >> vectorizable_assignment handles true SSA-to-SSA copies (which hopefully >> we don't see in practice) and no-op conversions that are required >> to maintain correct gimple,

Generalise gather and scatter optabs

2019-11-06 Thread Richard Sandiford
tch should be a no-op, but later SVE patches take advantage of the new flexibility. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-06 Richard Sandiford gcc/ * optabs.def (gather_load_optab, mask_gather_load_optab) (scatte

[C] Opt out of GNU vector extensions for built-in SVE types

2019-11-07 Thread Richard Sandiford
submit in that form though, so this is just a combined patch instead. I'm happy to post the individual patches if that would help. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-07 Richard Sandiford gcc/ * tree-core.h (tree_type_common::indivisi

Re: [1/6] Fix vectorizable_conversion costs

2019-11-07 Thread Richard Sandiford
Richard Biener writes: > On Tue, Nov 5, 2019 at 3:25 PM Richard Sandiford > wrote: >> >> This patch makes two tweaks to vectorizable_conversion. The first >> is to use "modifier" to distinguish between promotion, demotion, >> and neither promotion no

Re: [2/6] Don't assign a cost to vectorizable_assignment

2019-11-07 Thread Richard Sandiford
Richard Biener writes: > On Wed, Nov 6, 2019 at 4:58 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Tue, Nov 5, 2019 at 3:27 PM Richard Sandiford >> > wrote: >> >> >> >> vectorizable_assignment handles true SSA-t

Re: [4/6] Optionally pick the cheapest loop_vec_info

2019-11-07 Thread Richard Sandiford
Richard Biener writes: > On Wed, Nov 6, 2019 at 3:01 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Tue, Nov 5, 2019 at 3:29 PM Richard Sandiford >> > wrote: >> >> >> >> This patch adds a mode in which the vectoris

[committed][AArch64] Don't handle bswap in aarch64_builtin_vectorized_function

2019-11-08 Thread Richard Sandiford
aarch64_builtin_vectorized_function no longer needs to handle bswap* since we have internal functions and optabs for all supported cases. Tested on aarch64-linux-gnu and applied as r277951. Richard 2019-11-08 Richard Sandiford gcc/ * config/aarch64/aarch64-builtins.c

[committed][AArch64] Remove unused mode iterators

2019-11-08 Thread Richard Sandiford
Tested on aarch64-linux-gnu, applied as r277953. Richard 2019-11-08 Richard Sandiford gcc/ * config/aarch64/iterators.md (SVE_BH, SVE_BHS): Delete. Index: gcc/config/aarch64/iterators.md === --- gcc/config/aarch64

LRA: handle memory constraints that accept more than "m"

2019-11-08 Thread Richard Sandiford
ng its addressing modes to "m" would lead to bad Advanced SIMD optimisation decisions in passes like ivopts. LD1RQ therefore has a memory constraint that accepts things "m" doesn't. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-08 Ri

[committed] Handle POLY_INT_CSTs in declare_return_value

2019-11-08 Thread Richard Sandiford
SVE allows variable-length vectors to be returned by value, which tripped the assert in declare_return_variable. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Applied as obvious/ preapproved by Jeff some time ago for this kind of change. Richard 2019-11-08 Richard Sandiford gcc

Fix code order in tree-sra.c:create_access

2019-11-08 Thread Richard Sandiford
ggers there after the introduction of IPA SRA. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-08 Richard Sandiford gcc/ * tree-sra.c (create_access): Delay disqualifying the base for poly_int values until we know we have a base. gcc/test

Mark constant-sized objects as addressable if they have poly-int accesses

2019-11-08 Thread Richard Sandiford
r; it would have to go via memory. And in that case it's more efficient to mark the fixed-size object as addressable from the outset, like we do for array references with non-constant indices. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-08 Richard

[committed] Handle POLY_INT_CST in copy_reference_ops_from_ref

2019-11-08 Thread Richard Sandiford
Tested on aarch64-linux-gnu and x86_64-linux-gnu. Applied as obvious. Richard 2019-11-08 Richard Sandiford gcc/ * tree-ssa-sccvn.c (copy_reference_ops_from_ref): Handle POLY_INT_CST. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general/deref_2.c: New test

Re: [PATCH] Fix PR92324

2019-11-08 Thread Richard Sandiford
arch64-linux-gnu. Thanks, Richard 2019-11-08 Richard Sandiford gcc/ * tree-vect-loop.c (neutral_op_for_slp_reduction): Take the vector type as an argument rather than reading it from the stmt_vec_info. (vect_create_epilog_for_reduction): Update accordingl

Re: [4/6] Optionally pick the cheapest loop_vec_info

2019-11-08 Thread Richard Sandiford
Richard Biener writes: > On Thu, Nov 7, 2019 at 6:15 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Nov 6, 2019 at 3:01 PM Richard Sandiford >> > wrote: >> >> >> >> Richard Biener writes: >>

Re: [PATCH][vect] PR 92351: When peeling for alignment make alignment of epilogues unknown

2019-11-08 Thread Richard Sandiford
Richard Biener writes: > On Thu, 7 Nov 2019, Andre Vieira (lists) wrote: >> On 07/11/2019 14:00, Richard Biener wrote: >> > On Thu, 7 Nov 2019, Andre Vieira (lists) wrote: >> > >> >> Hi, >> >> >> >> PR92351 reports a bug in which a wrongly aligned load is generated for an >> >> epilogue of a main

Re: [PATCH] Fix PR92324

2019-11-08 Thread Richard Sandiford
Richard Biener writes: > On Fri, 8 Nov 2019, Richard Sandiford wrote: > >> Richard Biener writes: >> > I've been sitting on this for a few days since I'm not 100% happy >> > with how the code looks. There are possibly still holes in it >>

Fix SLP downward group access classification (PR92420)

2019-11-08 Thread Richard Sandiford
ithout the assert, after a grace period? Thanks, Richard 2019-11-08 Richard Sandiford gcc/ PR tree-optimization/92420 * tree-vect-stmts.c (get_negative_load_store_type): Move further up file. (get_group_load_store_type): Use it for reversed SLP accesses. gcc

[C++ PATCH] Opt out of GNU vector extensions for built-in SVE types

2019-11-08 Thread Richard Sandiford
hout the translation unit but can only be used in functions for which SVE is enabled. --------- 2019-11-08 Richard Sandiford gcc/cp/ * cp-tree.h (CP_AGGREGATE_TYPE_P): Check for gnu_vector_type_p instead of VECT

Re: [8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes

2019-11-11 Thread Richard Sandiford
Ping Richard Sandiford writes: > Richard Biener writes: >> On Fri, Oct 25, 2019 at 2:37 PM Richard Sandiford >> wrote: >>> >>> This is another patch in the series to remove the assumption that >>> all modes involved in vectorisation have to be the

[0/8] Improve vector alias checks for WAR and WAW dependencies

2019-11-11 Thread Richard Sandiford
For: void f1 (int *x, int *y) { for (int i = 0; i < 32; ++i) x[i] += y[i]; } we check at runtime whether one vector at x would overlap one vector at y. But in cases like this, the vector code would handle x <= y just fine, since any write to address A still happens after any rea
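The difference between the conservative overlap test and a write-after-read-aware test can be sketched in plain C. This is only an illustration of the idea in the snippet above, not the code GCC emits: the vectorisation factor `VF`, the helper names `no_overlap`, `war_safe`, `f1_scalar` and `f1_vector`, and the exact form of the relaxed condition are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { N = 32, VF = 4 };  /* VF is an assumed vectorisation factor */

/* Conservative check: one vector at x must not overlap one vector at y.  */
static bool no_overlap(const int *x, const int *y)
{
  uintptr_t a = (uintptr_t) x, b = (uintptr_t) y;
  uintptr_t bytes = VF * sizeof(int);
  return a + bytes <= b || b + bytes <= a;
}

/* Relaxed check (assumed form): x <= y is also fine, because every
   element is still written only after it has been read.  */
static bool war_safe(const int *x, const int *y)
{
  uintptr_t a = (uintptr_t) x, b = (uintptr_t) y;
  uintptr_t bytes = VF * sizeof(int);
  return a <= b || b + bytes <= a;
}

/* The scalar loop from the snippet.  */
static void f1_scalar(int *x, const int *y)
{
  for (int i = 0; i < N; ++i)
    x[i] += y[i];
}

/* Emulation of a vector loop: load VF elements of x and y, add them,
   then store VF elements back to x.  */
static void f1_vector(int *x, const int *y)
{
  for (int i = 0; i < N; i += VF)
    {
      int vx[VF], vy[VF];
      memcpy(vx, x + i, sizeof vx);
      memcpy(vy, y + i, sizeof vy);
      for (int j = 0; j < VF; ++j)
        vx[j] += vy[j];
      memcpy(x + i, vx, sizeof vx);
    }
}
```

With `y == x + 1` the conservative check rejects the vector loop even though the emulated vector loop and the scalar loop produce the same result, while the relaxed check accepts it. With `x == y + 1` both checks reject it, which is the case where the two loops really do differ.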

[1/8] Move canonicalisation of dr_with_seg_len_pair_ts

2019-11-11 Thread Richard Sandiford
The two users of tree-data-ref's runtime alias checks both canonicalise the order of the dr_with_seg_lens in a pair before passing them to prune_runtime_alias_test_list. It's more convenient for later patches if prune_runtime_alias_test_list does that itself. 2019-11-11 Richard

[2/8] Delay swapping data refs in prune_runtime_alias_test_list

2019-11-11 Thread Richard Sandiford
dr_with_seg_len on success, rather than changing an existing one in-place. It would then be easy to merge both the dr_as and dr_bs if we wanted to, rather than requiring one of them to be equal. But here I tried to do something that could be backported if necessary. 2019-11-11 Richard Sandiford gcc

[3/8] Add flags to dr_with_seg_len_pair_t

2019-11-11 Thread Richard Sandiford
This patch adds a bunch of flags to dr_with_seg_len_pair_t, for use by later patches. The update to tree-loop-distribution.c is conservatively correct, but might be tweakable later. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.h (DR_ALIAS_RAW, DR_ALIAS_WAR, DR_ALIAS_WAW

[4/8] Record whether a dr_with_seg_len contains mixed steps

2019-11-11 Thread Richard Sandiford
ither way, it still seems wrong to use DR_STEP when it doesn't represent all checks that have been merged into the pair. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.h (DR_ALIAS_MIXED_STEPS): New flag. * tree-data-ref.c (prune_runtime_alias_test_list): Set it when

[5/8] Dump the list of merged alias pairs

2019-11-11 Thread Richard Sandiford
ue in vect-alias-check-9.c so that the result was less likely to be accidentally correct if the alias isn't honoured. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.c (dump_alias_pair): New function. (prune_runtime_alias_test_list): Use it to dump each merged alias pair.

[6/8] Print the type of alias check in a dump message

2019-11-11 Thread Richard Sandiford
This patch prints a message to say how an alias check is being implemented. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.c (create_intersect_range_checks_index) (create_intersect_range_checks): Print dump messages. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-1

[7/8] Use a single comparison for index-based alias checks

2019-11-11 Thread Richard Sandiford
seg_len1 and seg_len2 negation for cases in which seg_len is a "negative unsigned" value narrower than 64 bits, like it is for 32-bit targets. Previously we'd end up with values like 0x1 instead of 1. 2019-11-11 Richard Sandiford gcc

[8/8] Optimise WAR and WAW alias checks

2019-11-11 Thread Richard Sandiford
check are mixed together, rather than all statements for the second access following all statements for the first access. The new code for gcc.target/aarch64/sve/var_strict_[135].c is slightly better than before. 2019-11-11 Richard Sandiford gcc/ * tree-data-ref.c (create_intersect_ra

Re: [committed] Handle POLY_INT_CST in copy_reference_ops_from_ref

2019-11-12 Thread Richard Sandiford
Christophe Lyon writes: > On Fri, 8 Nov 2019 at 10:44, Richard Sandiford > wrote: >> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. Applied as obvious. >> > > Hi Richard, > > The new deref_2.c test fails with -mabi=ilp32: > FAIL: gcc.target/aarc

[C] Add a target hook that allows targets to verify type usage

2019-11-12 Thread Richard Sandiford
_p in r277950, but here the emphasis is on testing sizelessness. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2019-11-12 Richard Sandiford gcc/ * target.h (type_context_kind): New enum. (verify_type_context): Declare. * target.def (ve

Re: [8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes

2019-11-12 Thread Richard Sandiford
Richard Biener writes: > On Wed, Oct 30, 2019 at 4:58 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Fri, Oct 25, 2019 at 2:37 PM Richard Sandiford >> > wrote: >> >> >> >> This is another patch in the seri

Re: [PATCH 0/6] Implement asm flag outputs for arm + aarch64

2019-11-12 Thread Richard Sandiford
Richard Henderson writes: > I've put the implementation into config/arm/aarch-common.c, so > that it can be shared between the two targets. This required > a little bit of cleanup to the CC modes and constraints to get > the two targets to match up. > > I really should have done more than just x8

Re: [PATCH 2/4] MSP430: Disable exception handling by default for C++

2019-11-12 Thread Richard Sandiford
Jozef Lawrynowicz writes: > diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp > index 1df645e283c..1ce449cb935 100644 > --- a/gcc/testsuite/lib/gcc-dg.exp > +++ b/gcc/testsuite/lib/gcc-dg.exp > @@ -417,6 +417,16 @@ proc gcc-dg-prune { system text } { > return "::unsupp

Re: [PATCH 0/6] Implement asm flag outputs for arm + aarch64

2019-11-13 Thread Richard Sandiford
Richard Henderson writes: > On 11/12/19 9:21 PM, Richard Sandiford wrote: >> Apart from the vc/vs thing you mentioned in the follow-up for 4/6, >> it looks like 4/6, 5/6 and 6/6 are missing "hs" and "lo". OK for >> aarch64 with those added. > > Ar

Re: [C++] Fix interaction between aka changes and DR1558 (PR92206)

2019-11-13 Thread Richard Sandiford
Jason Merrill writes: > On 10/25/19 2:53 PM, Richard Sandiford wrote: >> One of the changes in r277281 was to make the typedef variant >> handling in strip_typedefs pass the raw DECL_ORIGINAL_TYPE to the >> recursive call, instead of applying TYPE_MAIN_VARIANT first. >

[AArch64] Use aarch64_sve_int_mode in SVE ACLE code

2019-11-13 Thread Richard Sandiford
This is a like-for-like change at the moment, but is a prerequisite for removing mode_for_int_vector. Tested on aarch64-linux-gnu and applied as r278120. Richard 2019-11-13 Richard Sandiford gcc/ * config/aarch64/aarch64-sve-builtins-functions.h (unary_count::expand): Use

Re: [8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes

2019-11-13 Thread Richard Sandiford
Richard Biener writes: > On Tue, Nov 12, 2019 at 6:54 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Oct 30, 2019 at 4:58 PM Richard Sandiford >> > wrote: >> >> >> >> Richard Biener writes: >> >

[committed][AArch64] Remove fictitious [SU]RHSUB instructions

2020-01-09 Thread Richard Sandiford
We've had skeleton support for "SRHSUB" and "URHSUB" since the initial commit of the port, but no such instructions exist. Tested on aarch64-linux-gnu and applied as 280049. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/iterators.md

[committed][AArch64] Tweak iterator usage for [SU]Q{ADD,SUB}

2020-01-09 Thread Richard Sandiford
or these patterns would be "ssadd" and "usadd" respectively. (Unfortunately, the optabs don't extend to vectors yet, something that would be good to fix in GCC 11.) This patch therefore does what the comment implies and uses q to distinguish qadd and qsub instead. T

[committed][AArch64] Specify some SVE ACLE functions in a more generic way

2020-01-09 Thread Richard Sandiford
This patch generalises some boilerplate that becomes much more common with SVE2 intrinsics. Tested on aarch64-linux-gnu and applied as r280051. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/aarch64-sve-builtins-functions.h (code_for_mode_function): New class

[committed][AArch64] Rename SVE shape "unary_count" to "unary_to_uint"

2020-01-09 Thread Richard Sandiford
"_to_uint". This patch renames the existing unary_count shape to match the new scheme. Tested on aarch64-linux-gnu and applied as r280052. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/aarch64-sve-builtins-shapes.h (unary_count): Delete. (unary_to_uint): Defi

[committed][AArch64] Rename UNSPEC_WHILE* to match instruction mnemonics

2020-01-09 Thread Richard Sandiford
The UNSPEC_WHILE*s had an underscore before the condition code, whereas almost all other SVE unspecs are taken directly from the mnemonic. Tested on aarch64-linux-gnu and applied as r280053. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/aarch64.md (UNSPEC_WHILE_LE

[committed][AArch64] Simplify WHILERW and WHILEWR definition

2020-01-09 Thread Richard Sandiford
I'd made WHILERW and WHILEWR use separate patterns from the SVE WHILE instructions, but they're similar enough that we can use a single pattern. This means that we also get the flag-related patterns "for free". Tested on aarch64-linux-gnu and applied as r280054. Richard

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [1/2]

2020-01-09 Thread Richard Sandiford
Thanks for the update, looks great. Stam Markianos-Wright writes: > diff --git a/gcc/config/aarch64/arm_bf16.h b/gcc/config/aarch64/arm_bf16.h > new file mode 100644 > index > ..884b6f3bc7a28c516e54c26a71b1b769f55867a7 > --- /dev/null > +++ b/gcc/config/aa

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for dot product (usdot - vector, dot - by element) for AArch64 AdvSIMD ARMv8.6 Extension

2020-01-09 Thread Richard Sandiford
OK, thanks. Richard Stam Markianos-Wright writes: > On 12/30/19 10:21 AM, Richard Sandiford wrote: >> Stam Markianos-Wright writes: >>> On 12/20/19 2:13 PM, Richard Sandiford wrote: >>>> Stam Markianos-Wright writes: >>>>> +**... >>>

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for bfdot for ARMv8.6 Extension

2020-01-09 Thread Richard Sandiford
Please update the names of the testsuite files to match the ones in the bfloat16_t patch. (Same for the usdot/sudot patch -- sorry for forgetting there.) OK with that change, thanks. Richard Stam Markianos-Wright writes: > On 12/30/19 10:29 AM, Richard Sandiford wrote: >> Stam

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [2/2]

2020-01-09 Thread Richard Sandiford
Stam Markianos-Wright writes: > diff --git a/gcc/testsuite/g++.target/aarch64/bfloat_cpp_typecheck.C > b/gcc/testsuite/g++.target/aarch64/bfloat_cpp_typecheck.C > new file mode 100644 > index 000..55cbb0b0ef7 > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/bfloat_cpp_typecheck.C

Re: [Patch] [AArch64] [SVE] Implement svld1ro intrinsic.

2020-01-09 Thread Richard Sandiford
Matthew Malcomson writes: > We take no action to ensure the SVE vector size is large enough. It is > left to the user to check that before compiling this intrinsic or before > running such a program on a machine. > > The main difference between ld1ro and ld1rq is in the allowed offsets, > the imp

[committed][AArch64] Add banner comments to aarch64-sve2.md

2020-01-09 Thread Richard Sandiford
This patch imposes the same sort of structure on aarch64-sve2.md as we already have for aarch64-sve.md, before it grows a lot more patterns. Tested on aarch64-linux-gnu and applied as 280058. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/aarch64-sve2.md: Add banner

[committed][AArch64] Pass a mode to some SVE immediate queries

2020-01-09 Thread Richard Sandiford
It helps the SVE2 ACLE support if aarch64_sve_arith_immediate_p and aarch64_sve_sqadd_sqsub_immediate_p accept scalars as well as vectors. Tested on aarch64-linux-gnu and applied as r280059. Richard 2020-01-09 Richard Sandiford gcc/ * config/aarch64/aarch64-protos.h

Fix gather/scatter check when updating a vector epilogue loop

2020-01-10 Thread Richard Sandiford
when testing with fixed-length -msve-vector-bits=128. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Maybe verging on the obvious, but: OK to install? Richard 2020-01-10 Richard Sandiford gcc/ * tree-vect-loop.c (update_epilogue_loop_vinfo): Update DR_REF for any type of

Use get_related_vectype_for_scalar_type for reduction indices

2020-01-10 Thread Richard Sandiford
-length -msve-vector-bits=128. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-10 Richard Sandiford gcc/ * tree-vect-loop.c (vect_create_epilog_for_reduction): Use get_related_vectype_for_scalar_type rather than build_vector_type to

[AArch64] Require aarch64_sve256_hw for a 256-bit SVE test

2020-01-10 Thread Richard Sandiford
One of the SVE run tests was specific to 256-bit SVE but tried to run for all SVE lengths. Tested on aarch64-linux-gnu and applied as r280104. Richard 2020-01-10 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/index_1_run.c: Require aarch64_sve256_hw rather than

Fix type mismatch in SLPed constructors

2020-01-10 Thread Richard Sandiford
on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-10 Richard Sandiford gcc/ * tree-vect-slp.c (vectorize_slp_instance_root_stmt): Use a VIEW_CONVERT_EXPR if the vectorized constructor has a different type from the lhs. Index: gcc/tree-

[committed][AArch64] Tighten mode checks in aarch64_builtin_vectorized_function

2020-01-10 Thread Richard Sandiford
ferent ABI. SVE handles this kind of thing using optabs instead.) Tested on aarch64-linux-gnu and applied as 280114. Richard 2020-01-10 Richard Sandiford gcc/ * config/aarch64/aarch64-builtins.c (aarch64_builtin_vectorized_function): Check for specific vector modes, r

Re: [GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [2/2]

2020-01-10 Thread Richard Sandiford
Stam Markianos-Wright writes: > On 1/9/20 4:13 PM, Stam Markianos-Wright wrote: >> On 1/9/20 4:07 PM, Richard Sandiford wrote: >>> Stam Markianos-Wright writes: >>>> diff --git a/gcc/testsuite/g++.target/aarch64/bfloat_cpp_typecheck.C >>

[committed][AArch64] Fix reversed vcond_mask invocation in aarch64_evpc_sel

2020-01-10 Thread Richard Sandiford
ed as r280121. Richard 2020-01-10 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_evpc_sel): Fix gen_vcond_mask invocation. gcc/testsuite/ * gcc.target/aarch64/sve/sel_1.c: Use SVE types for the arguments and return values. Use check-function-bodies in

[AArch64] Make -msve-vector-bits=128 generate VL-specific code

2020-01-10 Thread Richard Sandiford
do the same for big-endian targets, but it could have quite a high overhead; see the comment in the patch for details. Tested on aarch64-linux-gnu and applied as r280125. Hopefully my last ever commit via svn :-) Richard 2020-01-10 Richard Sandiford gcc/ * doc/invoke.texi (-msve

Re: [PATCH] Add initial octeontx2 support.

2020-01-11 Thread Richard Sandiford
writes: > From: Andrew Pinski > > This adds octeontx2 naming. It currently uses the cortexa57 > cost model and schedule model until I submit this. This is > more a place holder to get the naming of the cores in GCC 10. > I will submit the cost model in the next couple of days. > > OK? Bootstra

Re: [PATCH] Fix typo and avoid possible memory leak

2020-01-13 Thread Richard Sandiford
"Kewen.Lin" writes: > Hi, > > Function average_num_loop_insns forgets to free loop body in early return. > Besides, overflow comparison checks 100 (e6) but the return value is > 10 (e5), I guess it's unexpected, a typo? > > Bootstrapped and regress tested on powerpc64le-linux-gnu. > I

Re: [PATCHv2] Add initial octeontx2 support.

2020-01-13 Thread Richard Sandiford
writes: > From: Andrew Pinski > > This adds octeontx2 naming. It currently uses the cortexa57 > cost model and schedule model until I submit this. This is > more a place holder to get the naming of the cores in GCC 10. > I will submit the cost model in the next couple of days. > > OK? Bootstra

[PATCH] PR tree-optimization/93247 - ICE in get_load_store_type

2020-01-15 Thread Richard Sandiford
aarch64-linux-gnu. OK to install? Richard 2020-01-15 Richard Sandiford gcc/ PR tree-optimization/93247 * tree-vect-loop.c (update_epilogue_loop_vinfo): Check the access type of the stmt that we're going to vectorize. gcc/testsuite/ PR tree-optimization/

[committed] aarch64: Fix BE SVE mode punning involving floats

2020-01-16 Thread Richard Sandiford
The patterns used by aarch64_split_sve_subreg_move only support integer modes, so if the widest mode is a float, we should get its integer equivalent. Fixes gcc.target/aarch64/sel_3.c for big-endian targets. Tested on aarch64-linux-gnu and aarch64_be-none-elf. Richard 2020-01-16 Richard

[PATCH] gimplifier: handle POLY_INT_CST-sized TARGET_EXPRs

2020-01-16 Thread Richard Sandiford
very low risk for non-SVE targets though, so OK anyway? Richard 2020-01-16 Richard Sandiford gcc/ * gimplify.c (gimplify_return_expr): Use poly_int_tree_p rather than testing directly for INTEGER_CST. (gimplify_target_expr, gimplify_omp_depend): Likewise. gcc/test

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-16 Thread Richard Sandiford
Wilco Dijkstra writes: > The separate shrinkwrapping pass may insert stores in the middle > of atomics loops which can cause issues on some implementations. > Avoid this by delaying splitting of atomic patterns until after > prolog/epilog generation. > > Bootstrap completed, no test regressions on

Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-16 Thread Richard Sandiford
Szabolcs Nagy writes: > this affects the linux kernel and technically a wrong code bug > so this fix tries to be backportable (fixing all issues with > -fpatchable-function-entry=N,M will likely require new option). Even for the backportable version, I think it would be better not to duplicate so

Re: [PATCH][AArch64] Enable CLI for Armv8.6-A f64mm

2020-01-16 Thread Richard Sandiford
Matthew Malcomson writes: > This patch is necessary for sve-ld1ro intrinsic I posted in > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00466.html . > > I had mistakenly thought this option was already enabled upstream. > > This provides the option +f64mm, that turns on the 64 bit floating point >

Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1

2020-01-17 Thread Richard Sandiford
Wilco Dijkstra writes: > Testing shows the setting of 32:16 for jump alignment has a significant > codesize > cost, however it doesn't make a difference in performance. So set jump-align > to 4 to get 1.6% codesize improvement. I was leaving this to others in case it was obvious to them. On the

Re: [PATCH][AARCH64] Enable compare branch fusion

2020-01-17 Thread Richard Sandiford
Wilco Dijkstra writes: > Enable the most basic form of compare-branch fusion since various CPUs > support it. This has no measurable effect on cores which don't support > branch fusion, but increases fusion opportunities on cores which do. If you're able to say for the record which cores you test

[PATCH] Fix gcc.dg/torture/pr91323.c for aarch64 targets

2020-01-17 Thread Richard Sandiford
signalling NaNs. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-17 Richard Sandiford gcc/ * dojump.c (split_comparison): Use HONOR_NANS rather than HONOR_SNANS when splitting LTGT. --- gcc/dojump.c | 2 +- 1 file changed, 1 insertion(+), 1

[committed] aarch64: Don't raise FE_INVALID for -__builtin_isgreater [PR93133]

2020-01-17 Thread Richard Sandiford
for GCC 10. Tested on aarch64-linux-gnu, applied. Richard 2020-01-17 Richard Sandiford gcc/ * config/aarch64/aarch64.h (REVERSIBLE_CC_MODE): Return false for FP modes. (REVERSE_CONDITION): Delete. * config/aarch64/iterators.md (CC_ONLY): New mode iterator.

Re: [PATCH Coroutines]Fix ICE when co_awaiting on void type

2020-01-20 Thread Richard Sandiford
Jakub Jelinek writes: > On Mon, Jan 20, 2020 at 08:59:20AM +, Iain Sandoe wrote: >> Hi Bin, >> >> bin.cheng wrote: >> >> > gcc/cp >> > 2020-01-20 Bin Cheng >> > >> >* coroutines.cc (build_co_await): Skip getting complete type for >> > void. >> > >> > gcc/testsuite >> > 2020-01

Re: [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-01-20 Thread Richard Sandiford
"Kewen.Lin" writes: > gcc/ChangeLog > > 2020-01-16 Kewen Lin > > * config/rs6000/rs6000.c (TARGET_STRIDE_DFORM_VALID_P): New macro. > (rs6000_stride_dform_valid_p): New function. > * doc/tm.texi: Regenerate. > * doc/tm.texi.in (TARGET_STRIDE_DFORM_VALID_P): New hook. >

Re: [PATCH] Fix target/93119 (aarch64): ICE with traditional TLS support on ILP32

2020-01-20 Thread Richard Sandiford
writes: > From: Andrew Pinski > > The problem here was g:23b88fda665d2f995c was not a complete fix > for supporting traditional TLS on ILP32. > > So the problem here is a couple of things, first __tls_get_addr > call will return a C pointer value so we need to use ptr_mode > when we are creating

[PATCH] cselib: Fix handling of multireg values for call insns [PR93170]

2020-01-20 Thread Richard Sandiford
something we did before g:3bd2918594dae34ae84f too). Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-20 Richard Sandiford gcc/ PR rtl-optimization/93170 * cselib.c (cselib_invalidate_regno_val): New function, split out from

[PATCH] lra: Stop registers being incorrectly marked live [PR92989]

2020-01-20 Thread Richard Sandiford
x86_64-linux-gnu, and that the preprocessed libstdc++ code now compiles for mipsisa64-elf. OK to install? Richard 2020-01-20 Richard Sandiford gcc/ PR rtl-optimization/92989 * lra-lives.c (process_bb_lives): Update the live-in set before processing additional clobb

Re: [committed][GCC][PATCH][AArch64] Fix unused variable warning breaking bootstrap.

2020-01-20 Thread Richard Sandiford
her than add ATTRIBUTE_UNUSED. That's the "style" used elsewhere in the file and also keeps the line length under 80 chars. Tested on aarch64-linux-gnu and applied. Richard 2020-01-20 Richard Sandiford gcc/ * config/aarch64/aarch64-sve-builtins-base.cc (svld1r

Re: [PATCH] Align __patchable_function_entries to POINTER_SIZE

2020-01-20 Thread Richard Sandiford
"Fangrui Song via gcc-patches" writes: > Fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93194 Applied, thanks. Richard > From 60f489f2bf2b32afd1bdbb2405bb028dcedf82cc Mon Sep 17 00:00:00 2001 > From: Fangrui Song > Date: Tue, 7 Jan 2020 20:46:26 -0800 > Subject: [PATCH] Align __patchable_f

Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1

2020-01-20 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Kyrill & Richard, > >> I was leaving this to others in case it was obvious to them. On the >> basis that silence suggests it wasn't, :-) could you go into more details? >> Is it expected on first principles that jump alignment doesn't matter >> for Neoverse N1, or is t

Re: [AArch64] effective_target for aarch64 f64mm asm

2020-01-21 Thread Richard Sandiford
Matthew Malcomson writes: > Commit 9ceec73 introduced intrinsics for the AArch64 FP64 matrix > multiply instructions. These require binutils support for the same > instructions. > ( See https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01234.html for the > testsuite failures this introduced. ) > > Th

[PATCH] testsuite: Add target/xfail argument to check-function-bodies

2020-01-21 Thread Richard Sandiford
chard 2020-01-21 Richard Sandiford gcc/ * doc/sourcebuild.texi (check-function-bodies): Add an optional target/xfail selector. gcc/testsuite/ * lib/scanasm.exp (check-function-bodies): Add an optional target/xfail selector. --- gcc/doc/sourcebuild.texi

Re: [PATCH v2][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-21 Thread Richard Sandiford
Szabolcs Nagy writes: > v2: > - emit bti based on feedback from Richard Sandiford > (don't copy varasm logic). > - add testcases. > - kept bti outside the patch area if possible, i.e. option (b) > in earlier discussion. > > This fix does not update the documentation of

[committed] aarch64: Fix SVE ACLE handling of SImode pointers

2020-01-21 Thread Richard Sandiford
This long-overdue patch promotes SImode pointers to DImode addresses, avoiding various ICEs in the existing tests. Tested on aarch64-linux-gnu and aarch64_be-elf, applied. There are still other ILP32-related ACLE failures to go... Richard 2020-01-21 Richard Sandiford gcc/ * config

[committed] aarch64: Use stdint types for SVE ACLE elements

2020-01-21 Thread Richard Sandiford
" to: void f(int32_t *); void f(int64_t *); would be ambiguous. It also matches the corresponding behaviour. Tested on aarch64-linux-gnu and aarch64_be-elf, applied. There are still other ILP32-related ACLE failures to go... Richard 2020-01-21 Richard Sandifo

Re: [PATCH] aarch64: Fix aarch64_expand_subvti constant handling [PR93335]

2020-01-22 Thread Richard Sandiford
Jakub Jelinek writes: > Hi! > > The two patterns that call aarch64_expand_subvti ensure that {low,high}_in1 > is a register, while {low,high}_in2 can be a register or immediate. > subdi3_compare1_imm uses the aarch64_plus_immediate predicate for its last > two operands (the value and negated value

[PATCH] cfgexpand: Update partition size when merging variables

2020-01-22 Thread Richard Sandiford
. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-22 Richard Sandiford gcc/ * cfgexpand.c (union_stack_vars): Update the size. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general/stack_vars_1.c: New test. --- gcc/cfgexpand.c

[PATCH] cprop: Don't replace fixed hard regs with pseudos [PR93124]

2020-01-22 Thread Richard Sandiford
ith a constant is still potentially useful though, since we'll only make the change if the insn pattern allows it. This is part 1 of the fix for PR93124. Part 2 contains the testcase. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-22 Richard Sandiford gcc/

[PATCH] auto-inc-dec: Don't add incs/decs to bare CLOBBERs [PR93124]

2020-01-22 Thread Richard Sandiford
fix. Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Richard 2020-01-22 Richard Sandiford gcc/ PR rtl-optimization/93124 * auto-inc-dec.c (merge_in_block): Don't add auto inc/decs to bare USE and CLOBBER insns. gcc/testsuite/ * gcc.dg/to

[committed] Extend r279588 to g++.dg/ext/sve-sizeless-1.C

2020-01-22 Thread Richard Sandiford
In r279588 I'd for some reason only patched g++.dg/ext/sve-sizeless-2.C, even though g++.dg/ext/sve-sizeless-1.C has the same problem. Tested on aarch64-linux-gnu and aarch64_be-elf, applied. Richard 2020-01-22 Richard Sandiford gcc/testsuite/ * g++.dg/ext/sve-sizeless-1.C:

[committed] Fix gcc.target/aarch64/sve/sel_3.c for big-endian targets

2020-01-22 Thread Richard Sandiford
A pasto in this test meant that we needed extra reverse instructions for big-endian targets. Tested on aarch64-linux-gnu and aarch64_be-elf, applied. Richard 2020-01-22 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/sel_3.c (permute_vnx4sf): Take __SVFloat32_t

[committed] Skip gcc.target/aarch64/sve/tls_preserve* for emultls targets

2020-01-22 Thread Richard Sandiford
These tests are supposed to be testing the tlsdesc handling and so don't apply to emultls targets. Tested on aarch64-linux-gnu and aarch64_be-elf, applied. Richard 2020-01-22 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/tls_preserve_1.c: Require tls_n

Re: [PATCH, v2] wwwdocs: e-mail subject lines for contributions

2020-01-22 Thread Richard Sandiford
"Richard Earnshaw (lists)" writes: > On 21/01/2020 17:20, Jason Merrill wrote: >> On 1/21/20 10:40 AM, Richard Earnshaw (lists) wrote: >>> On 21/01/2020 15:39, Jakub Jelinek wrote: On Tue, Jan 21, 2020 at 03:33:22PM +, Richard Earnshaw (lists) wrote: >> Some examples would be us

Re: [PATCH] wide-int: i386: Fix ICEs on TImode signed overflow add/sub patterns [PR93376]

2020-01-23 Thread Richard Sandiford
Jakub Jelinek writes: > Hi! > > The following testcase ICEs, because during try_combine of i3: > (insn 18 17 19 2 (parallel [ > (set (reg:CCO 17 flags) > (eq:CCO (plus:OI (sign_extend:OI (reg:TI 96)) > (const_int 1 [0x1])) > (
