Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 22, 2020, at 1:35 PM, H.J. Lu wrote: >> On Tue, Sep 22, 2020 at 11:25 AM Qing Zhao > <mailto:qing.z...@oracle.com>> wrote: >>>> On Sep 22, 2020, at 11:31 AM, Richard Sandiford >>>> wrote: >>>> Taking each

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 22, 2020, at 12:06 PM, Richard Sandiford >> wrote: >>>>> >>>>> The following is what I see from i386.md: (I didn’t look at how >>>>> “UNSPEC_volatile” is used in data flow analysis in GCC yet) >>>

Re: [PATCH] vect: Fix epilogue loop handling of partial vectors

2020-09-23 Thread Richard Sandiford
"Kewen.Lin" writes: > on 2020/9/22 10:34 PM, Richard Sandiford wrote: >> Also, while splitting out the logic that handles epilogues with >> constant iterations, I added a check to make sure that we don't >> try to use partial vectors to vectorise a single-scalar

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 23, 2020, at 5:43 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 22, 2020, at 1:35 PM, H.J. Lu wrote: >>>> On Tue, Sep 22, 2020 at 11:25 AM Qing Zhao >>> <mailto:qing.z...@orac

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 23, 2020, at 6:05 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford >>>> wrote: >>>>>>>

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: Dropping them is fine with me FWIW. That seems like a natural use for the new hook: drop zeroing that isn't actively wrong, but isn't likely to be useful either. >>> >>> Okay, I will add a new hook for this purpose. >> >> It doesn't need to be a new hook. The

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

2020-09-23 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 23, 2020, at 9:32 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford >>>> wrote: >>>> >>>> Qing Zhao mailto:qing.z...@ora

[PATCH] arm: Fix canary address calculation for non-PIC

2020-09-23 Thread Richard Sandiford
For non-PIC, the stack protector patterns did: rtx mem = XEXP (force_const_mem (SImode, operands[1]), 0); emit_move_insn (operands[2], mem); Here, operands[1] is the address of the canary (&__stack_chk_guard) and operands[2] is the register that we want to move that address in

[committed] aarch64: Add a couple of extra stack-protector tests

2020-09-23 Thread Richard Sandiford
These tests were inspired by corresponding arm ones. They already pass. Tested on aarch64-linux-gnu and aarch64_be-elf, pushed to master. Richard gcc/testsuite/ * gcc.target/aarch64/stack-protector-3.c: New test. * gcc.target/aarch64/stack-protector-4.c: Likewise. --- .../gcc.

[committed] aarch64: Prevent canary address being spilled to stack

2020-09-23 Thread Richard Sandiford
This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886 for aarch64: under high register pressure, the -fstack-protector code might spill the address of the canary onto the stack and reload it at the test site, giving an attacker the opportunity to change the expected canary value. This

[PATCH] arm: Add a couple of extra stack-protector tests

2020-09-23 Thread Richard Sandiford
These tests were inspired by the corresponding aarch64 ones that I just committed. They already pass. Tested on arm-linux-gnueabi, arm-linux-gnueabihf and armeb-eabi. OK for trunk? Richard gcc/testsuite/ * gcc.target/arm/stack-protector-5.c: New test. * gcc.target/arm/stack-pro

Re: [PATCH] arm: Add a couple of extra stack-protector tests

2020-09-24 Thread Richard Sandiford
Kyrylo Tkachov writes: > Hi Richard, > >> -Original Message----- >> From: Richard Sandiford >> Sent: 23 September 2020 19:34 >> To: gcc-patches@gcc.gnu.org >> Cc: ni...@redhat.com; Richard Earnshaw ; >> Ramana Radhakrishnan ; Kyrylo >> Tkachov

Re: [PATCH PR96757] aarch64: ICE during GIMPLE pass: vect

2020-09-24 Thread Richard Sandiford
Hi, "duanbo (C)" writes: > Sorry for the late reply. My time to apologise for the late reply. > Thanks for your suggestions. I have modified accordingly. > Attached please find the v1 patch. Thanks, the logic to choose which precision we pick looks good. But I think the build_mask_conversions

Re: [PATCH v3 1/2] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-24 Thread Richard Sandiford
xionghu luo writes: > @@ -2658,6 +2659,43 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall > *stmt, convert_optab optab) > > #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn > > +/* Expand VEC_SET internal functions. */ > + > +static void > +expand_vec_set_optab_fn

[PATCH] arm: Fix fp16 move patterns for base MVE

2020-09-25 Thread Richard Sandiford
This patch fixes ICEs in gcc.dg/torture/float16-basic.c for -march=armv8.1-m.main+mve -mfloat-abi=hard. The problem was that an fp16 argument was (rightly) being passed in FPRs, but the fp16 move patterns only handled GPRs. LRA then cycled trying to look for a way of handling the FPR. It looks l

Re: [PATCH] middle-end/96814 - fix VECTOR_BOOLEAN_TYPE_P CTOR RTL expansion

2020-09-25 Thread Richard Sandiford
Richard Biener writes: > The RTL expansion code for CTORs doesn't handle VECTOR_BOOLEAN_TYPE_P > with bit-precision elements correctly as the testcase shows before > the PR97085 fix. The following makes it do the correct thing > (not 100% sure for CTOR of sub-vectors due to the lack of a testcase

Re: [PATCH v2 2/2] rs6000: Expand vec_insert in expander instead of gimple [PR79251]

2020-09-25 Thread Richard Sandiford
Richard Biener writes: > On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool > wrote: >> >> Hi! >> >> On Thu, Sep 24, 2020 at 04:55:21PM +0200, Richard Biener wrote: >> > Btw, on x86_64 the following produces sth reasonable: >> > >> > #define N 32 >> > typedef int T; >> > typedef T V __attribute__

Re: [PATCH] aarch64: Do not alter force_reg returned rtx expanding pauth builtins

2020-09-25 Thread Richard Sandiford
Andrea Corallo writes: > Hi Richard, > > thanks for reviewing > > Richard Sandiford writes: > >> Andrea Corallo writes: >>> Hi all, >>> >>> having a look for force_reg returned rtx later on modified I've found >>> this other case

Re: One issue with default implementation of zero_call_used_regs

2020-09-25 Thread Richard Sandiford
Qing Zhao writes: > Hi, Richard, > > As you suggested, I added a default implementation of the target hook > “zero_call_used_regs (HARD_REG_SET)” as follows in my latest patch > > > /* The default hook for TARGET_ZERO_CALL_USED_REGS. */ > > void > default_zero_call_used_regs (HARD_REG_SET need_

Re: [PATCH] middle-end/96814 - fix VECTOR_BOOLEAN_TYPE_P CTOR RTL expansion

2020-09-25 Thread Richard Sandiford
Richard Biener writes: >> What do we allow for non-boolean constructors. E.g. for: >> >> v2hi = 0xf001; >> >> do we allow the CONSTRUCTOR to be { 0xf001 }? Is the type of an >> initialiser value allowed to be arbitrarily different from the type >> of the elements being initialised? >> >> Or

Re: [PATCH] middle-end/96814 - fix VECTOR_BOOLEAN_TYPE_P CTOR RTL expansion

2020-09-25 Thread Richard Sandiford
Richard Biener writes: > On Fri, 25 Sep 2020, Richard Sandiford wrote: > >> Richard Biener writes: >> >> What do we allow for non-boolean constructors. E.g. for: >> >> >> >> v2hi = 0xf001; >> >> >> >> do we allow the

Re: [PATCH v4 1/3] IFN: Implement IFN_VEC_SET for ARRAY_REF with VIEW_CONVERT_EXPR

2020-09-25 Thread Richard Sandiford
xionghu luo writes: > @@ -2658,6 +2659,45 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall > *stmt, convert_optab optab) > > #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn > > +/* Expand VEC_SET internal functions. */ > + > +static void > +expand_vec_set_optab_fn

Re: One issue with default implementation of zero_call_used_regs

2020-09-25 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 25, 2020, at 7:53 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>> Hi, Richard, >>> >>> As you suggested, I added a default implementation of the target hook >>> “zero_call_used_regs (HARD_RE

Re: One issue with default implementation of zero_call_used_regs

2020-09-25 Thread Richard Sandiford
Qing Zhao writes: >> On Sep 25, 2020, at 10:28 AM, Richard Sandiford >> wrote: >> >> Qing Zhao writes: >>>> On Sep 25, 2020, at 7:53 AM, Richard Sandiford >>>> wrote: >>>> >>>>

Re: One issue with default implementation of zero_call_used_regs

2020-09-25 Thread Richard Sandiford
Qing Zhao writes: > Last question, in the following code portion: > > /* Now we get a hard register set that need to be zeroed, pass it to > target to generate zeroing sequence. */ > HARD_REG_SET zeroed_hardregs; > start_sequence (); > zeroed_hardregs = targetm.calls.zero_call_used_r

Re: [PATCH V2] aarch64: Do not alter force_reg returned rtx expanding pauth builtins

2020-09-28 Thread Richard Sandiford
Andrea Corallo writes: > Hi all, > > here is the reworked patch addressing Richard's suggestions. > > Regtested and bootstrapped on aarch64-linux-gnu. > > Okay for trunk? OK, thanks. Richard

Re: [PATCH v2 3/16]middle-end Add basic SLP pattern matching scaffolding.

2020-09-29 Thread Richard Sandiford
Richard Biener writes: >> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo, >> > > &tree_size, bst_map); >> > >if (node != NULL) >> > > { >> > > + /* Temporarily allow add_stmt calls again. */ >> > > + vinfo->stmt_vec_info_ro =

Re: [PATCH v2 6/16]middle-end Add Complex Addition with rotation detection

2020-09-29 Thread Richard Sandiford
Tamar Christina writes: > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index > 2b46286943778e16d95b15def4299bcbf8db7eb8..71e226505b2619d10982b59a4ebbed73a70f29be > 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -6132,6 +6132,17 @@ floating-point mode. > > This pattern is not

aarch64/arm: GCC 10 backports

2020-09-29 Thread Richard Sandiford
I've backported the following SVE ACLE and stack-protector patches to GCC 10. The arm one was approved last week. Tested on aarch64-linux-gnu and arm-linux-gnueabihf. Richard >From 0559badf0176b257d3cba89f8eb4b08948216002 Mon Sep 17 00:00:00 2001 From: Richard Sandiford Date: Tue

Ping: [PATCH] arm: Add new vector mode macros

2020-09-29 Thread Richard Sandiford
Ping Richard Sandiford writes: > Kyrylo Tkachov writes: >> This looks like a productive way forward to me. >> Okay if the other maintainers don't object by the end of the week. > > Thanks. Dennis pointed out off-list that it regressed > armv8_2-fp16-arith-

[committed][AArch64] Don't apply mode_for_int_vector to scalars

2019-10-23 Thread Richard Sandiford
aarch64_emit_approx_sqrt handles both vectors and scalars and was using mode_for_int_vector even for the scalar case. Although that happened to work, it isn't how mode_for_int_vector is supposed to be used. Tested on aarch64-linux-gnu and applied as r277311. Richard 2019-10-23 Ri

RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-23 Thread Richard Sandiford
't be posting the vectoriser patch for a few days, hence the RFC/A tag. Tested individually on aarch64-linux-gnu and as a series on x86_64-linux-gnu. OK to install? Or if not yet, does the idea look OK? I'll post some follow-up patches too. Richard 2019-10-23 Richard Sandiford

Replace mode_for_int_vector with related_int_vector_mode

2019-10-23 Thread Richard Sandiford
chance to pick its preferred vector mode for the given element mode and size. Tested individually on aarch64-linux-gnu and as a series on x86_64-linux-gnu. OK to install? Richard 2019-10-23 Richard Sandiford gcc/ * machmode.h (mode_for_int_vector): Delete

Add build_truth_vector_type_for_mode

2019-10-23 Thread Richard Sandiford
number of type functions by one. Tested individually on aarch64-linux-gnu and as a series on x86_64-linux-gnu. OK to install? Richard 2019-10-23 Richard Sandiford gcc/ * tree.h (build_truth_vector_type_for_mode): Declare. * tree.c (build_truth_vector_type_for_mode): New

Remove build_{same_sized_,}truth_vector_type

2019-10-23 Thread Richard Sandiford
hich truth_type_for would pass a size of zero for BLKmode vector types. Tested individually on aarch64-linux-gnu and as a series on x86_64-linux-gnu. OK to install? Richard 2019-10-23 Richard Sandiford gcc/ * tree.h (build_truth_vector_type): Delete. (build_same_sized_tru

Pass the data vector mode to get_mask_mode

2019-10-23 Thread Richard Sandiford
. OK to install? Richard 2019-10-23 Richard Sandiford gcc/ * target.def (get_mask_mode): Take a vector mode itself as argument, instead of properties about the vector mode. * doc/tm.texi: Regenerate. * targhooks.h (default_get_mask_mode): Update to reflect new

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-23 Thread Richard Sandiford
Richard Biener writes: > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford > wrote: >> >> This patch is the first of a series that tries to remove two >> assumptions: >> >> (1) that all vectors involved in vectorisation must be the same size >> >&

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-23 Thread Richard Sandiford
Richard Biener writes: > On Wed, Oct 23, 2019 at 1:51 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford >> > wrote: >> >> >> >> This patch is the first of a series that

Fix reductions for fully-masked loops

2019-10-24 Thread Richard Sandiford
definitely an improvement for SVE though, since it means we can lift the old restriction of not using fully-masked loops for reduction chains. Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu. OK to install? Richard 2019-10-24 Richard Sandiford gcc/ * tree-vect

Re: Pass the data vector mode to get_mask_mode

2019-10-24 Thread Richard Sandiford
Bernhard Reutner-Fischer writes: > On 23 October 2019 13:16:19 CEST, Richard Sandiford > wrote: > >>+++ gcc/config/gcn/gcn.c 2019-10-23 12:13:54.091122156 +0100 >>@@ -3786,8 +3786,7 @@ gcn_expand_builtin (tree exp, rtx target >>a vector.

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-24 Thread Richard Sandiford
"H.J. Lu" writes: > On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford >> > wrote: >> >> >> >> This patch is the first of a serie

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-25 Thread Richard Sandiford
Richard Biener writes: > On Wed, Oct 23, 2019 at 2:12 PM Richard Sandiford > wrote: >> >> Richard Biener writes: >> > On Wed, Oct 23, 2019 at 1:51 PM Richard Sandiford >> > wrote: >> >> >> >> Richard Biener writes: >> >

Re: [SVE] PR91272

2019-10-25 Thread Richard Sandiford
Hi Prathamesh, I've just committed a patch that fixes a large number of SVE reduction-related failures. Could you rebase and retest on top of that? Sorry for messing you around, but regression testing based on the state before the patch wouldn't have been that meaningful. In particular... Prath

[committed] Update SVE tests for recent XPASSes

2019-10-25 Thread Richard Sandiford
-linux-gnu (with and without SVE) and applied as r277441. Richard 2019-10-25 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/loop_add_5.c: Remove XFAILs for tests that now pass. * gcc.target/aarch64/sve/reduc_1.c: Likewise. * gcc.target/aarch64/sve

[committed] Fix failure in gcc.target/sve/reduc_strict_3.c

2019-10-25 Thread Richard Sandiford
Unwanted unrolling meant that we had more single-precision FADDAs than expected. Tested on aarch64-linux-gnu (with and without SVE) and applied as r277442. Richard 2019-10-25 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/reduc_strict_3.c (double_reduc1): Prevent

[0/n] Support multiple vector sizes for vectorisation

2019-10-25 Thread Richard Sandiford
This is a continuation of the patch series I started on Wednesday, this time posted under a covering message. Parts 1-5 were: [1/n] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01634.html [2/n] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01637.html [3/n] https://gcc.gnu.org/ml/gcc-patches/2019-

[6/n] Use build_vector_type_for_mode in get_vectype_for_scalar_type_and_size

2019-10-25 Thread Richard Sandiford
lding the type.) 2019-10-24 Richard Sandiford gcc/ * tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): If targetm.vectorize.preferred_simd_mode returns an integer mode, use mode_for_vector to decide what the vector type's mode should actuall

[7/n] Use consistent compatibility checks in vectorizable_shift

2019-10-25 Thread Richard Sandiford
igned amounts or unsigned shifts by signed amounts; verify_gimple_assign_binary is happy with those. This patch therefore goes for a middle ground of checking both TYPE_MODE and TYPE_VECTOR_SUBPARTS, using the same condition in both places. 2019-10-24 Richard Sandiford gcc/ * t

[8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes

2019-10-25 Thread Richard Sandiford
. A later patch will pass this mode to targetm.vectorize.related_mode to get the vector mode for a given element mode. Until then, the modes simply act as an alternative way of specifying the vector size. 2019-10-24 Richard Sandiford gcc/ * target.h (vector_sizes, auto_vector_size

[9/n] Replace vec_info::vector_size with vec_info::vector_mode

2019-10-25 Thread Richard Sandiford
This patch replaces vec_info::vector_size with vec_info::vector_mode, but for now continues to use it as a way of specifying a single vector size. This makes it easier for later patches to use related_vector_mode instead. 2019-10-24 Richard Sandiford gcc/ * tree-vectorizer.h

[10/n] Make less use of get_same_sized_vectype

2019-10-25 Thread Richard Sandiford
times let us change the original scalar type to a "nicer" scalar type, but that isn't what's happening here.) This is a prerequisite to supporting multiple vector sizes in the same vec_info. 2019-10-24 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizabl

[11/n] Support vectorisation with mixed vector sizes

2019-10-25 Thread Richard Sandiford
patch also seemed like a good opportunity to add some more dump messages: one to make it clear which vector size/mode was being used when analysis passed or failed, and another to say when we've decided to skip a redundant vector size/mode. 2019-10-24 Richard Sandiford gcc/ *

[12/n] [AArch64] Support vectorising with multiple vector sizes

2019-10-25 Thread Richard Sandiford
10-24 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_vectorize_related_mode): New function. (aarch64_autovectorize_vector_modes): Also add V4HImode and V2SImode. (TARGET_VECTORIZE_RELATED_MODE): Define. gcc/testsuite/ * gcc.dg/vect/vect-outer

[13/n] Allow mixed vector sizes within a single vectorised stmt

2019-10-25 Thread Richard Sandiford
- there's no need to compute nunits_vectype if its element type is the same as STMT_VINFO_VECTYPE's. - it's useful to distinguish the nunits_vectype from the main vectype in dump messages - when reusing the existing STMT_VINFO_VECTYPE, it's useful to say so in the dump, and sa

[14/n] Vectorise conversions between differently-sized integer vectors

2019-10-25 Thread Richard Sandiford
This patch adds AArch64 patterns for converting between 64-bit and 128-bit integer vectors, and makes the vectoriser and expand pass use them. 2019-10-24 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_conversion): Extend the non-widening and non-narrowing path to

[C++] Fix interaction between aka changes and DR1558 (PR92206)

2019-10-25 Thread Richard Sandiford
ssume the type is validated elsewhere. It seems a rather clunky fix, sorry, but restoring the TYPE_MAIN_VARIANT (...) isn't compatible with the aka stuff. Bootstrapped & regression-tested on aarch64-linux-gnu. OK to install? Richard 2019-10-25 Richard Sandiford gcc/cp/

Re: [PATCH] Fix PR92222

2019-10-26 Thread Richard Sandiford
Richard Biener writes: > We have to check each operand for being in a pattern, not just the > first when avoiding build from scalars (we could possibly handle > the special case of some of them being the pattern stmt root, but > that would be a followup improvement). > > Bootstrap & regtest runnin

Re: [SVE] PR91272

2019-10-27 Thread Richard Sandiford
Prathamesh Kulkarni writes: > @@ -10288,6 +10261,23 @@ vectorizable_condition (stmt_vec_info stmt_info, > gimple_stmt_iterator *gsi, > vect_finish_stmt_generation (stmt_info, new_stmt, gsi); > vec_compare = vec_compare_name; > } > + > + if (

Re: [PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-10-27 Thread Richard Sandiford
Coming back to this just in time for it not to be three months later, sorry... I still think it would be better to consolidate ifcvt a bit more, rather than effectively duplicate bits of cond_move_process_if_block in noce_convert_multiple_sets. But perhaps it was a historical mistake to have two

Re: [PATCH 6/9] ifcvt: Extract cc comparison from jump.

2019-10-27 Thread Richard Sandiford
Robin Dapp writes: > This patch extracts a cc comparison from the initial compare/jump > insn and allows it to be passed to noce_emit_cmove and > emit_conditional_move. > --- > gcc/ifcvt.c | 68 > gcc/optabs.c | 7 -- > gcc/optabs.h | 2

Re: [PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-10-27 Thread Richard Sandiford
Robin Dapp writes: > This patch duplicates the previous noce_emit_cmove logic. First it > passes the canonical comparison, emits the sequence and costs it. > Then, a second, separate sequence is created by passing the cc compare > we extracted before. The costs of both sequences are compared and
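
The "emit two variants, cost both, keep the cheaper" idea in this snippet can be sketched in isolation as below. This is only an illustration and not the ifcvt code: `Sequence` and `choose_cheaper` are hypothetical names, the costs are plain integers, and the tie-breaking rule is an assumption, since the snippet is cut off before saying how ties are handled.

```cpp
#include <optional>

// Hypothetical stand-in for an emitted instruction sequence; not a GCC type.
struct Sequence {
  int id;         // illustrative tag so callers can tell candidates apart
  unsigned cost;  // estimated cost, lower is better
};

// Pick between two candidate sequences. Either may be absent if
// emission failed. On a tie this sketch keeps the first candidate,
// which is an assumption rather than documented behaviour.
std::optional<Sequence> choose_cheaper(const std::optional<Sequence> &first,
                                       const std::optional<Sequence> &second) {
  if (!first) return second;
  if (!second) return first;
  return second->cost < first->cost ? second : first;
}
```

If neither variant could be emitted the result is empty, and a caller would presumably fall back to leaving the original code alone.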

Re: [PATCH 9/9] ifcvt: Also pass reversed cc comparison.

2019-10-27 Thread Richard Sandiford
Robin Dapp writes: > When then and else are reversed, we would swap new_val and old_val. > The same has to be done for our new code paths. > Also, emit_conditional_move may perform swapping. In case we need to > swap, the cc comparison also needs to be swapped and for this we pass > the reversed

Re: Add a simulate_builtin_function_decl langhook

2019-10-28 Thread Richard Sandiford
Jeff Law writes: > On 10/5/19 5:29 AM, Richard Sandiford wrote: >> >> Sure. This message is going to go to the other extreme, sorry, but I'm >> not sure which part will be the most convincing (if any). > No worries. Worst case going to the other extreme is I hav

[committed][AArch64] Handle scalars in cmp and shift immediate queries

2019-10-29 Thread Richard Sandiford
77556. Richard 2019-10-29 Richard Sandiford gcc/ * config/aarch64/aarch64.c (aarch64_sve_cmp_immediate_p) (aarch64_simd_shift_imm_p): Accept scalars as well as vectors. * config/aarch64/predicates.md (aarch64_sve_cmp_vsc_immediate) (aarch64_sve_cmp_vsd_immediate): Ac

[committed][AArch64] Add FFR and FFRT registers

2019-10-29 Thread Richard Sandiford
-29 Richard Sandiford gcc/ * config/aarch64/aarch64.md (FFR_REGNUM, FFRT_REGNUM): New constants. * config/aarch64/aarch64.h (FIRST_PSEUDO_REGISTER): Bump to FFRT_REGNUM + 1. (FFR_REGS, PR_AND_FFR_REGS): New register classes. (REG_CLASS_NAMES

[committed][AArch64] Extend SVE reverse permutes to predicates

2019-10-29 Thread Richard Sandiford
This is tested by the main SVE ACLE patches, but since it affects the evpc routines, it seemed worth splitting out. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r277562. Richard 2019-10-29 Richard Sandiford gcc/ * config/aarch64/aarch64-sve.md

[committed][AArch64] Add support for the SVE PCS

2019-10-29 Thread Richard Sandiford
E register save as a stack probe too, and thus prevents the save from being shrink-wrapped if stack clash protection is enabled. The changelog describes the low-level details. Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf. Applied as r277564. Richard 2019-10-29 Richard S

Re: [PATCH] Fix PR92162

2019-10-29 Thread Richard Sandiford
e. AIUI this is what the REDUC_IDX on the COND_EXPR now tells us. Reverting that fixes ICEs in gcc.target/aarch64/sve/clastb*. Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu. Thanks, Richard 2019-10-29 R

[15/n] Consider building nodes from scalars in vect_slp_analyze_node_operations

2019-10-29 Thread Richard Sandiford
ater during the analysis phase, e.g. because the target doesn't support a particular vector operation. This is needed to avoid regressions with a later patch. 2019-10-29 Richard Sandiford gcc/ * tree-vect-slp.c (vect_contains_pattern_stmt_p): New function. (vect_slp

[16/n] Apply maximum nunits for BB SLP

2019-10-29 Thread Richard Sandiford
cost issue though; see PR92265 for details. 2019-10-29 Richard Sandiford gcc/ * tree-vectorizer.h (vect_get_vector_types_for_stmt): Take an optional maximum nunits. (get_vectype_for_scalar_type): Likewise. Also declare a form that takes a

Re: Deprecating cc0 (and consequently cc0 targets)

2019-10-30 Thread Richard Sandiford
Richard Biener writes: > On Tue, Oct 29, 2019 at 8:34 PM Jeff Law wrote: >> >> On 10/29/19 6:26 AM, John Paul Adrian Glaubitz wrote: >> > Hello! >> > >> > We have raised $5000 to support anyone willing to work on this for the >> > m68k target [1]. We really need the m68k to stay as it's essential

Re: RFC/A: Add a targetm.vectorize.related_mode hook

2019-10-30 Thread Richard Sandiford
The series posted so far now shows how the hook would be used in practice. Just wanted to follow up on some points here: Richard Sandiford writes: > Richard Biener writes: >> On Wed, Oct 23, 2019 at 2:12 PM Richard Sandiford >> wrote: >>> >>> Richard Biener w

Re: [8/n] Replace autovectorize_vector_sizes with autovectorize_vector_modes

2019-10-30 Thread Richard Sandiford
Richard Biener writes: > On Fri, Oct 25, 2019 at 2:37 PM Richard Sandiford > wrote: >> >> This is another patch in the series to remove the assumption that >> all modes involved in vectorisation have to be the same size. >> Rather than have the target provide a lis

Re: [RFC PATCH] targetm.omp.device_kind_arch_isa and OpenMP declare variant kind/arch/isa handling

2019-10-31 Thread Richard Sandiford
Thanks for implementing this. Jakub Jelinek writes: > On Wed, Oct 30, 2019 at 02:12:30PM +, Szabolcs Nagy wrote: >> On 29/10/2019 17:15, Jakub Jelinek wrote: >> > +void f03 (void); >> > +#pragma omp declare variant (f03) match >> > (device={kind(any),arch(x86_64),isa(avx512f,avx512bw)}) >> >

Re: [committed][AArch64] Add support for the SVE PCS

2019-10-31 Thread Richard Sandiford
sted on aarch64-linux-gnu, Richard The SVE PCS support broke go, D and Ada because those languages don't call TARGET_INIT_BUILTINS. We therefore ended up trying to get the TYPE_MAIN_VARIANT of a null __SVBool_t. We shouldn't really need to apply TYPE_MAIN_VARIANT there anyway, since the ABI-defin

[committed][AArch64] Split gcc.target/aarch64/sve/reduc_strict_3.c

2019-10-31 Thread Richard Sandiford
lable vectors. I think the test probably predates support for variable-length loop-aware SLP. Tested on aarch64-linux-gnu and applied as r277681. Richard 2019-10-31 Richard Sandiford gcc/testsuite/ * gcc.target/aarch64/sve/reduc_strict_3.c: Split all but the first function out

[committed][AArch64] Split gcc.target/aarch64/sve/vcond_4*

2019-10-31 Thread Richard Sandiford
ly tests what's left in vcond_4.c, but that too is OK, since the point of the test was to compare the default handling of each comparison in vcond_4.c with the -fno-trapping-math equivalent. Tested on aarch64-linux-gnu and applied as r277682. Richard 2019-10-31 Richard Sandiford gc

[committed][AArch64] Fix g++.target/aarch64/sve/vcond_1_run.C

2019-10-31 Thread Richard Sandiford
This had been failing since a mass renaming. Noticed it a few times before but somehow never got around to fixing it. Tested on aarch64-linux-gnu and applied as r277683. Richard 2019-10-31 Richard Sandiford gcc/testsuite/ * g++.target/aarch64/sve/vcond_1_run.C: Update test name

Re: [PATCH][vect] Clean up orig_loop_vinfo from vect_analyze_loop

2019-11-01 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > Hi, > > After my patch I believe the only way orig_loop_vinfo is not null when > calling vect_analyze_loop is when it is called for an epilogue and in > that case we no longer use that variable, since > LOOP_VINFO_ORIG_LOOP_INFO is already set for the epilogue's

[D] Remove unchecked to_constant in VECTOR_TYPE handling

2019-11-04 Thread Richard Sandiford
recognise. The brace indentation matches the surrounding style. Tested on aarch64-linux-gnu. OK to install? Richard 2019-11-04 Richard Sandiford gcc/d/ * d-builtins.cc (build_frontend_type): Cope with variable TYPE_VECTOR_SUBPARTS. Index: gcc/d/d-b

[0/4] Vector epilogues vs. mixed vector sizes

2019-11-04 Thread Richard Sandiford
This patch bridges the gap between the recent epilogue vectorisation patches and https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01822.html . I don't have any evidence that the series is independently useful, but it shouldn't make things worse either. Tested individually on aarch64-linux-gnu and as

[1/4] Restructure vect_analyze_loop

2019-11-04 Thread Richard Sandiford
"simdlen" in the new vect_epilogues condition. That should be a separate change though. This may conflict with Andre's fix for libgomp; I'll adjust if that goes in first. 2019-11-04 Richard Sandiford gcc/ * tree-vect-loop.c (vect_analyze_loop): Break out of the main

[2/4] Check the VF is small enough for an epilogue loop

2019-11-04 Thread Richard Sandiford
for correctness as well. This can happen if the sizes returned by autovectorize_vector_sizes happen to be out of order, e.g. because the target prefers smaller vectors. It can also happen with later patches if two vectorisation attempts happen to end up with the same VF. 2019-11-04 Richard

[4/4] Use scan-tree-dump instead of scan-tree-dump-times for some vect tests

2019-11-04 Thread Richard Sandiford
currences. All that matters is zero vs. nonzero. 2019-11-04 Richard Sandiford gcc/testsuite/ * gcc.dg/vect/slp-9.c: Use scan-tree-dump rather than scan-tree-dump-times. * gcc.dg/vect/slp-widen-mult-s16.c: Likewise. * gcc.dg/vect/slp-widen-mult-u8.c: Likewise.

[3/4] Don't vectorise single-iteration epilogues

2019-11-04 Thread Richard Sandiford
With a later patch I saw a case in which we peeled a single iteration for gaps but didn't need to peel further iterations to make up a full vector. We then tried to vectorise the single-iteration epilogue. 2019-11-04 Richard Sandiford gcc/ * tree-vect-loop.c (vect_analyze

Re: [16/n] Apply maximum nunits for BB SLP

2019-11-05 Thread Richard Sandiford
Richard Biener writes: > On Tue, Oct 29, 2019 at 6:05 PM Richard Sandiford > wrote: >> >> The BB vectoriser picked vector types in the same way as the loop >> vectoriser: it picked a vector mode/size for the region and then >> based all the vector types off tha

[0/6] Optionally pick the cheapest loop_vec_info

2019-11-05 Thread Richard Sandiford
This series adds a mode in which we try to vectorise loops once for each supported vector mode combination and then pick the one with the lowest cost. There are only really two patches for that: one to add the feature and another to enable it by default for SVE. However, for it to work as hoped,
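
The selection strategy described in this cover letter (try each vector mode, then keep the cheapest successful attempt) can be sketched roughly as follows. This is a standalone illustration, not GCC code: `Candidate` and `pick_cheapest` are hypothetical names, the mode strings are just labels, and the real vectoriser compares loop_vec_infos using a far richer cost model than a single integer.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one vectorisation attempt; not a GCC type.
struct Candidate {
  std::string mode;  // purely illustrative label for the vector mode tried
  bool analysis_ok;  // whether analysis succeeded for this mode
  unsigned cost;     // estimated cost, lower is better (assumed metric)
};

// Return the index of the cheapest candidate that passed analysis,
// or std::nullopt if none did. Ties keep the earlier candidate.
std::optional<std::size_t> pick_cheapest(const std::vector<Candidate> &cands) {
  std::optional<std::size_t> best;
  for (std::size_t i = 0; i < cands.size(); ++i) {
    if (!cands[i].analysis_ok) continue;  // skip failed attempts
    if (!best || cands[i].cost < cands[*best].cost) best = i;
  }
  return best;
}
```

Keeping the earlier candidate on a tie means the order in which modes are tried still acts as a preference, which is an assumption of this sketch rather than something the snippet states.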

[1/6] Fix vectorizable_conversion costs

2019-11-05 Thread Richard Sandiford
nt for the promotion and demotion costs; previously we gave multiple copies the same cost as a single copy. Later patches test this, but it seemed worth splitting out. 2019-11-05 Richard Sandiford gcc/ * tree-vect-stmts.c (vect_model_promotion_demotion_cost): Take the num

[2/6] Don't assign a cost to vectorizable_assignment

2019-11-05 Thread Richard Sandiford
against either the scalar or vector costs. Later patches test this, but it seemed worth splitting out. 2019-11-04 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_assignment): Don't add a cost. Index: g

[3/6] Avoid accounting for non-existent vector loop versioning

2019-11-05 Thread Richard Sandiford
know whether that's true once we've calculated what the runtime threshold would be. 2019-11-04 Richard Sandiford gcc/ * tree-vectorizer.h (vect_apply_runtime_profitability_check_p): New function. * tree-vect-loop-manip.c (vect_loop_versioning): Use it.

[4/6] Optionally pick the cheapest loop_vec_info

2019-11-05 Thread Richard Sandiford
->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-05 Richard Sandiford gcc/ * params.def (vect-compare-loop-costs): New param. * doc/invoke.texi: Document it. * tree-vectorizer.h (_loop_vec_info::vec_outside_c

[5/6] Account for the cost of generating loop masks

2019-11-05 Thread Richard Sandiford
We didn't take the cost of generating loop masks into account, and so tended to underestimate the cost of loops that need multiple masks. 2019-11-05 Richard Sandiford gcc/ * tree-vect-loop.c (vect_estimate_min_profitable_iters): Include the cost of generating loop masks.

[6/6][AArch64] Enable vect-compare-loop-costs by default for SVE

2019-11-05 Thread Richard Sandiford
This patch enables vect-compare-loop-costs by default for SVE, both so that we can compare SVE against Advanced SIMD and so that (with future patches) we can compare multiple SVE vectorisation approaches against each other. I'll apply if the prerequisites are approved. 2019-11-05 Ri

Re: [10/n] Make less use of get_same_sized_vectype

2019-11-05 Thread Richard Sandiford
Richard Biener writes: > On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford > wrote: >> >> Some callers of get_same_sized_vectype were dealing with operands that >> are constant or defined externally, and so have no STMT_VINFO_VECTYPE >> available.

Re: [C++] Fix interaction between aka changes and DR1558 (PR92206)

2019-11-05 Thread Richard Sandiford
Ping Richard Sandiford writes: > One of the changes in r277281 was to make the typedef variant > handling in strip_typedefs pass the raw DECL_ORIGINAL_TYPE to the > recursive call, instead of applying TYPE_MAIN_VARIANT first. > This PR shows that that interacts badly with the impleme

Re: [PATCH] Support multiple registers for the frame pointer

2019-11-05 Thread Richard Sandiford
Dimitar Dimitrov writes: > On Sat, 2 Nov 2019, 19:28:38 EET Kwok Cheung Yeung wrote: >> The AMD GCN architecture uses 64-bit pointers, but the scalar registers >> are 32-bit wide, so pointers must reside in a pair of registers. > ... >> Bootstrapped on x86_64 and tested with no regressions, which

[10a/n] Require equal type sizes for vectorised calls

2019-11-05 Thread Richard Sandiford
-gnu and the series as a whole on x86_64-linux-gnu. 2019-11-04 Richard Sandiford gcc/ * tree-vect-stmts.c (vectorizable_call): Require the types to have the same size. Index: gcc/tree-vect-stmts.c === --- gcc/tree

[11a/n] Avoid retrying with the same vector modes

2019-11-05 Thread Richard Sandiford
trying the same combination of vector modes multiple times. This patch adds a check to prevent that. As before: each patch tested individually on aarch64-linux-gnu and the series as a whole on x86_64-linux-gnu. 2019-11-04 Richard Sandiford gcc/ * tree-vectorizer.h (vec_info::mode_set
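
The kind of check this message describes can be sketched as follows, assuming each vectorisation attempt is summarised by the set of vector modes it used. The `mode_set` alias and `unique_attempts` are hypothetical names for illustration only; the patch's actual data structure and skip condition are not shown in this snippet.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one attempt: the names of the vector modes used.
using mode_set = std::set<std::string>;

// Return the attempts in their original order, dropping any attempt whose
// combination of modes repeats one that was already tried.
std::vector<mode_set> unique_attempts(const std::vector<mode_set> &attempts) {
  std::set<mode_set> seen;
  std::vector<mode_set> result;
  for (const mode_set &modes : attempts) {
    // insert() reports whether the combination was new.
    if (seen.insert(modes).second)
      result.push_back(modes);
  }
  return result;
}
```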

[17/17] Extend can_duplicate_and_interleave_p to mixed-size vectors

2019-11-05 Thread Richard Sandiford
-04 Richard Sandiford gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a gi

Re: [11a/n] Avoid retrying with the same vector modes

2019-11-06 Thread Richard Sandiford
Richard Biener writes: > On Tue, Nov 5, 2019 at 9:25 PM Richard Sandiford > wrote: >> >> Patch 12/n makes the AArch64 port add four entries to >> autovectorize_vector_modes. Each entry describes a different >> vector mode assignment for vector code that mixes 8-bit
