Qing Zhao writes:
>> On Sep 22, 2020, at 1:35 PM, H.J. Lu wrote:
>> On Tue, Sep 22, 2020 at 11:25 AM Qing Zhao <qing.z...@oracle.com> wrote:
>>>> On Sep 22, 2020, at 11:31 AM, Richard Sandiford
>>>> wrote:
>>>> Taking each
Qing Zhao writes:
>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford
>> wrote:
>>>>>
>>>>> The following is what I see from i386.md: (I didn’t look at how
>>>>> “UNSPEC_volatile” is used in data flow analysis in GCC yet)
>>>
"Kewen.Lin" writes:
> on 2020/9/22 10:34 PM, Richard Sandiford wrote:
>> Also, while splitting out the logic that handles epilogues with
>> constant iterations, I added a check to make sure that we don't
>> try to use partial vectors to vectorise a single-scalar
Qing Zhao writes:
>> On Sep 23, 2020, at 5:43 AM, Richard Sandiford
>> wrote:
>>
>> Qing Zhao writes:
>>>> On Sep 22, 2020, at 1:35 PM, H.J. Lu wrote:
>>>> On Tue, Sep 22, 2020 at 11:25 AM Qing Zhao <qing.z...@orac
Qing Zhao writes:
>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford
>> wrote:
>>
>> Qing Zhao <qing.z...@oracle.com> writes:
>>>> On Sep 22, 2020, at 12:06 PM, Richard Sandiford
>>>> wrote:
>>>>>>>
Qing Zhao writes:
Dropping them is fine with me FWIW. That seems like a natural use
for the new hook: drop zeroing that isn't actively wrong, but isn't
likely to be useful either.
>>>
>>> Okay, I will add a new hook for this purpose.
>>
>> It doesn't need to be a new hook. The
Qing Zhao writes:
>> On Sep 23, 2020, at 9:32 AM, Richard Sandiford
>> wrote:
>>
>> Qing Zhao writes:
>>>> On Sep 23, 2020, at 6:05 AM, Richard Sandiford
>>>> wrote:
>>>>
>>>> Qing Zhao <qing.z...@ora
For non-PIC, the stack protector patterns did:
rtx mem = XEXP (force_const_mem (SImode, operands[1]), 0);
emit_move_insn (operands[2], mem);
Here, operands[1] is the address of the canary (&__stack_chk_guard)
and operands[2] is the register that we want to move that address in
These tests were inspired by corresponding arm ones. They already pass.
Tested on aarch64-linux-gnu and aarch64_be-elf, pushed to master.
Richard
gcc/testsuite/
* gcc.target/aarch64/stack-protector-3.c: New test.
* gcc.target/aarch64/stack-protector-4.c: Likewise.
---
.../gcc.
This patch fixes the equivalent of arm bug PR85434/CVE-2018-12886
for aarch64: under high register pressure, the -fstack-protector
code might spill the address of the canary onto the stack and
reload it at the test site, giving an attacker the opportunity
to change the expected canary value.
This
These tests were inspired by the corresponding aarch64 ones that I just
committed. They already pass.
Tested on arm-linux-gnueabi, arm-linux-gnueabihf and armeb-eabi.
OK for trunk?
Richard
gcc/testsuite/
* gcc.target/arm/stack-protector-5.c: New test.
* gcc.target/arm/stack-pro
Kyrylo Tkachov writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Richard Sandiford
>> Sent: 23 September 2020 19:34
>> To: gcc-patches@gcc.gnu.org
>> Cc: ni...@redhat.com; Richard Earnshaw ;
>> Ramana Radhakrishnan ; Kyrylo
>> Tkachov
Hi,
"duanbo (C)" writes:
> Sorry for the late reply.
My time to apologise for the late reply.
> Thanks for your suggestions. I have modified accordingly.
> Attached please find the v1 patch.
Thanks, the logic to choose which precision we pick looks good.
But I think the build_mask_conversions
xionghu luo writes:
> @@ -2658,6 +2659,43 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall
> *stmt, convert_optab optab)
>
> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
>
> +/* Expand VEC_SET internal functions. */
> +
> +static void
> +expand_vec_set_optab_fn
This patch fixes ICEs in gcc.dg/torture/float16-basic.c for
-march=armv8.1-m.main+mve -mfloat-abi=hard. The problem was
that an fp16 argument was (rightly) being passed in FPRs,
but the fp16 move patterns only handled GPRs. LRA then cycled
trying to look for a way of handling the FPR.
It looks l
Richard Biener writes:
> The RTL expansion code for CTORs doesn't handle VECTOR_BOOLEAN_TYPE_P
> with bit-precision elements correctly as the testcase shows before
> the PR97085 fix. The following makes it do the correct thing
> (not 100% sure for CTOR of sub-vectors due to the lack of a testcase
Richard Biener writes:
> On Thu, Sep 24, 2020 at 9:38 PM Segher Boessenkool
> wrote:
>>
>> Hi!
>>
>> On Thu, Sep 24, 2020 at 04:55:21PM +0200, Richard Biener wrote:
>> > Btw, on x86_64 the following produces sth reasonable:
>> >
>> > #define N 32
>> > typedef int T;
>> > typedef T V __attribute__
Andrea Corallo writes:
> Hi Richard,
>
> thanks for reviewing
>
> Richard Sandiford writes:
>
>> Andrea Corallo writes:
>>> Hi all,
>>>
>>> having a look for force_reg returned rtx later on modified I've found
>>> this other case
Qing Zhao writes:
> Hi, Richard,
>
> As you suggested, I added a default implementation of the target hook
> “zero_call_used_regs (HARD_REG_SET)” as follows in my latest patch
>
>
> /* The default hook for TARGET_ZERO_CALL_USED_REGS. */
>
> void
> default_zero_call_used_regs (HARD_REG_SET need_
Richard Biener writes:
>> What do we allow for non-boolean constructors. E.g. for:
>>
>> v2hi = 0xf001;
>>
>> do we allow the CONSTRUCTOR to be { 0xf001 }? Is the type of an
>> initialiser value allowed to be arbitrarily different from the type
>> of the elements being initialised?
>>
>> Or
Richard Biener writes:
> On Fri, 25 Sep 2020, Richard Sandiford wrote:
>
>> Richard Biener writes:
>> >> What do we allow for non-boolean constructors. E.g. for:
>> >>
>> >> v2hi = 0xf001;
>> >>
>> >> do we allow the
xionghu luo writes:
> @@ -2658,6 +2659,45 @@ expand_vect_cond_mask_optab_fn (internal_fn, gcall
> *stmt, convert_optab optab)
>
> #define expand_vec_cond_mask_optab_fn expand_vect_cond_mask_optab_fn
>
> +/* Expand VEC_SET internal functions. */
> +
> +static void
> +expand_vec_set_optab_fn
Qing Zhao writes:
>> On Sep 25, 2020, at 7:53 AM, Richard Sandiford
>> wrote:
>>
>> Qing Zhao writes:
>>> Hi, Richard,
>>>
>>> As you suggested, I added a default implementation of the target hook
>>> “zero_call_used_regs (HARD_RE
Qing Zhao writes:
>> On Sep 25, 2020, at 10:28 AM, Richard Sandiford
>> wrote:
>>
>> Qing Zhao <qing.z...@oracle.com> writes:
>>>> On Sep 25, 2020, at 7:53 AM, Richard Sandiford
>>>> wrote:
>>>>
>>>>
Qing Zhao writes:
> Last question, in the following code portion:
>
> /* Now we get a hard register set that need to be zeroed, pass it to
> target to generate zeroing sequence. */
> HARD_REG_SET zeroed_hardregs;
> start_sequence ();
> zeroed_hardregs = targetm.calls.zero_call_used_r
Andrea Corallo writes:
> Hi all,
>
> here the reworked patch addressing Richard's suggestions.
>
> Regtested and bootstrapped on aarch64-linux-gnu.
>
> Okay for trunk?
OK, thanks.
Richard
Richard Biener writes:
>> > > @@ -2192,6 +2378,17 @@ vect_analyze_slp_instance (vec_info *vinfo,
>> > > &tree_size, bst_map);
>> > >if (node != NULL)
>> > > {
>> > > + /* Temporarily allow add_stmt calls again. */
>> > > + vinfo->stmt_vec_info_ro =
Tamar Christina writes:
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index
> 2b46286943778e16d95b15def4299bcbf8db7eb8..71e226505b2619d10982b59a4ebbed73a70f29be
> 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -6132,6 +6132,17 @@ floating-point mode.
>
> This pattern is not
I've backported the following SVE ACLE and stack-protector patches
to GCC 10. The arm one was approved last week.
Tested on aarch64-linux-gnu and arm-linux-gnueabihf.
Richard
From 0559badf0176b257d3cba89f8eb4b08948216002 Mon Sep 17 00:00:00 2001
From: Richard Sandiford
Date: Tue
Ping
Richard Sandiford writes:
> Kyrylo Tkachov writes:
>> This looks like a productive way forward to me.
>> Okay if the other maintainers don't object by the end of the week.
>
> Thanks. Dennis pointed out off-list that it regressed
> armv8_2-fp16-arith-
aarch64_emit_approx_sqrt handles both vectors and scalars and was using
mode_for_int_vector even for the scalar case. Although that happened
to work, it isn't how mode_for_int_vector is supposed to be used.
Tested on aarch64-linux-gnu and applied as r277311.
Richard
2019-10-23 Ri
't be posting the vectoriser patch
for a few days, hence the RFC/A tag.
Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu. OK to install? Or if not yet, does the idea
look OK?
I'll post some follow-up patches too.
Richard
2019-10-23 Richard Sandiford
chance to pick
its preferred vector mode for the given element mode and size.
Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu. OK to install?
Richard
2019-10-23 Richard Sandiford
gcc/
* machmode.h (mode_for_int_vector): Delete
number of type functions by one.
Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu. OK to install?
Richard
2019-10-23 Richard Sandiford
gcc/
* tree.h (build_truth_vector_type_for_mode): Declare.
* tree.c (build_truth_vector_type_for_mode): New
hich truth_type_for would pass a size of zero for
BLKmode vector types.
Tested individually on aarch64-linux-gnu and as a series on
x86_64-linux-gnu. OK to install?
Richard
2019-10-23 Richard Sandiford
gcc/
* tree.h (build_truth_vector_type): Delete.
(build_same_sized_tru
. OK to install?
Richard
2019-10-23 Richard Sandiford
gcc/
* target.def (get_mask_mode): Take a vector mode itself as argument,
instead of properties about the vector mode.
* doc/tm.texi: Regenerate.
* targhooks.h (default_get_mask_mode): Update to reflect new
Richard Biener writes:
> On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford
> wrote:
>>
>> This patch is the first of a series that tries to remove two
>> assumptions:
>>
>> (1) that all vectors involved in vectorisation must be the same size
>>
>>
Richard Biener writes:
> On Wed, Oct 23, 2019 at 1:51 PM Richard Sandiford
> wrote:
>>
>> Richard Biener writes:
>> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford
>> > wrote:
>> >>
>> >> This patch is the first of a series that
definitely an improvement for SVE though,
since it means we can lift the old restriction of not using fully-masked
loops for reduction chains.
Tested on aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
OK to install?
Richard
2019-10-24 Richard Sandiford
gcc/
* tree-vect
Bernhard Reutner-Fischer writes:
> On 23 October 2019 13:16:19 CEST, Richard Sandiford
> wrote:
>
>>+++ gcc/config/gcn/gcn.c 2019-10-23 12:13:54.091122156 +0100
>>@@ -3786,8 +3786,7 @@ gcn_expand_builtin (tree exp, rtx target
>>a vector.
"H.J. Lu" writes:
> On Wed, Oct 23, 2019 at 4:51 AM Richard Sandiford
> wrote:
>>
>> Richard Biener writes:
>> > On Wed, Oct 23, 2019 at 1:00 PM Richard Sandiford
>> > wrote:
>> >>
>> >> This patch is the first of a serie
Richard Biener writes:
> On Wed, Oct 23, 2019 at 2:12 PM Richard Sandiford
> wrote:
>>
>> Richard Biener writes:
>> > On Wed, Oct 23, 2019 at 1:51 PM Richard Sandiford
>> > wrote:
>> >>
>> >> Richard Biener writes:
>> >
Hi Prathamesh,
I've just committed a patch that fixes a large number of SVE
reduction-related failures. Could you rebase and retest on top of that?
Sorry for messing you around, but regression testing based on the state
before the patch wouldn't have been that meaningful. In particular...
Prath
-linux-gnu (with and without SVE) and applied as r277441.
Richard
2019-10-25 Richard Sandiford
gcc/testsuite/
* gcc.target/aarch64/sve/loop_add_5.c: Remove XFAILs for tests
that now pass.
* gcc.target/aarch64/sve/reduc_1.c: Likewise.
* gcc.target/aarch64/sve
Unwanted unrolling meant that we had more single-precision FADDAs
than expected.
Tested on aarch64-linux-gnu (with and without SVE) and applied as r277442.
Richard
2019-10-25 Richard Sandiford
gcc/testsuite/
* gcc.target/aarch64/sve/reduc_strict_3.c (double_reduc1): Prevent
This is a continuation of the patch series I started on Wednesday,
this time posted under a covering message. Parts 1-5 were:
[1/n] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01634.html
[2/n] https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01637.html
[3/n] https://gcc.gnu.org/ml/gcc-patches/2019-
lding the type.)
2019-10-24 Richard Sandiford
gcc/
* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): If
targetm.vectorize.preferred_simd_mode returns an integer mode,
use mode_for_vector to decide what the vector type's mode
should actuall
igned amounts or unsigned shifts by
signed amounts; verify_gimple_assign_binary is happy with those.
This patch therefore goes for a middle ground of checking both TYPE_MODE
and TYPE_VECTOR_SUBPARTS, using the same condition in both places.
2019-10-24 Richard Sandiford
gcc/
* t
.
A later patch will pass this mode to targetm.vectorize.related_mode
to get the vector mode for a given element mode. Until then, the modes
simply act as an alternative way of specifying the vector size.
2019-10-24 Richard Sandiford
gcc/
* target.h (vector_sizes, auto_vector_size
This patch replaces vec_info::vector_size with vec_info::vector_mode,
but for now continues to use it as a way of specifying a single
vector size. This makes it easier for later patches to use
related_vector_mode instead.
2019-10-24 Richard Sandiford
gcc/
* tree-vectorizer.h
times let us change the original scalar type
to a "nicer" scalar type, but that isn't what's happening here.)
This is a prerequisite to supporting multiple vector sizes in the same
vec_info.
2019-10-24 Richard Sandiford
gcc/
* tree-vect-stmts.c (vectorizabl
patch also seemed like a good opportunity to add some more dump
messages: one to make it clear which vector size/mode was being used
when analysis passed or failed, and another to say when we've decided
to skip a redundant vector size/mode.
2019-10-24 Richard Sandiford
gcc/
*
10-24 Richard Sandiford
gcc/
* config/aarch64/aarch64.c (aarch64_vectorize_related_mode): New
function.
(aarch64_autovectorize_vector_modes): Also add V4HImode and V2SImode.
(TARGET_VECTORIZE_RELATED_MODE): Define.
gcc/testsuite/
* gcc.dg/vect/vect-outer
- there's no need to compute nunits_vectype if its element type is
the same as STMT_VINFO_VECTYPE's.
- it's useful to distinguish the nunits_vectype from the main vectype
in dump messages
- when reusing the existing STMT_VINFO_VECTYPE, it's useful to say so
in the dump, and sa
This patch adds AArch64 patterns for converting between 64-bit and
128-bit integer vectors, and makes the vectoriser and expand pass
use them.
2019-10-24 Richard Sandiford
gcc/
* tree-vect-stmts.c (vectorizable_conversion): Extend the
non-widening and non-narrowing path to
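For illustration, the loops below are the sort of element-wise integer conversions that pair a 64-bit vector (two 32-bit elements) with a 128-bit vector (two 64-bit elements). This is only a sketch under that assumption: the function names are hypothetical, and whether GCC actually vectorises such loops depends on the target and options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: widen each 32-bit element to 64 bits.
   With n == 2 the source fits a 64-bit vector and the destination
   a 128-bit vector.  */
void
widen_s32_to_s64 (int64_t *restrict dst, const int32_t *restrict src, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    dst[i] = src[i];
}

/* Hypothetical example: the opposite direction, narrowing each
   64-bit element to 32 bits.  Callers are assumed to pass values
   that fit in int32_t.  */
void
narrow_s64_to_s32 (int32_t *restrict dst, const int64_t *restrict src, int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    dst[i] = (int32_t) src[i];
}
```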
ssume the type is validated
elsewhere.
It seems a rather clunky fix, sorry, but restoring the
TYPE_MAIN_VARIANT (...) isn't compatible with the aka stuff.
Bootstrapped & regression-tested on aarch64-linux-gnu. OK to install?
Richard
2019-10-25 Richard Sandiford
gcc/cp/
Richard Biener writes:
> We have to check each operand for being in a pattern, not just the
> first when avoiding build from scalars (we could possibly handle
> the special case of some of them being the pattern stmt root, but
> that would be a followup improvement).
>
> Bootstrap & regtest runnin
Prathamesh Kulkarni writes:
> @@ -10288,6 +10261,23 @@ vectorizable_condition (stmt_vec_info stmt_info,
> gimple_stmt_iterator *gsi,
> vect_finish_stmt_generation (stmt_info, new_stmt, gsi);
> vec_compare = vec_compare_name;
> }
> +
> + if (
Coming back to this just in time for it not to be three months later,
sorry...
I still think it would be better to consolidate ifcvt a bit more,
rather than effectively duplicate bits of cond_move_process_if_block
in noce_convert_multiple_sets. But perhaps it was a historical
mistake to have two
Robin Dapp writes:
> This patch extracts a cc comparison from the initial compare/jump
> insn and allows it to be passed to noce_emit_cmove and
> emit_conditional_move.
> ---
> gcc/ifcvt.c | 68
> gcc/optabs.c | 7 --
> gcc/optabs.h | 2
Robin Dapp writes:
> This patch duplicates the previous noce_emit_cmove logic. First it
> passes the canonical comparison, emits the sequence and costs it.
> Then, a second, separate sequence is created by passing the cc compare
> we extracted before. The costs of both sequences are compared and
Robin Dapp writes:
> When then and else are reversed, we would swap new_val and old_val.
> The same has to be done for our new code paths.
> Also, emit_conditional_move may perform swapping. In case we need to
> swap, the cc comparison also needs to be swapped and for this we pass
> the reversed
Jeff Law writes:
> On 10/5/19 5:29 AM, Richard Sandiford wrote:
>>
>> Sure. This message is going to go to the other extreme, sorry, but I'm
>> not sure which part will be the most convincing (if any).
> No worries. Worst case going to the other extreme is I hav
77556.
Richard
2019-10-29 Richard Sandiford
gcc/
* config/aarch64/aarch64.c (aarch64_sve_cmp_immediate_p)
(aarch64_simd_shift_imm_p): Accept scalars as well as vectors.
* config/aarch64/predicates.md (aarch64_sve_cmp_vsc_immediate)
(aarch64_sve_cmp_vsd_immediate): Ac
-29 Richard Sandiford
gcc/
* config/aarch64/aarch64.md (FFR_REGNUM, FFRT_REGNUM): New constants.
* config/aarch64/aarch64.h (FIRST_PSEUDO_REGISTER): Bump to
FFRT_REGNUM + 1.
(FFR_REGS, PR_AND_FFR_REGS): New register classes.
(REG_CLASS_NAMES
This is tested by the main SVE ACLE patches, but since it affects
the evpc routines, it seemed worth splitting out.
Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r277562.
Richard
2019-10-29 Richard Sandiford
gcc/
* config/aarch64/aarch64-sve.md
E register save as a stack probe
too, and thus prevents the save from being shrink-wrapped if stack clash
protection is enabled.
The changelog describes the low-level details.
Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r277564.
Richard
2019-10-29 Richard S
e.
AIUI this is what the REDUC_IDX on the COND_EXPR now tells us.
Reverting that fixes ICEs in gcc.target/aarch64/sve/clastb*. Tested on
aarch64-linux-gnu (with and without SVE) and x86_64-linux-gnu.
Thanks,
Richard
2019-10-29 R
ater
during the analysis phase, e.g. because the target doesn't support a
particular vector operation.
This is needed to avoid regressions with a later patch.
2019-10-29 Richard Sandiford
gcc/
* tree-vect-slp.c (vect_contains_pattern_stmt_p): New function.
(vect_slp
cost issue though; see PR92265 for details.
2019-10-29 Richard Sandiford
gcc/
* tree-vectorizer.h (vect_get_vector_types_for_stmt): Take an
optional maximum nunits.
(get_vectype_for_scalar_type): Likewise. Also declare a form that
takes a
Richard Biener writes:
> On Tue, Oct 29, 2019 at 8:34 PM Jeff Law wrote:
>>
>> On 10/29/19 6:26 AM, John Paul Adrian Glaubitz wrote:
>> > Hello!
>> >
>> > We have raised $5000 to support anyone willing to work on this for the
>> > m68k target [1]. We really need the m68k to stay as it's essential
The series posted so far now shows how the hook would be used in practice.
Just wanted to follow up on some points here:
Richard Sandiford writes:
> Richard Biener writes:
>> On Wed, Oct 23, 2019 at 2:12 PM Richard Sandiford
>> wrote:
>>>
>>> Richard Biener w
Richard Biener writes:
> On Fri, Oct 25, 2019 at 2:37 PM Richard Sandiford
> wrote:
>>
>> This is another patch in the series to remove the assumption that
>> all modes involved in vectorisation have to be the same size.
>> Rather than have the target provide a lis
Thanks for implementing this.
Jakub Jelinek writes:
> On Wed, Oct 30, 2019 at 02:12:30PM +, Szabolcs Nagy wrote:
>> On 29/10/2019 17:15, Jakub Jelinek wrote:
>> > +void f03 (void);
>> > +#pragma omp declare variant (f03) match
>> > (device={kind(any),arch(x86_64),isa(avx512f,avx512bw)})
>> >
sted on aarch64-linux-gnu,
Richard
The SVE PCS support broke go, D and Ada because those languages don't
call TARGET_INIT_BUILTINS. We therefore ended up trying to get the
TYPE_MAIN_VARIANT of a null __SVBool_t.
We shouldn't really need to apply TYPE_MAIN_VARIANT there anyway,
since the ABI-defin
lable vectors.
I think the test probably predates support for variable-length
loop-aware SLP.
Tested on aarch64-linux-gnu and applied as r277681.
Richard
2019-10-31 Richard Sandiford
gcc/testsuite/
* gcc.target/aarch64/sve/reduc_strict_3.c: Split all but the
first function out
ly tests what's left in vcond_4.c,
but that too is OK, since the point of the test was to compare the
default handling of each comparison in vcond_4.c with the
-fno-trapping-math equivalent.
Tested on aarch64-linux-gnu and applied as r277682.
Richard
2019-10-31 Richard Sandiford
gc
This had been failing since a mass renaming. Noticed it a few times
before but somehow never got around to fixing it.
Tested on aarch64-linux-gnu and applied as r277683.
Richard
2019-10-31 Richard Sandiford
gcc/testsuite/
* g++.target/aarch64/sve/vcond_1_run.C: Update test name
"Andre Vieira (lists)" writes:
> Hi,
>
> After my patch I believe the only way orig_loop_vinfo is not null when
> calling vect_analyze_loop is when it is called for an epilogue and in
> that case we no longer use that variable, since
> LOOP_VINFO_ORIG_LOOP_INFO is already set for the epilogue's
recognise.
The brace indentation matches the surrounding style.
Tested on aarch64-linux-gnu. OK to install?
Richard
2019-11-04 Richard Sandiford
gcc/d/
* d-builtins.cc (build_frontend_type): Cope with variable
TYPE_VECTOR_SUBPARTS.
Index: gcc/d/d-b
This patch bridges the gap between the recent epilogue vectorisation
patches and https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01822.html .
I don't have any evidence that the series is independently useful,
but it shouldn't make things worse either.
Tested individually on aarch64-linux-gnu and as
"simdlen" in the new vect_epilogues condition. That should be
a separate change though.
This may conflict with Andre's fix for libgomp; I'll adjust if
that goes in first.
2019-11-04 Richard Sandiford
gcc/
* tree-vect-loop.c (vect_analyze_loop): Break out of the main
for correctness as well.
This can happen if the sizes returned by autovectorize_vector_sizes
happen to be out of order, e.g. because the target prefers smaller
vectors. It can also happen with later patches if two vectorisation
attempts happen to end up with the same VF.
2019-11-04 Richard
currences. All that matters is zero vs. nonzero.
2019-11-04 Richard Sandiford
gcc/testsuite/
* gcc.dg/vect/slp-9.c: Use scan-tree-dump rather than
scan-tree-dump-times.
* gcc.dg/vect/slp-widen-mult-s16.c: Likewise.
* gcc.dg/vect/slp-widen-mult-u8.c: Likewise.
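The difference between the two directives can be sketched as follows. This is an illustrative file, not the contents of any of the tests named above: the function `sum_products` and the pattern string are assumptions chosen for the example, and a real test would use only one of the two checks.

```c
#include <assert.h>

/* Hypothetical kernel; the real tests use their own loops.  */
int
sum_products (const short *a, const short *b, int n)
{
  assert (n >= 0);
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}

/* Exact-count form: fails unless the message appears exactly once.
   { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } }  */

/* Presence-only form: passes if the message appears at least once,
   which is all that matters when the check is zero vs. nonzero.
   { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } }  */
```

The presence-only form keeps passing if a later change makes the same message appear more often, for example because more than one vector mode is tried.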
With a later patch I saw a case in which we peeled a single iteration
for gaps but didn't need to peel further iterations to make up a full
vector. We then tried to vectorise the single-iteration epilogue.
2019-11-04 Richard Sandiford
gcc/
* tree-vect-loop.c (vect_analyze
Richard Biener writes:
> On Tue, Oct 29, 2019 at 6:05 PM Richard Sandiford
> wrote:
>>
>> The BB vectoriser picked vector types in the same way as the loop
>> vectoriser: it picked a vector mode/size for the region and then
>> based all the vector types off tha
This series adds a mode in which we try to vectorise loops once for
each supported vector mode combination and then pick the one with the
lowest cost. There are only really two patches for that: one to add the
feature and another to enable it by default for SVE. However, for it to
work as hoped,
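The selection step described here, trying each candidate and keeping the cheapest, can be sketched in isolation. This is a minimal sketch and not the vectoriser's code: `struct attempt`, its fields and `pick_cheapest` are hypothetical names, and the cost is an abstract number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one vectorisation attempt.  */
struct attempt
{
  const char *mode_name;  /* label only, for dumps */
  int ok;                 /* nonzero if analysis succeeded */
  unsigned cost;          /* estimated cost; lower is better */
};

/* Return the successful attempt with the lowest cost, or NULL if
   none succeeded.  Ties keep the earlier entry.  */
const struct attempt *
pick_cheapest (const struct attempt *attempts, size_t n)
{
  assert (attempts != NULL || n == 0);
  const struct attempt *best = NULL;
  for (size_t i = 0; i < n; i++)
    if (attempts[i].ok && (best == NULL || attempts[i].cost < best->cost))
      best = &attempts[i];
  return best;
}
```

Keeping the earlier entry on ties means the order of the candidate list still acts as a preference when costs are equal; that is a design choice of this sketch only.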
nt for the promotion and demotion costs; previously we gave
multiple copies the same cost as a single copy.
Later patches test this, but it seemed worth splitting out.
2019-11-05 Richard Sandiford
gcc/
* tree-vect-stmts.c (vect_model_promotion_demotion_cost): Take the
num
against either the scalar or vector costs.
Later patches test this, but it seemed worth splitting out.
2019-11-04 Richard Sandiford
gcc/
* tree-vect-stmts.c (vectorizable_assignment): Don't add a cost.
Index: g
know whether that's true once we've calculated what the runtime
threshold would be.
2019-11-04 Richard Sandiford
gcc/
* tree-vectorizer.h (vect_apply_runtime_profitability_check_p):
New function.
* tree-vect-loop-manip.c (vect_loop_versioning): Use it.
->simdlen over any larger or smaller VF, regardless of costs
or target preferences.
2019-11-05 Richard Sandiford
gcc/
* params.def (vect-compare-loop-costs): New param.
* doc/invoke.texi: Document it.
* tree-vectorizer.h (_loop_vec_info::vec_outside_c
We didn't take the cost of generating loop masks into account, and so
tended to underestimate the cost of loops that need multiple masks.
2019-11-05 Richard Sandiford
gcc/
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Include
the cost of generating loop masks.
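As a rough illustration of the accounting change (not the code in vect_estimate_min_profitable_iters), the sketch below adds a per-mask charge to a loop-body cost. The structure, the function names and the cost units are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost summary for one vectorised loop.  */
struct loop_cost
{
  unsigned body_cost;  /* cost of the vector statements */
  unsigned num_masks;  /* number of loop masks the body needs */
  unsigned mask_cost;  /* assumed cost of generating one mask */
};

/* Estimate that ignores mask generation, so loops needing several
   masks look as cheap as loops needing one.  */
unsigned
cost_ignoring_masks (const struct loop_cost *c)
{
  assert (c != NULL);
  return c->body_cost;
}

/* Estimate that charges for every mask the loop has to generate.  */
unsigned
cost_including_masks (const struct loop_cost *c)
{
  assert (c != NULL);
  return c->body_cost + c->num_masks * c->mask_cost;
}
```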
This patch enables vect-compare-loop-costs by default for SVE, both so
that we can compare SVE against Advanced SIMD and so that (with future
patches) we can compare multiple SVE vectorisation approaches against
each other.
I'll apply if the prerequisites are approved.
2019-11-05 Ri
Richard Biener writes:
> On Fri, Oct 25, 2019 at 2:41 PM Richard Sandiford
> wrote:
>>
>> Some callers of get_same_sized_vectype were dealing with operands that
>> are constant or defined externally, and so have no STMT_VINFO_VECTYPE
>> available.
Ping
Richard Sandiford writes:
> One of the changes in r277281 was to make the typedef variant
> handling in strip_typedefs pass the raw DECL_ORIGINAL_TYPE to the
> recursive call, instead of applying TYPE_MAIN_VARIANT first.
> This PR shows that that interacts badly with the impleme
Dimitar Dimitrov writes:
> On Sat, 2 Nov 2019, 19:28:38 EET Kwok Cheung Yeung wrote:
>> The AMD GCN architecture uses 64-bit pointers, but the scalar registers
>> are 32-bit wide, so pointers must reside in a pair of registers.
> ...
>> Bootstrapped on x86_64 and tested with no regressions, which
-gnu and the series as a whole on x86_64-linux-gnu.
2019-11-04 Richard Sandiford
gcc/
* tree-vect-stmts.c (vectorizable_call): Require the types
to have the same size.
Index: gcc/tree-vect-stmts.c
===
--- gcc/tree
trying the same
combination of vector modes multiple times. This patch adds a
check to prevent that.
As before: each patch tested individually on aarch64-linux-gnu and the
series as a whole on x86_64-linux-gnu.
2019-11-04 Richard Sandiford
gcc/
* tree-vectorizer.h (vec_info::mode_set
-04 Richard Sandiford
gcc/
* tree-vectorizer.h (can_duplicate_and_interleave_p): Take an
element type rather than an element mode.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
Use get_vectype_for_scalar_type to query the natural types
for a gi
Richard Biener writes:
> On Tue, Nov 5, 2019 at 9:25 PM Richard Sandiford
> wrote:
>>
>> Patch 12/n makes the AArch64 port add four entries to
>> autovectorize_vector_modes. Each entry describes a different
>> vector mode assignment for vector code that mixes 8-bit