On Thu, 14 Sep 2017, David Edelsohn wrote:
> * tree-ssa-sccvn.c (visit_phi): Merge undefined values similar
> to VN_TOP.
>
> This seems to have regressed
>
> FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
> "Read tp_first_run: 0" 2
> FAIL: gcc.dg/tree-prof/time-profiler-2.c
Hi Ian,
> On Thu, Sep 14, 2017 at 3:19 PM, Rainer Orth
> wrote:
>>
>>> I've committed a patch to libgo to upgrade it to the recent Go 1.9 release.
>>>
>>> As usual with these upgrades, the patch is too large to attach here.
>>> I've attached the changes to files that are more or less specific to
Ping.
On Friday, September 01 2017, I wrote:
> On Wednesday, August 23 2017, Pedro Alves wrote:
>
>> On 08/23/2017 05:17 AM, Sergio Durigan Junior wrote:
>>> Hi there,
>>>
>>> This is a series of two patches, one for GDB and one for GCC, which aims
>>> to improve the detection and handling of tr
On Thu, Sep 14, 2017 at 6:30 PM, Kugan Vivekanandarajah
wrote:
> This patch prevents the tree unroller from completely unrolling inner loops if that
> results in excessive strided-loads in outer loop.
Same comments from the RTL version.
Though one more comment here:
+ if (!INDIRECT_REF_P (op)
+
On Thu, Sep 14, 2017 at 6:33 PM, Kugan Vivekanandarajah
wrote:
> This patch adds aarch64_loop_unroll_adjust to limit partial unrolling
> in rtl based on strided-loads in loop.
Can you expand on this some more? Like give an example of where this
helps? I am trying to better understand your count
On Thu, Sep 14, 2017 at 6:28 PM, Kugan Vivekanandarajah
wrote:
> This patch adds number of hw prefetchers available to
> cpu_prefetch_tune so it can be used in loop unrolling decisions.
Can you explain the difference between this and num_slots
(PARAM_SIMULTANEOUS_PREFETCHES)? Because it seems li
This patch adds aarch64_loop_unroll_adjust to limit partial unrolling
in rtl based on strided-loads in loop.
Thanks,
Kugan
gcc/ChangeLog:
2017-09-12 Kugan Vivekanandarajah
* cfgloop.h (iv_analyze_biv): export.
* loop-iv.c: Likewise.
* config/aarch64/aarch64.c (strided_load_p): Ne
Change iv_analyze_result to take const_rtx. This is just to make the
next patch compile. No functional changes:
Thanks,
Kugan
gcc/ChangeLog:
2017-09-12 Kugan Vivekanandarajah
* cfgloop.h (iv_analyze_result): Change 2nd param from rtx to
const_rtx.
* df-core.c (df_find_def): Lik
This patch prevents the tree unroller from completely unrolling inner loops if that
results in excessive strided-loads in outer loop.
Thanks,
Kugan
gcc/ChangeLog:
2017-09-12 Kugan Vivekanandarajah
* config/aarch64/aarch64.c (count_mem_load_streams): New.
(aarch64_ok_to_unroll): New.
*
This patch adds number of hw prefetchers available to
cpu_prefetch_tune so it can be used in loop unrolling decisions.
Thanks,
Kugan
gcc/ChangeLog:
2017-09-12 Kugan Vivekanandarajah
* config/aarch64/aarch64-protos.h (struct cpu_prefetch_tune): Add
new field hw_prefetchers_avail.
This patch adds separate params for the rtl unroller so that they can be
tuned accordingly. Default values I have are based on some testing on
aarch64. I am happy to leave it as the current value and set them in
the back-end.
Thanks,
Kugan
gcc/ChangeLog:
2017-09-12 Kugan Vivekanandarajah
*
While loop unrolling helps to keep the pipeline busy in modern
processors, it also can increase the memory streams resulting in
collisions for the hardware prefetcher that can impact performance.
This patch series tries to detect this and limit the loop unrolling.
Patch 1: Add separate params for
On Thu, Sep 14, 2017 at 11:39:54AM -0500, Segher Boessenkool wrote:
> [ pressed send too early ]
>
> On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote:
> > --- gcc/config/rs6000/rs6000.c (revision 252029)
> > +++ gcc/config/rs6000/rs6000.c (working copy)
> > @@ -37807,6 +37807,1
On Thu, Sep 14, 2017 at 3:19 PM, Rainer Orth
wrote:
>
>> I've committed a patch to libgo to upgrade it to the recent Go 1.9 release.
>>
>> As usual with these upgrades, the patch is too large to attach here.
>> I've attached the changes to files that are more or less specific to
>> gccgo.
>>
>> Th
On Thu, Sep 14, 2017 at 09:54:14AM -0500, Segher Boessenkool wrote:
> On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote:
> > This patch adds support on PowerPC ISA 3.0 for the built-in function
> > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction
> > and
>
Hi Ian,
> I've committed a patch to libgo to upgrade it to the recent Go 1.9 release.
>
> As usual with these upgrades, the patch is too large to attach here.
> I've attached the changes to files that are more or less specific to
> gccgo.
>
> This upgrade required some changes to the gotools Makef
On Thu, Sep 14, 2017 at 10:32:12PM +0100, Pedro Alves wrote:
> On 09/14/2017 09:26 PM, Jakub Jelinek wrote:
> > +@item c++17
> > +@itemx c++1z
> > +The 2017 ISO C++ standard plus amendments.
> > +The name @samp{c++1z} is deprecated.
> > +
> > +@item gnu++17
> > +@itemx gnu++1z
> > +GNU dialect of @
On 09/14/2017 09:26 PM, Jakub Jelinek wrote:
> +@item c++17
> +@itemx c++1z
> +The 2017 ISO C++ standard plus amendments.
> +The name @samp{c++1z} is deprecated.
> +
> +@item gnu++17
> +@itemx gnu++1z
> +GNU dialect of @option{-std=c++17}.
> +The name @samp{gnu++17} is deprecated.
> @end table
I
On Thu, Sep 14, 2017 at 02:24:01PM -0700, Mike Stump wrote:
> > --- gcc/doc/invoke.texi.jj 2017-09-12 21:57:57.0 +0200
> > +++ gcc/doc/invoke.texi 2017-09-14 19:32:34.342959968 +0200
> > @@ -1870,15 +1870,15 @@ GNU dialect of @option{-std=c++14}.
> > This is the default for C++ code.
>
On Sep 14, 2017, at 1:26 PM, Jakub Jelinek wrote:
>
> Given https://herbsutter.com/2017/09/06/c17-is-formally-approved/
> this patch makes -std=c++17 and -std=gnu++17 the documented options
> --- gcc/doc/invoke.texi.jj 2017-09-12 21:57:57.0 +0200
> +++ gcc/doc/invoke.texi 2017-0
GCC maintainers:
Here is an updated patch to address the comment from Segher. The one
comment that was not addressed was:
>> +(define_insn "altivec_lvsl_reg"
>> + [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
>> + (unspec:V16QI
>> + [(match_operand:DI 1 "gpc_reg_operand
On Thu, 2017-09-14 at 09:38 -0500, Bill Schmidt wrote:
> On Sep 14, 2017, at 5:15 AM, Richard Biener
> wrote:
> >
> > On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt
> > wrote:
> >> On Sep 13, 2017, at 10:40 AM, Bill Schmidt
> >> wrote:
> >>>
> >>> On Sep 13, 2017, at 7:23 AM, Richard Biener
Hi!
For firstprivate vars, even when implicit, the privatized entity is
what the reference refers to; if its copy ctor or dtor need instantiation,
doing this at gimplification time is too late, therefore we should handle
it during genericization like we handle non-reference firstprivatized vars.
Hi!
When the expression replace_placeholders is called on contains
many SAVE_EXPRs that appear more than once in the tree, we hang walking them
over and over again, while it is sufficient to just walk it without
duplicates (not using cp_walk_tree_without_duplicates, because the callback
can cp_wal
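A minimal sketch of why walking without duplicates matters, assuming a toy binary tree in which subtrees can be shared the way SAVE_EXPRs are. The `struct node` type, the `visited` flag and both walker functions are hypothetical illustrations, not the C++ front end's actual walker or data structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node.  Both kids may point at the same shared subtree,
   which is the situation that makes a naive walk blow up.  */
struct node
{
  struct node *kid[2];
  bool visited;   /* Assumption: a per-node mark stands in for a visited set.  */
};

/* Naive walk: returns the number of node visits, revisiting every shared
   subtree once per reference to it.  */
unsigned long
walk_naive (const struct node *n)
{
  if (n == NULL)
    return 0;
  return 1 + walk_naive (n->kid[0]) + walk_naive (n->kid[1]);
}

/* Walk without duplicates: each node is visited at most once.  */
unsigned long
walk_once (struct node *n)
{
  if (n == NULL || n->visited)
    return 0;
  n->visited = true;
  return 1 + walk_once (n->kid[0]) + walk_once (n->kid[1]);
}
```

With a chain of k nodes in which each node references the next one twice, the naive walk makes 2^k - 1 visits while the marked walk makes k.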
I realized there was no test on the noexcept qualification of the move
constructor with allocator.
I added some and found out that patch was missing a noexcept
qualification at _Rb_tree level.
Here is the updated patch, fully tested. OK to commit?
François
On 13/09/2017 21:57, François Dum
On Thu, Sep 14, 2017 at 07:34:14PM +0000, de Vries, Tom wrote:
> --- a/libgomp/testsuite/libgomp.c++/c++.exp
> +++ b/libgomp/testsuite/libgomp.c++/c++.exp
> @@ -22,6 +22,11 @@ dg-init
> # Turn on OpenMP.
> lappend ALWAYS_CFLAGS "additional_flags=-fopenmp"
>
> +# Switch into C++ mode. Otherwis
> I know we don't have
> libgomp.c-c++-common (maybe we should add that)
Like so?
Ran:
- make check-target-libgomp RUNTESTFLAGS=c.exp=cancel-taskgroup-1.c
- make check-target-libgomp RUNTESTFLAGS=c++.exp=cancel-taskgroup-1.c
Currently running make check-target-libgomp.
OK for trunk if tests pa
Calls to gcc_jit_context_get_builtin_function that accessed builtins
in sanitizer.def and after (or failed to match any builtin) led to
a crash accessing a NULL builtin name.
The entries with the NULL name came from these lines in sanitizer.def:
/* This has to come before all the sanitizer buil
On Wed, Sep 13, 2017 at 06:08:45PM -0500, Segher Boessenkool wrote:
> On Tue, Sep 12, 2017 at 07:17:07PM -0400, Michael Meissner wrote:
> > On Tue, Sep 12, 2017 at 05:41:34PM -0500, Segher Boessenkool wrote:
> > > This needs "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT" I think? Which
> > > is the sa
On Wed, Sep 13, 2017 at 10:49:43PM +0000, Joseph Myers wrote:
> On Wed, 13 Sep 2017, Michael Meissner wrote:
>
> > This patch adds support on PowerPC ISA 3.0 for the built-in function
> > __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction
> > and
> > the built-in function
Hi Richard,
Is it OK to throw a hard error for this? Maybe the rules are different
from C and C++, but normally we can't do that for code that's only
invalid if executed. An unconditional warning would be good though.
I can also issue an unconditional warning; this will even simplify
the cod
On Thu, 2017-09-14 at 11:53 -0600, Jeff Law wrote:
>
>
> And I think that's starting to zero in on the problem --
> WORD_REGISTER_OPERATIONS is zero on aarch64 as you don't get extension
> to word_mode for W form registers.
>
> I wonder if what needs to happen is somehow look to extend that code
On 09/14/2017 10:33 AM, Steve Ellcey wrote:
> On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
>> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
>>>
>>> In arm32 rtl expansion, when reading the QI memory location, I see
>>> these instructions get generated:
>>>
>>> (insn 10 3 11 2 (set (reg:SI 119
On 09/14/2017 10:33 AM, Steve Ellcey wrote:
> On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
>> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
>>>
>>> In arm32 rtl expansion, when reading the QI memory location, I see
>>> these instructions get generated:
>>>
>>> (insn 10 3 11 2 (set (reg:SI 119
On Thu, Sep 14, 2017 at 11:53:02AM -0500, Pat Haugen wrote:
> On 09/14/2017 11:35 AM, Segher Boessenkool wrote:
> > On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote:
> >> --- gcc/config/rs6000/rs6000.c (revision 252029)
> >> +++ gcc/config/rs6000/rs6000.c (working copy)
> >> @@ -3
On 09/14/2017 02:01 AM, Pierre-Marie de Rodat wrote:
> Hello,
>
> This commit adds comments to fields in the cgraph_thunk_info structure
> declaration from cgraph.h. They will hopefully answer questions that
> people like myself can ask while discovering the thunk machinery. I
> also made an asse
I've committed a patch to libgo to upgrade it to the recent Go 1.9 release.
As usual with these upgrades, the patch is too large to attach here.
I've attached the changes to files that are more or less specific to
gccgo.
This upgrade required some changes to the gotools Makefile. And one
test ha
On 09/13/2017 01:19 PM, Richard Sandiford wrote:
> This also seemed like a good opportunity to reverse the sense of the
> hook to "can", to avoid the awkward double negative in !CANNOT.
Yea. The double-negatives can sometimes make code hard to read.
>
> Tested on aarch64-linux-gnu, x86_64-linux
[ pressed send too early ]
On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote:
> --- gcc/config/rs6000/rs6000.c (revision 252029)
> +++ gcc/config/rs6000/rs6000.c (working copy)
> @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r
> add_to_hard_reg_set (&s
On 09/13/2017 01:21 PM, Richard Sandiford wrote:
> I'm not sure the documentation is correct that outprec is always less
> than inprec, and each non-default implementation tested for the case
> in which it wasn't, but the patch leaves it as-is.
While the non-default implementations may always test
On 09/14/2017 11:35 AM, Segher Boessenkool wrote:
> On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote:
>> --- gcc/config/rs6000/rs6000.c (revision 252029)
>> +++ gcc/config/rs6000/rs6000.c (working copy)
>> @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r
>>
* tree-ssa-sccvn.c (visit_phi): Merge undefined values similar
to VN_TOP.
This seems to have regressed
FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 2
FAIL: gcc.dg/tree-prof/time-profiler-2.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1
FAIL:
>
> Well, it's of course the poor-mans solution compared to providing our own
> ifunc-enabled libm ...
One benefit here would be that we could have our own calling convention for
this. So for floor/ceil we may just declare registers to be preserved (as
they are on all modern AVX enabled cpus) wh
On 09/13/2017 01:22 PM, Richard Sandiford wrote:
> Nice and easy, one definition and one use :-)
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory. OK to install?
>
> Ri
On Thu, Sep 14, 2017 at 10:18:55AM -0500, Pat Haugen wrote:
> --- gcc/config/rs6000/rs6000.c (revision 252029)
> +++ gcc/config/rs6000/rs6000.c (working copy)
> @@ -37807,6 +37807,11 @@ rs6000_set_up_by_prologue (struct hard_r
> add_to_hard_reg_set (&set->set, Pmode, RS6000_PIC_O
On Thu, 2017-09-14 at 09:03 -0600, Jeff Law wrote:
> On 09/13/2017 03:46 PM, Steve Ellcey wrote:
> >
> > In arm32 rtl expansion, when reading the QI memory location, I see
> > these instructions get generated:
> >
> > (insn 10 3 11 2 (set (reg:SI 119)
> > (zero_extend:SI (mem:QI (reg/v/f:
On 09/04/17 10:07, Bernd Edlinger wrote:
> Hi,
>
> as you know we have a -Wcast-align warning which works only for
> STRICT_ALIGNMENT targets. But occasionally it would be nice to be
> able to switch this warning on even for other targets.
>
> Therefore I would like to add a strict version of th
Hi Carl,
On Wed, Sep 13, 2017 at 04:29:01PM -0700, Carl Love wrote:
> -- add "TARGET_SF_FPR && TARGET_FPRND" to the define_insn "lrintsfsi2"
> as mentioned it was missing on the original define_insn for fctiw.
I don't think TARGET_FPRND is correct: this instruction is in the original
PowerPC spec
Revision 235876 inadvertently caused the TOC reg to be marked as set up
in prologue, which prevents shrink-wrapping from moving the prologue
past a TOC reference. The following patch corrects the situation.
Bootstrap/regtest on powerpc64le-linux and powerpc64-linux(-m32/-m64)
with no new regressio
On 09/13/2017 03:46 PM, Steve Ellcey wrote:
> On Wed, 2017-09-13 at 14:46 -0500, Segher Boessenkool wrote:
>> On Wed, Sep 13, 2017 at 06:13:50PM +0100, Kyrill Tkachov wrote:
>>>
>>> We are usually hesitant to add explicit subreg matching in the MD pattern
>>> (though I don't remember if there's a
Hi,
The current pcom implementation rewrites into lcssa form after all loops are
transformed. This is not enough, because unrolling of a later loop checks
lcssa form in function tree_transform_and_unroll_loop.
This simple patch rewrites loop into lcssa form if store-store chain is
handled. I think it
On Wed, Sep 13, 2017 at 05:46:00PM -0400, Michael Meissner wrote:
> This patch adds support on PowerPC ISA 3.0 for the built-in function
> __builtin_sqrtf128 generating the XSSQRTQP hardware square root instruction
> and
> the built-in function __builtin_fmaf128 generating XSMADDQP, XSMSUBQP,
> XS
On Sep 14, 2017, at 5:15 AM, Richard Biener wrote:
>
> On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt
> wrote:
>> On Sep 13, 2017, at 10:40 AM, Bill Schmidt
>> wrote:
>>>
>>> On Sep 13, 2017, at 7:23 AM, Richard Biener
>>> wrote:
On Tue, Sep 12, 2017 at 11:08 PM, Will Schmidt
>>>
On Thu, Sep 14, 2017 at 4:05 PM, Richard Sandiford
wrote:
> Richard Biener writes:
>> On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford
>> wrote:
>>> This patch adds a helper function for getting the number of
>>> bytes accessed by a scalar data reference, which helps when general
>>> modes hav
Richard Biener writes:
> On Thu, Sep 14, 2017 at 1:13 PM, Richard Sandiford
> wrote:
>> Previously VECTOR_CST_NELTS (t) read the number of elements from
>> TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling
>> this with variable TYPE_VECTOR_SUBPARTS: either forcibly convert th
Richard Biener writes:
> On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford
> wrote:
>> This patch adds a helper function for getting the number of
>> bytes accessed by a scalar data reference, which helps when general
>> modes have a variable size.
>>
>> Tested on aarch64-linux-gnu, x86_64-linux
Hi,
I missed a target lp64 require for the fold-vec-ld-longlong.c test.
I'm now wearing my cone of shame. :-(
Committing as trivial, momentarily.
Thanks,
-Will
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c
b/gcc/testsuite/gcc.target/powerpc/fold-vec-ld-longlong.c
in
On Thu, Sep 14, 2017 at 1:25 PM, Richard Sandiford
wrote:
> This patch makes the vectoriser use the gimple-fold.h routines
> in more cases, instead of vect_init_vector. Later patches want
> to use the same interface to handle variable-length vectors.
>
> Tested on aarch64-linux-gnu, x86_64-linux-
On Thu, Sep 14, 2017 at 1:24 PM, Richard Sandiford
wrote:
> Epilogue vectorisation uses the vectorisation factor of the main loop
> as the maximum vectorisation factor allowed for correctness. That makes
> sense as a conservatively correct value, since the chosen vectorisation
> factor will be st
On Thu, Sep 14, 2017 at 1:23 PM, Richard Sandiford
wrote:
> This patch adds a helper function for getting the number of
> bytes accessed by a scalar data reference, which helps when general
> modes have a variable size.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
>
On Thu, Sep 14, 2017 at 1:22 PM, Richard Sandiford
wrote:
> The vectoriser sometimes considers lowering "vector" operations into N
> scalar word operations. This N needs to be fixed at compile time, so
> the condition guarding it needs to change when variable-length vectors
> are added. This patc
On Thu, Sep 14, 2017 at 1:22 PM, Richard Sandiford
wrote:
> This patch adds a vectoriser helper routine to calculate how
> many copies of a vector statement we need. At present this
> is always:
>
> LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS (vectype)
>
> but later patches add o
On Thu, Sep 14, 2017 at 1:20 PM, Richard Sandiford
wrote:
> This patch adds gimple-fold.h equivalents of build_vector and
> build_vector_from_val. Like the other gimple-fold.h routines
> they always return a valid gimple value and add any new
> statements to a given gimple_seq. In combination wi
On Thu, Sep 14, 2017 at 1:20 PM, Richard Sandiford
wrote:
> This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new
> typedefs vec_perm_indices and auto_vec_perm_indices. There are two
> reasons for doing this for SVE:
>
> (1) it means that the number of elements is bundled with the el
On Thu, Sep 14, 2017 at 1:14 PM, Richard Sandiford
wrote:
> This patch makes build_vector take the elements as a vec<> rather
> than a tree *. This is useful for SVE because it bundles the number
> of elements with the elements themselves, and enforces the fact that
> the number is constant. Als
On Thu, Sep 14, 2017 at 1:13 PM, Richard Sandiford
wrote:
> Previously VECTOR_CST_NELTS (t) read the number of elements from
> TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling
> this with variable TYPE_VECTOR_SUBPARTS: either forcibly convert the
> number to a constant (which
On 2017.09.14 at 14:36 +0200, Jakub Jelinek wrote:
> On Thu, Sep 14, 2017 at 12:10:50PM +0000, Shalnov, Sergey wrote:
> > GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead
> > of 256-bit AVX registers in the auto-vectorizer.
>
> > This patch enables the command line option "
On Thu, Sep 14, 2017 at 12:10:50PM +0000, Shalnov, Sergey wrote:
> GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead
> of 256-bit AVX registers in the auto-vectorizer.
> This patch enables the command line option "mprefer-avx256" that reduces
> 512-bit registers usage in "ma
PING^1
On 08/30/2017 11:45 AM, Martin Liška wrote:
> Hi.
>
> This is a follow-up which I've just noticed. The main problem we have is that
> an instrumented compiler w/ -fprofile-generate (built in $OBJDIR/gcc
> subfolder)
> will generate all *.gcda files in the same dir as *.o files. That's problematic
Hello.
As mentioned at Cauldron 2017, the second step in switch lowering should be a massive
simplification of the code that does expansion of the balanced tree. Basically it
includes
VRP and DCE, which we can for obvious reasons do on our own.
The patch does that, and introduces a separate pass for -O0 that's
Hi,
GCC has the option "mprefer-avx128" to use 128-bit AVX registers instead of
256-bit AVX registers in the auto-vectorizer.
This patch enables the command line option "mprefer-avx256" that reduces
512-bit registers usage in "march=skylake-avx512" mode.
This is the initial implementation of the
On Thu, 7 Sep 2017, Richard Biener wrote:
> On Thu, 7 Sep 2017, Richard Biener wrote:
>
> >
> > This enhances VN to do the same PHI handling as CCP, meeting
> > undefined and constant to constant. I've gone a little bit
> > further (and maybe will revisit this again) in also meeting
> > all-und
Hi,
This patch adds options -march=/-mtune=knm for Knights Mill.
2017-09-14 Sebastian Peryt
gcc/
* config.gcc: Support "knm".
* config/i386/driver-i386.c (host_detect_local_cpu): Detect "knm".
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCES
This patch makes the vectoriser use the gimple-fold.h routines
in more cases, instead of vect_init_vector. Later patches want
to use the same interface to handle variable-length vectors.
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?
Richard
2017-09-14
Epilogue vectorisation uses the vectorisation factor of the main loop
as the maximum vectorisation factor allowed for correctness. That makes
sense as a conservatively correct value, since the chosen vectorisation
factor will be strictly less than that anyway. However, once the VF
itself becomes
This patch adds a helper function for getting the number of
bytes accessed by a scalar data reference, which helps when general
modes have a variable size.
Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
OK to install?
Richard
2017-09-14 Richard Sandiford
The vectoriser sometimes considers lowering "vector" operations into N
scalar word operations. This N needs to be fixed at compile time, so
the condition guarding it needs to change when variable-length vectors
are added. This patch puts the condition into a helper routine so that
there's only one
This patch adds a vectoriser helper routine to calculate how
many copies of a vector statement we need. At present this
is always:
LOOP_VINFO_VECT_FACTOR (loop_vinfo) / TYPE_VECTOR_SUBPARTS (vectype)
but later patches add other cases. Another benefit of using
a helper routine is that it can a
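The formula quoted above can be sketched as a small standalone helper. This is only an illustration under stated assumptions: plain unsigned integers stand in for LOOP_VINFO_VECT_FACTOR and TYPE_VECTOR_SUBPARTS, and the function name `ncopies_for_vectype` is hypothetical, not the name used in the patch:

```c
#include <assert.h>

/* Hypothetical stand-in for the vectoriser helper.  VF plays the role of
   LOOP_VINFO_VECT_FACTOR (loop_vinfo) and NUNITS the role of
   TYPE_VECTOR_SUBPARTS (vectype).  Returns how many copies of a vector
   statement are needed to cover one vectorised iteration.  */
unsigned int
ncopies_for_vectype (unsigned int vf, unsigned int nunits)
{
  /* Assumption: the vectorisation factor is an exact multiple of the
     number of elements per vector, so the division is lossless.  */
  assert (nunits != 0 && vf % nunits == 0);
  return vf / nunits;
}
```

For example, a vectorisation factor of 8 with 4 elements per vector gives 2 copies. Keeping the division in one place means any sanity checking is written once rather than at every use.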
This patch adds gimple-fold.h equivalents of build_vector and
build_vector_from_val. Like the other gimple-fold.h routines
they always return a valid gimple value and add any new
statements to a given gimple_seq. In combination with later
patches this reduces the number of force_gimple_operands.
This patch makes can_vec_perm_p & co. take a vec<>, wrapped in new
typedefs vec_perm_indices and auto_vec_perm_indices. There are two
reasons for doing this for SVE:
(1) it means that the number of elements is bundled with the elements
themselves, and is obviously constant.
(2) it makes it e
This patch makes build_vector take the elements as a vec<> rather
than a tree *. This is useful for SVE because it bundles the number
of elements with the elements themselves, and enforces the fact that
the number is constant. Also, I think things like the folds can be used
with any generic GNU v
Previously VECTOR_CST_NELTS (t) read the number of elements from
TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)). There were two ways of handling
this with variable TYPE_VECTOR_SUBPARTS: either forcibly convert the
number to a constant (which is doable) or store the number directly
in the VECTOR_CST. The la
On Mon, Aug 14, 2017 at 10:25:22AM +0200, Tom de Vries wrote:
> 2017-08-14 Tom de Vries
>
> PR c/81844
Please use PR c/81875 instead, now that you've filed it.
> * c-parser.c (c_parser_omp_for_loop): Fix condition folding.
Fold only operands of cond, not cond itself.
?
> *
On Wed, Sep 13, 2017 at 10:14 PM, Bill Schmidt
wrote:
> On Sep 13, 2017, at 10:40 AM, Bill Schmidt
> wrote:
>>
>> On Sep 13, 2017, at 7:23 AM, Richard Biener
>> wrote:
>>>
>>> On Tue, Sep 12, 2017 at 11:08 PM, Will Schmidt
>>> wrote:
Hi,
[PATCH, rs6000] [v2] Folding of vector l
On Wed, Sep 13, 2017 at 7:34 PM, Martin Jambor wrote:
> Hello,
>
> I apologize for not coming back to this, I keep on getting distracted.
> Anyway...
>
> On Tue, Aug 15, 2017 at 02:20:55PM +0000, Joseph Myers wrote:
>> On Tue, 15 Aug 2017, Martin Jambor wrote:
>>
>> > I am not sure what to do abou
Hi Richard,
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> target
> sparc*-*-* xfail ilp32 } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {
> target
> sparc*-*-* } } } */
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "ve
On 08/10/2017 09:43 PM, Jason Merrill wrote:
> On 07/14/2017 01:35 AM, Martin Liška wrote:
>> On 05/01/2017 09:13 PM, Jason Merrill wrote:
>>> On Wed, Apr 26, 2017 at 6:58 AM, Martin Liška wrote:
On 04/25/2017 01:58 PM, Jakub Jelinek wrote:
> On Tue, Apr 25, 2017 at 01:48:05PM +0200, Mart
Hi all,
This patch generalizes the formation of LDP/STP that require a base
register.
Previously, we would only accept address pairs that were ordered in
ascending or descending order, and only strictly sequential loads/stores.
This patch improves that by allowing us to accept all orders of
On Thu, 14 Sep 2017, Rainer Orth wrote:
> Since
>
> 2017-06-02 Richard Biener
>
> * tree-vect-loop.c (vect_analyze_loop_operations): Not relevant
> PHIs are ok.
> * tree-vect-stmts.c (process_use): Do not mark backedge defs
> for inductions as relevant.
>
> gc
Since
2017-06-02 Richard Biener
* tree-vect-loop.c (vect_analyze_loop_operations): Not relevant
PHIs are ok.
* tree-vect-stmts.c (process_use): Do not mark backedge defs
for inductions as relevant.
gcc.dg/vect/vect-multitypes-12.c XPASSes on 32-bit SPARC:
XPAS
The function contains these lines:
if (debug_column_info)
fprint_ul (asm_out_file, column);
else
putc ('0', asm_out_file);
but they are dominated by:
if (!debug_column_info)
column = 0;
Bootstrapped/regtested on x86_64-suse-linux, applied on mainline as obviou
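The redundancy described above can be checked with a small standalone sketch. It assumes `snprintf` into a buffer as a stand-in for `fprint_ul` and `putc` on asm_out_file; the three function names are hypothetical and do not come from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Original shape: the if/else is dominated by the clamp to zero.  */
int
format_column_original (char *buf, size_t size, bool debug_column_info,
                        unsigned long column)
{
  if (!debug_column_info)
    column = 0;

  if (debug_column_info)
    return snprintf (buf, size, "%lu", column);
  else
    return snprintf (buf, size, "0");
}

/* Simplified shape: once COLUMN has been clamped, printing it
   unconditionally produces the same text in both cases.  */
int
format_column_simplified (char *buf, size_t size, bool debug_column_info,
                          unsigned long column)
{
  if (!debug_column_info)
    column = 0;
  return snprintf (buf, size, "%lu", column);
}

/* Returns true when both shapes produce identical output.  */
bool
same_output (bool debug_column_info, unsigned long column)
{
  char a[32], b[32];
  format_column_original (a, sizeof a, debug_column_info, column);
  format_column_simplified (b, sizeof b, debug_column_info, column);
  return strcmp (a, b) == 0;
}
```

Because the else branch is only reached when the column has already been set to 0, dropping the conditional does not change the emitted text.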
Hi!
While debugging this function I've noticed way too many formatting issues
and fixed them, committed as obvious to trunk:
2017-09-14 Jakub Jelinek
* combine.c (make_compound_operation_int): Formatting fixes.
--- gcc/combine.c.jj 2017-09-12 21:58:06.0 +0200
+++ gcc/combi
Hello,
This commit adds comments to fields in the cgraph_thunk_info structure
declaration from cgraph.h. They will hopefully answer questions that
people like myself can ask while discovering the thunk machinery. I
also made an assertion stricter in cgraph_node::create_thunk.
I'm adding Nathan i
On Wed, Sep 13, 2017 at 04:20:32PM -0700, Cesar Philippidis wrote:
> 2017-09-13 Cesar Philippidis
>
> gcc/
> * omp-offload.c (oacc_xform_loop): Enable SIMD vectorization on
> non-SIMT targets in acc vector loops.
Ok, thanks.
Jakub
94 matches