Hi Richard,
Here's the updated patch, thanks for the feedback so far!
Regards,
Tamar
From: Richard Sandiford
Sent: Thursday, June 8, 2017 11:32:07 AM
To: Tamar Christina
Cc: GCC Patches; nd; James Greenhalgh; Marcus Shawcroft; Richard Earnshaw
Subject: Re
Hi All,
Updating this patch with the feedback I've received from patch 1/4.
Thanks,
Tamar
From: gcc-patches-ow...@gcc.gnu.org on behalf
of Tamar Christina
Sent: Wednesday, June 7, 2017 12:38:37 PM
To: GCC Patches
Cc: nd; James Greenhalgh; Marcus Shawcro
On 9 June 2017 at 17:48, Richard Biener wrote:
> On June 9, 2017 5:32:10 PM GMT+02:00, Christophe Lyon
> wrote:
>>On 8 June 2017 at 15:49, Richard Biener wrote:
>>> On Thu, 8 Jun 2017, Richard Biener wrote:
>>>
The following fixes unsafe vectorization of reductions in outer loop
On 12.06.2017 08:30, Pitchumani Sivanupandi wrote:
On Friday 09 June 2017 03:59 PM, Georg-Johann Lay wrote:
Hi,
This patch adds support for devices that can access flash memory
by LD* instructions, hence there is no need to put .rodata in RAM.
The default linker script for the new multilib ver
On Mon, 12 Jun 2017, Christophe Lyon wrote:
> On 9 June 2017 at 17:48, Richard Biener wrote:
> > On June 9, 2017 5:32:10 PM GMT+02:00, Christophe Lyon
> > wrote:
> >>On 8 June 2017 at 15:49, Richard Biener wrote:
> >>> On Thu, 8 Jun 2017, Richard Biener wrote:
> >>>
>
> The following
Hello.
Sorry for this breakage; it's actually the same mistake I made in the PR
that belongs to the test I broke. I overlooked the ICE in the log file.
I'm testing the patch, may I install it after it survives regression
tests?
Martin
From 20e5419136ec26ed009ca93eedccd2582b65dd36 Mon Sep 17 00:00:00
Hi All,
this patch implements an optimization rewriting
x * copysign (1.0, y) and
x * copysign (-1.0, y)
to:
x ^ (y & (1 << sign_bit_position))
This is done by creating a special builtin during matching and generating the
appropriate instructions during expand. This new builtin is called XORSIG
Hi All,
this patch implements an optimization rewriting
x * copysign (1.0, y) and
x * copysign (-1.0, y)
to:
x ^ (y & (1 << sign_bit_position))
The patch provides AArch64 optabs for XORSIGN, both vectorized and scalar.
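The bitwise form above can be sketched in plain C for the copysign (1.0, y) case. This is only an illustration under stated assumptions, not the code the patch generates: the function name xorsign is hypothetical, and a 64-bit IEEE double with the sign in bit 63 is assumed.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>   /* uint64_t, UINT64_C */
#include <string.h>   /* memcpy */

/* Assumption: double is a 64-bit IEEE type with the sign in bit 63. */
static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");

/* Hypothetical helper: x ^ (y & (1 << sign_bit_position)) on the raw bits.
   For non-NaN inputs this gives the same value as x * copysign (1.0, y). */
double xorsign(double x, double y)
{
    const uint64_t sign_mask = UINT64_C(1) << 63;
    uint64_t xb, yb;

    memcpy(&xb, &x, sizeof xb);   /* reinterpret without aliasing problems */
    memcpy(&yb, &y, sizeof yb);
    xb ^= yb & sign_mask;         /* flip x's sign iff y's sign bit is set */
    memcpy(&x, &xb, sizeof x);
    return x;
}
```

For example, xorsign (2.0, -3.0) gives -2.0 and xorsign (-2.0, -3.0) gives 2.0, matching the multiplication by copysign (1.0, y).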
This patch is a revival of a previous patch
https://gcc.gnu.org/ml/gcc-pa
I am testing the following to fix PR81053.
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
Richard.
2017-06-12 Richard Biener
PR tree-optimization/81053
* tree-vect-loop.c (vect_is_simple_reduction): Handle PHI
with backedge value not defined in loop. Sim
Tom de Vries writes:
> [ attached patch ]
>
> On 06/10/2017 09:57 AM, Tom de Vries wrote:
>> Hi,
>>
>> one thing that has bothered me on a regular basis is the inability to
>> spread long dejagnu directives over multiple lines.
>>
>> I've written a demonstrator patch (for the dejagnu sources) a
On Sat, Jun 10, 2017 at 11:06 AM, Richard Sandiford
wrote:
> Another one sorry, but:
>
> Bin Cheng writes:
>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>> index af874e7..98caa5e 100644
>> --- a/gcc/tree-vect-loop.c
>> +++ b/gcc/tree-vect-loop.c
>> @@ -2214,6 +2214,36 @@ start_over:
"Bin.Cheng" writes:
> On Sat, Jun 10, 2017 at 10:40 AM, Richard Sandiford
> wrote:
>> Sorry to return to this old patch, but:
>>
>> Bin Cheng writes:
>>> -/* Calculate the number of iterations under which scalar loop will be
>>> - preferred than vectorized loop. NITERS_PROLOG is the number of
>>
On Mon, Jun 12, 2017 at 9:19 AM, Richard Sandiford
wrote:
> "Bin.Cheng" writes:
>> On Sat, Jun 10, 2017 at 10:40 AM, Richard Sandiford
>> wrote:
>>> Sorry to return to this old patch, but:
>>>
>>> Bin Cheng writes:
-/* Calculate the number of iterations under which scalar loop will be
-
On Mon, 12 Jun 2017, Bin.Cheng wrote:
> On Mon, Jun 12, 2017 at 9:19 AM, Richard Sandiford
> wrote:
> > "Bin.Cheng" writes:
> >> On Sat, Jun 10, 2017 at 10:40 AM, Richard Sandiford
> >> wrote:
> >>> Sorry to return to this old patch, but:
> >>>
> >>> Bin Cheng writes:
> -/* Calculate the num
On Mon, 12 Jun 2017, Tamar Christina wrote:
> Hi All,
>
> this patch implements an optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
>
> This is done by creating a special builtin during matching and generating the
> ap
Ping?
Best regards,
Thomas
On 06/06/17 11:12, Thomas Preudhomme wrote:
On 09/05/17 23:36, Jan Hubicka wrote:
Ping?
Sorry for late reply
My turn to apologize now.
Hi,
This patch fixes an assert failure when linking one LTOed object file
having a weak alias with a regular object file cont
> Hello.
>
> Sorry for this breakage; it's actually the same mistake I made in the PR
> that belongs to the test I broke. I overlooked the ICE in the log file.
>
> I'm testing the patch, may I install it after it survives regression
> tests?
OK,
thanks!
Honza
>
> Martin
> >From 20e5419136ec26ed009c
> 2017-05-24 Richard Sandiford
>
> gcc/
> * combine.c (make_field_assignment): Check len rather than the mode
> precision when calling force_to_mode.
OK for mainline.
--
Eric Botcazou
Building a powerpc-wrs-vxworks compiler with a very recent mainline
fails with numerous instances of errors like:
In file included from
/powerpc-wrs-vxworks/sys-include/types/vxTypesOld.h:123:0,
from /gcc/include-fixed/stdint.h:16,
from
/powerpc-wrs-vxworks/sys-i
> I do not see a direct gen_return happening in function.c in the gcc-7
> branch.
>
> Is it somewhere else?
There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS.
So the code generated for your testcase is less optimized with GCC 7 and later
than with GCC 6 and earlier?
--
Er
On Fri, 2017-05-12 20:14:23 +0100, Graham Markall
wrote:
> Since the combine pass canonicalises shift-add insns using plus and
> ashift (as opposed to plus and mult which it previously used to do), it
> no longer creates *add_n or *sub_n insns, as the patterns match plus and
> mult only. The outc
Hi Segher!
On Tue, 2017-06-06 15:56:17 +, Segher Boessenkool
wrote:
> Since rs6000 no longer supports SPE, TARGET_FPRS now always is true.
>
> This makes TARGET_{SF,DF}_SPE always false. Many patterns in spe.md
> can now be deleted; which makes it possible to merge e.g. negdd2 with
> *negd
Hi Honza & Christophe,
I have tested your suggested fix. It does fix the regression.
Here is a simple patch for it.
After r249013, die () and dump_stack () are both in the cold section. This makes
the compiler generate a bl instruction for the function call, instead of
honoring the -mlong-calls option
> Hi Honza & Christophe,
>
> I have tested your suggested fix. It does fix the regression.
> Here is a simple patch for it.
>
> After r249013, die () and dump_stack () are both in the cold section. This makes
> the compiler generate a bl instruction for the function call, instead of
> honoring the -mlo
On Sun, Jun 11, 2017 at 07:38:04PM -0700, Ian Lance Taylor wrote:
> On Sun, Jun 11, 2017 at 4:40 AM, Segher Boessenkool
> wrote:
> >
> > The new split-1.c testcase fails on targets that do not support split
> > stack (like 32-bit PowerPC Linux). This patch fixes it by only running
> > the testcas
Hi!
On Mon, Jun 12, 2017 at 12:01:34PM +0200, Jan-Benedict Glaw wrote:
> On Tue, 2017-06-06 15:56:17 +, Segher Boessenkool
> wrote:
> > Since rs6000 no longer supports SPE, TARGET_FPRS now always is true.
> >
> > This makes TARGET_{SF,DF}_SPE always false. Many patterns in spe.md
> > can n
On Fri, Jun 09, 2017 at 01:03:34PM -0700, Jim Wilson wrote:
> # Arch Matches
> Index: gcc/doc/invoke.texi
> ===
> --- gcc/doc/invoke.texi (revision 249025)
> +++ gcc/doc/invoke.texi (working copy)
> @@ -13983,8 +13983,8 @@
2017-06-12 11:40 GMT+04:00 Georg-Johann Lay :
> On 12.06.2017 08:30, Pitchumani Sivanupandi wrote:
>>
>> On Friday 09 June 2017 03:59 PM, Georg-Johann Lay wrote:
>>>
>>> Hi,
>>>
>>> This patch adds support for devices that can access flash memory
>>> by LD* instructions, hence there is no need to p
Currently the FP reassociation width is set to 4 on AArch64. On recent
GCCs this has become more aggressive in splitting expressions. This means
many FMAs are split into FMUL and FADD. The reassociation increases register
pressure, in some benchmarks so much that inner loops start to spill.
This
On Fri, Jun 09, 2017 at 12:30:10PM -0700, Jason Merrill wrote:
> On Thu, Jun 8, 2017 at 12:30 PM, Jakub Jelinek wrote:
> > cp_genericize_r now instruments INTEGER_CSTs that have REFERENCE_TYPE,
> > so that we can diagnose binding references to NULL in some cases,
> > see PR79572. As the following
>
> [ Paul Hua sent a patch adding split_stack already, it was OKed, but
> it is not committed yet, fwiw ].
>
I saw this, so I did not commit my patch.
Paul.
I missed some things in config.gcc with my previous patches. This
should fix it; committing to trunk.
Sorry for the bother,
Segher
2017-06-12 Segher Boessenkool
* config.gcc: Remove rs6000/e500.h from tm_file for all targets.
---
gcc/config.gcc | 20 ++--
1 file
On 06/12/2017 05:17 AM, Olivier Hainque wrote:
Nathan, how does this look to you ?
I'm fine with it. Previously we tried to avoid fixincludes so that an
updated toolchain could just drop in to an existing tornado
distribution. But that's no longer a concern.
nathan
--
Nathan Sidwell
This is the build failure of the Ada runtime on SPARC64/Linux, caused by a
miscompilation of the Ada front-end at -O2 or above. The symptom is exactly
the same as that of PR middle-end/44993, which was a similar build failure,
although the root cause is slightly different.
It's delicate stuff
On 06/09/2017 08:53 AM, Richard Earnshaw wrote:
This patch series implements the proposed change and provides support
for a generic way of adding optional features to architectures and CPU
names. The documentation patches at the end of the series explain the
new syntax, so I won't repeat all th
On 10 June 2017 at 01:27, Richard Earnshaw (lists)
wrote:
> On 09/06/17 23:45, Christophe Lyon wrote:
>> Hi Richard,
>>
>>
>> On 9 June 2017 at 14:53, Richard Earnshaw wrote:
>>>
>>> During the ARM BoF at the Cauldron last year I mentioned that I wanted
>>> to rework the way GCC on ARM handles th
On 06/09/2017 12:53 PM, Jan Hubicka wrote:
Hi,
this patch marks the obvious candidates for cold attribute and enables
cold auto-detection on some common coding patterns.
* class.c (build_vtbl_initializer): Mark dvirt_fn as cold.
* decl.c (cxx_init_decl_processing, push_throw_li
On 10/06/17 16:44 +0200, François Dumont wrote:
On 08/06/2017 15:22, Jonathan Wakely wrote:
Can't we just have one file per container type (maybe just called
default_init.cc) which tests default-initialization in test01() and
value-initialization in a test02() function?
While working on this we
Hi Tamar,
On 8 June 2017 at 18:50, James Greenhalgh wrote:
> On Wed, Jun 07, 2017 at 12:38:27PM +0100, Tamar Christina wrote:
>> Hi All,
>>
>> This patch allows the inlining of lrint when -fno-math-errno
>> assuming that errno does not need to be set when the rounded value
>> is not representable
> On Jun 12, 2017, at 13:19 , Nathan Sidwell wrote:
>
> On 06/12/2017 05:17 AM, Olivier Hainque wrote:
>
>> Nathan, how does this look to you ?
>
> I'm fine with it.
Great, thanks :-)
> Previously we tried to avoid fixincludes so that an updated toolchain could
> just drop in to an existi
Hi Tom,
On 9 June 2017 at 17:25, Mike Stump wrote:
> On Jun 9, 2017, at 7:24 AM, Tom de Vries wrote:
>> this patch adds effective target stack_size.
>
>> OK for trunk if x86_64 and nvptx testing succeeds?
>
> Ok.
>
> The only last issue in this area that I know about is that there are a few
> m
Hi All,
I committed this as r249122 under the GCC obvious rule.
This fixes the failing test gcc.target/arm/sdiv_costs_1.c on soft float targets
by disabling it on those targets since the div calls aren't expanded.
gcc/testsuite/
2017-06-12 Tamar Christina
* gcc.target/arm/sdiv_costs_
Hi,
this patch adds code to output profile instantieis in callgraph.
Bootstrapped/regtested x86_64-linux, committed.
Honza
Index: cgraph.c
===
--- cgraph.c (revision 249112)
+++ cgraph.c (working copy)
@@ -2094,7 +2094,7 @@ cgr
I am re-testing the following patch to fix PR81065 with the
fold_addr_of_array_ref_difference hunk added which was the only
case causing the gcc_unreachable to trigger in an all-languages
bootstrap and regtest on x86_64-unknown-linux-gnu. The patch
to commit will omit this case completely.
Bootst
The Cortex-A53 scheduler model of FMAC bypass is not quite right
for FMAC to FMAC forwarding. Experiments also show the latencies of
FP operations are too high. Rather than adding more bypasses,
adjust the latencies of FP instructions to get a better schedule on
average. As a result SPEC
Hi,
As subject, for the testcase in the patch:
unsigned long
f2 (unsigned long a, int b)
{
unsigned long x = 1UL << b;
return a / x;
}
We currently generate:
f2:
mov x2, 1
lsl x1, x2, x1
udiv x0, x0, x1
ret
Which could instead be tr
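
As background for this testcase, one arithmetic fact is safe to state: for an unsigned a and a shift count b in range, a / (1UL << b) equals a >> b. The sketch below only illustrates that identity. The function names are hypothetical, and it is not claimed to be the exact transformation the patch performs.

```c
#include <assert.h>   /* assert */
#include <limits.h>   /* CHAR_BIT */

/* Division form, as in the testcase f2. */
unsigned long div_by_pow2(unsigned long a, int b)
{
    unsigned long x = 1UL << b;
    return a / x;
}

/* Shift form: same result whenever 0 <= b < width of unsigned long. */
unsigned long shift_by_pow2(unsigned long a, int b)
{
    /* Shifting by a negative or too-large count is undefined in C. */
    assert(b >= 0 && (unsigned)b < sizeof(unsigned long) * CHAR_BIT);
    return a >> b;
}
```

Both div_by_pow2 (1000UL, 3) and shift_by_pow2 (1000UL, 3) return 125.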
This is the same issue as PR73350 and PR80862 for disabling FP exceptions.
gcc -O0 -mavx512f -mavx512er returns exception
gcc -O2 -mavx512f -mavx512er returns nan
For this code:
#include
#include
#include
#include
#include
int main(int argc, char *argv[]) {
__m512 a = _mm512_set1_ps((f
Committed a less restrictive form in r249125 which now just requires
arm_v8_vfp_ok
gcc/testsuite/
2017-06-12 Tamar Christina
* gcc.target/arm/sdiv_costs_1.c: Require arm_v8_vfp_ok.
Thanks,
Tamar
From: gcc-patches-ow...@gcc.gnu.org on behalf
Hi,
In this testcase, all argument registers and the return register
will be general purpose registers:
long long
foo (long long a, long long b, long long c)
{
return ((a ^ b) & c) ^ b;
}
However, due to the implementation of aarch64_simd_bsl_internal
we'll match that pattern and em
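
For readers unfamiliar with why this expression looks like a bit-select: ((a ^ b) & c) ^ b takes the bits of a where c is set and the bits of b elsewhere, which is the same as (a & c) | (b & ~c). A minimal sketch of that identity follows. The function names are hypothetical and nothing here reflects how the backend pattern itself is written.

```c
/* The expression from the testcase foo. */
long long select_xor(long long a, long long b, long long c)
{
    return ((a ^ b) & c) ^ b;
}

/* Explicit bit-select: bits of a where c is 1, bits of b where c is 0. */
long long select_masks(long long a, long long b, long long c)
{
    return (a & c) | (b & ~c);
}
```

With a = 0x0F0F, b = 0x3333 and c = 0x00FF, both functions return 0x330F.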
On Mon, Jun 12, 2017 at 6:21 AM, Koval, Julia wrote:
> This is the same issue as PR73350 and PR80862 for disabling FP exceptions.
>
> gcc -O0 -mavx512f -mavx512er returns exception
> gcc -O2 -mavx512f -mavx512er returns nan
>
> For this code:
>
> #include
> #include
> #include
> #include
> #in
[Sorry for the re-send. I spotted that the attributes were not right for the
new pattern I was adding. The change between this and the first version was:
+ [(set_attr "type" "neon_bsl,neon_bsl,neon_bsl,multiple")
+ (set_attr "length" "4,4,4,12")]
]
---
Hi,
In this testcase, all argument
I've merged GCC trunk revision 249111 to the gccgo branch.
Ian
Hi,
In the AArch64 backend and scheduling models there is some confusion as to
what the load1/load2 etc. scheduling types refer to. This leads to us using
load1/load2 in two contexts - for a variety of 32-bit, 64-bit and 128-bit
loads in AArch32 and 128-bit loads in AArch64. That leads to an unde
Hi,
There seems to be a partial misconception in the AArch64 backend that
load1/load2 referred to the number of registers to load, rather than the
number of words to load. This patch fixes that using the new "number of
byte" types added in the previous patch.
That means using the load_16 and sto
On Mon, 12 Jun 2017, James Greenhalgh wrote:
>
> Hi,
>
> As subject, for the testcase in the patch:
>
> unsigned long
> f2 (unsigned long a, int b)
> {
> unsigned long x = 1UL << b;
> return a / x;
> }
>
> We currently generate:
>
> f2:
> mov x2, 1
> lsl
Hi,
PR71778 is an ICE when you pass a non-constant argument to an intrinsic
which requires a constant.
This ICE was introduced after we rewrote some of the builtin handling for
Neon intrinsics. The issue is that after throwing an error in
arm_expand_builtin_args, we return const0_rtx to indicate
From: Eric Botcazou
Date: Mon, 12 Jun 2017 11:27:10 +0200
>> I do not see a direct gen_return happening in function.c in the gcc-7
>> branch.
>>
>> Is it somewhere else?
>
> There is a call from force_nonfallthru_and_redirect in cfgrtl.c AFAICS.
>
> So the code generated for your testcase is l
Hi All,
The tests introduced for lrint in r249064 are failing on aarch64
bare metal because it's using different registers.
This patch generalizes the regexp for the result so that it works
both for bare metal and linux.
regtested on aarch64-none-linux-gnu and aarch64-none-elf
Committed as r24
On 12/06/17 14:53, James Greenhalgh wrote:
Hi,
In the AArch64 backend and scheduling models there is some confusion as to
what the load1/load2 etc. scheduling types refer to. This leads to us using
load1/load2 in two contexts - for a variety of 32-bit, 64-bit and 128-bit
loads in AArch32 and 12
On 06/12/2017 02:28 PM, Christophe Lyon wrote:
Hi Tom,
On 9 June 2017 at 17:25, Mike Stump wrote:
On Jun 9, 2017, at 7:24 AM, Tom de Vries wrote:
this patch adds effective target stack_size.
OK for trunk if x86_64 and nvptx testing succeeds?
Ok.
The only last issue in this area that I
> On 06/09/2017 12:53 PM, Jan Hubicka wrote:
> >Hi,
> >this patch marks the obvious candidates for cold attribute and enables
> >cold auto-detection on some common coding patterns.
>
> > * class.c (build_vtbl_initializer): Mark dvirt_fn as cold.
> > * decl.c (cxx_init_decl_processing, push
Hi Christophe,
Thanks, I've committed a fix to the testcase.
Tamar
From: Christophe Lyon
Sent: Monday, June 12, 2017 1:10:38 PM
To: Tamar Christina
Cc: GCC Patches; nd; Richard Earnshaw; Marcus Shawcroft
Subject: Re: [PATCH][GCC][AArch64] Inline calls to
On 12/06/17 12:49, Christophe Lyon wrote:
> On 10 June 2017 at 01:27, Richard Earnshaw (lists)
> wrote:
>> On 09/06/17 23:45, Christophe Lyon wrote:
>>> Hi Richard,
>>>
>>>
>>> On 9 June 2017 at 14:53, Richard Earnshaw wrote:
During the ARM BoF at the Cauldron last year I mentioned that
I recently noticed that the GCC 'resolver' attribute used for ifunc's is not
on by default for aarch64 even though all the infrastructure to support it is
in place. I made memcpy an ifunc on aarch64 in glibc and am looking at
possibly using it for libatomic too. For this reason I would like to en
Ping. Original patch posted here:
https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00091.html
Ping ^2. Updated patch posted here:
https://gcc.gnu.org/ml/gcc-patches/2017-05/msg01615.html
I would like to, but as far as I know the only testcase possible is below, and
as far as I know there is no possibility to use dg-error for runtime
exceptions (sorry if I'm wrong). There are only 2 versions of the flag,
exception or no exception, and the error is when they are combined in CSE.
>
On Mon, Jun 12, 2017 at 9:06 AM, Koval, Julia wrote:
> I would like to, but as far as I know the only testcase possible is below,
> and as far as I know there is no possibility to use dg-error for runtime
> exceptions (sorry if I'm wrong). There are only 2 versions of the flag
> exception or no
On Mon, Jun 12, 2017 at 3:38 AM, Segher Boessenkool
wrote:
> On Sun, Jun 11, 2017 at 07:38:04PM -0700, Ian Lance Taylor wrote:
>> On Sun, Jun 11, 2017 at 4:40 AM, Segher Boessenkool
>> wrote:
>> >
>> > The new split-1.c testcase fails on targets that do not support split
>> > stack (like 32-bit P
On Mon, Jun 12, 2017 at 09:08:00AM -0700, H.J. Lu wrote:
> On Mon, Jun 12, 2017 at 9:06 AM, Koval, Julia wrote:
> > I would like to, but as far as I know the only testcase possible is below,
> > and as far as I know there is no possibility to use dg-error for runtime
> > exceptions (sorry if I'm
Richard Biener writes:
> On Mon, 12 Jun 2017, Tamar Christina wrote:
>> Hi All,
>>
>> this patch implements a optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>>
>> to:
>>
>> x ^ (y & (1 << sign_bit_position))
>>
>> This is done by creating a special builtin
Catching the exception and calling std::terminate() prevents a useful
backtrace. Letting the runtime call terminate because of an unhandled
exception gives a backtrace showing the site of the throw.
PR libstdc++/55917
* src/c++11/thread.cc (execute_native_thread_routine): Remove
I'm so sorry, but I really don't get it. The right result of the test is:
Floating point exception (core dumped). The wrong result of the test is: nan (no
exception). If I get an exception (which is right), the test fails anyway.
The exception is raised in one instruction, I can't get any int
On Mon, 12 Jun 2017, Tamar Christina wrote:
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
>
> to:
>
> x ^ (y & (1 << sign_bit_position))
Note that this needs to be disabled for -fsignaling-nans, as if x is a
signaling NaN, the multiplication converts it to a quiet NaN and raises
"inv
On Jun 10, 2017, at 12:57 AM, Tom de Vries wrote:
>
> one thing that has bothered me on a regular basis is the inability to spread
> long dejagnu directives over multiple lines.
I'm not terribly in favor of this. I'd like to retain the ability to grep and
sed single line things. It makes exp
On 06/06/2017 03:49 AM, Torsten Duwe wrote:
On Sun, Jun 04, 2017 at 08:12:49PM -0600, Sandra Loosemore wrote:
On 05/29/2017 04:29 AM, Maxim Kuvyrkov wrote:
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 65308c9d933..6cbb77a8dc4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invok
Hi,
I was asked by upstream to split the loop distribution patch into small ones.
It is hard because data structure and algorithm are closely coupled together.
Anyway, this is the patch series with smaller patches. Basically I tried to
separate data structure and bug-fix changes apart with one as
Hi,
this is a simple patch skipping distribution if there is no loop at all.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* cfgloop.h (pass_loop_distribution::execute): Skip if no loops.
From eb6a795331efde92fd6df1c6e612fb1ffa9f482f Mon Sep 17
Hi,
This simple patch marks distributed loops and skips them in subsequent
distribution.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* tree-loop-distribution.c (generate_loops_for_partition): Mark
distributed loops.
(pass_loop_di
Hi,
This simple patch refactors partition merge code and dump information.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* tree-loop-distribution.c (enum fuse_type, fuse_message): New.
(partition_merge_into): New parameter. Dump reason
Hi,
This patch collects and preserves all data references in loop for whole
distribution life time. It will be used afterwards.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* tree-loop-distribution.c (datarefs_vec, datarefs_map): New
g
Hi,
During the work I ran into a latent bug for distributing. For the moment we
sort statements in dominance order, but that's not enough because basic blocks
may be sorted in reverse order of execution flow. This results in wrong data
dependence direction later. This patch fixes the issue by
Hi,
This simple patch computes and preserves loop nest vector for whole
distribution life time. The loop nest will be used multiple times in on-demand
data dependence computation.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* tree-loop-distr
Hi,
Current primitive cost model merges partitions with data references sharing the
same base address. I believe it's designed to maximize data reuse in
distribution, but that should be done by dedicated data reusing algorithm. At
this stage of merging, we should be conservative and only merge
Hi,
This patch checks and records if partition can be executed in parallel by
looking if there exists data dependence cycles. The information is needed
for distribution because the idea is to distribute parallel type partitions
away from sequential ones. I believe current distribution doesn't wor
Hi,
This patch refactors struct partition for later distribution. It records
bitmap of data references in struct partition rather than vertices' data in
partition dependence graph. It simplifies code as well as enables following
rewriting.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Tha
Hi,
This patch computes and caches data dependence relation in a hash table
so that it can be queried multiple times later for partition dependence
check.
Bootstrap and test on x86_64 and AArch64. Is it OK?
Thanks,
bin
2017-06-07 Bin Cheng
* tree-loop-distribution.c (struct ddr_entry
Hi,
This is the main patch rewriting loop distribution in order to handle hmmer.
It improves loop distribution by versioning loop under runtime alias check
conditions.
As described in comments, the patch basically implements distribution in the
following steps:
1) Seed partitions with speci
Hi,
For now, loop distribution handles variables used outside of loop as reduction.
This is inaccurate because all partitions contain statement defining induction
vars. Ideally we should factor out scev-propagation as a standalone interface
which can be called when necessary. Before that, this pa
Hello,
when invoking dump_access_tree_1 from gdb, I found out that it
attempts to write to dump_file even though it should dump to its
parameter f. I am about to fix it by the following obvious patch,
after it passes bootstrap and testing on an x86_64-linux (along with
more substantive patches).
Hi,
this is a preparation for a patch fixing PR 80803. Basically, it
moves all checks for a non-null access->first_link before enqueuing a
SRA access into add_access_to_work_queue instead of each caller doing
it.
Moreover, it fixes a thinko in ancestor enqueuing by removing an
erroneous break wh
Hi,
the patch below fixes PR 80803 (and its newer duplicate 81063), it is
essentially a semi-rewrite of propagate_subaccesses_across_link.
When fixing the previous fallout from lazy setting of grp_write flag,
I failed to see that it does not look on sub-accesses of the LHSs at
all and thus does n
On Mon, 12 Jun 2017, Richard Earnshaw (lists) wrote:
> It does. The problem seems to be a generic one in the driver in that
> the rewrite rules are always passed the first instance of -march and not
> the last. Indeed, with the aarch64 compiler, if I write
>
> gcc -mcpu=native -mcpu=cortex-a53
The Go frontend method Type_conversion_expression::do_get_backend was
(in some circumstances) creating a Bexpression for the source
expression of the conversion and then throwing it away before using
it. This patch by Than McIntosh fixes up this method to ensure that
the call to get_backend() on t
On Mon, Jun 12, 2017 at 09:16:32AM -0700, Ian Lance Taylor wrote:
> On Mon, Jun 12, 2017 at 3:38 AM, Segher Boessenkool
> wrote:
> > Ah, I see. Could you change the comment then, to say what we are
> > really testing?
>
> Sure. Updated as follows. Committed to mainline.
Thanks!
Segher
On Fri, Jun 09, 2017 at 04:12:25PM -0700, Carl E. Love wrote:
> GCC Maintainers:
>
> On Fri, 2017-06-09 at 16:05 -0500, Segher Boessenkool wrote:
>
> Fixed the various formatting (spaces) issues. Been toying with how to
> write a space checker for patches. Have to take some time to really
> thi
Hi!
This patch adds parsing of in_reduction and task_reduction clauses
and reduction on taskloop. The lowering/expansion and library side is not
done yet.
Committed to gomp-5_0-branch.
2017-06-12 Jakub Jelinek
* tree.def (OMP_TASKGROUP): Add another operand, move next to other
Hi!
OpenMP 5.0 is changing a couple of APIs, so that for pointer arguments where
the pointed array is not modified we use const void * instead of void *.
Committed to gomp-5_0-branch.
2017-06-12 Jakub Jelinek
* omp.h.in (omp_target_is_present, omp_target_disassociate_ptr):
Ch
[Disclaimer: I can't approve any of this :-)]
Iain Buclaw writes:
> 001 - The front-end (DMD) language implementation and license.
> 002 - The front-end (GDC) implementation.
> 003 - The front-end (GDC) changelogs (here be dragons).
> 004 - The front-end (GDC) config, makefile, and manpag
On Mon, 2017-06-12 at 14:09 -0400, Michael Meissner wrote:
> >
> > > > +(define_insn "vsx_xvcvsxwsp"
> > > > + [(set (match_operand:V4SF 0 "vsx_register_operand" "=v")
> > > > + (unspec:V4SF[(match_operand:V4SI 1 "vsx_register_operand" "v")]
> > > > + UNSPEC_VSX_CVSXWSP))