Re: [patch libstdc++] Optimize synchronization in std::future if futexes are available.
On 01/18/2015 05:19 AM, Jonathan Wakely wrote: > On 17/01/15 19:51 -0700, Sandra Loosemore wrote: >> On 01/17/2015 03:58 PM, Jonathan Wakely wrote: >>> >>> My fault, this additional chunk is needed alongside the patch I sent >>> earlier: >>> >>> --- a/libstdc++-v3/include/bits/atomic_futex.h >>> +++ b/libstdc++-v3/include/bits/atomic_futex.h >>> @@ -35,7 +35,7 @@ >>> #include >>> #include >>> #include >>> -#if !defined(_GLIBCXX_HAVE_LINUX_FUTEX) >>> +#if ! (defined(_GLIBCXX_HAVE_LINUX_FUTEX) && ATOMIC_INT_LOCK_FREE > 1) >>> #include >>> #include >>> #endif >>> >>> What I sent earlier causes your target to use std::mutex and >>> std::condition_variable, but without the bit above the headers aren't >>> included. >> >> Still no joy: >> /scratch/sandra/arm-fsf2/src/gcc-mainline/libstdc++-v3/src/c++11/futex.cc:45:3: >> error: '__atomic_futex_unsigned_base' has not been declared >> __atomic_futex_unsigned_base::_M_futex_wait_until(unsigned *__addr, >> ^ >> /scratch/sandra/arm-fsf2/src/gcc-mainline/libstdc++-v3/src/c++11/futex.cc:88:3: >> error: '__atomic_futex_unsigned_base' has not been declared >> __atomic_futex_unsigned_base::_M_futex_notify_all(unsigned* __addr) >> ^ > > futex.cc needs the same change ... I am still noticing a problem building a native X86_64 ToT compiler on Ubuntu 12.04.5 LTS. Attached is a patch where the CPP guards in atomic_futex.h are reflected in futex.cc, which fixes my build problem. OK to commit? Thanks, Doug >From 1debe13da342ac30d905e12a9ca243e30cb61870 Mon Sep 17 00:00:00 2001 From: Doug Gilmore Date: Wed, 28 Jan 2015 14:29:14 -0800 Subject: [PATCH] CPP guards in futex.cc should match guards in futex.ii. --- libstdc++-v3/src/c++11/futex.cc |4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libstdc++-v3/src/c++11/futex.cc b/libstdc++-v3/src/c++11/futex.cc index 1336779..5087483 100644 --- a/libstdc++-v3/src/c++11/futex.cc +++ b/libstdc++-v3/src/c++11/futex.cc @@ -23,6 +23,7 @@ // <http://www.gnu.org/licenses/>. 
#include +#if defined(_GLIBCXX_HAS_GTHREADS) && defined(_GLIBCXX_USE_C99_STDINT_TR1) #if defined(_GLIBCXX_HAVE_LINUX_FUTEX) && ATOMIC_INT_LOCK_FREE > 1 #include #include @@ -93,4 +94,5 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } } -#endif +#endif // _GLIBCXX_HAVE_LINUX_FUTEX && ATOMIC_INT_LOCK_FREE > 1 +#endif // _GLIBCXX_HAS_GTHREADS && _GLIBCXX_USE_C99_STDINT_TR1 -- 1.7.9.5
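The build failures in this thread come from a source file compiling definitions under a weaker preprocessor condition than the header that declares them. A minimal sketch of the remedy, in plain C rather than libstdc++ code: compute the condition once and reuse it on both sides. The names `HAVE_LINUX_FUTEX`, `USE_FUTEX_PATH`, `futex_backend_id` and `wait_backend_name` are hypothetical and exist only for this illustration; `ATOMIC_INT_LOCK_FREE` is the standard C11 macro from `<stdatomic.h>`.

```c
#include <stdatomic.h>   /* provides ATOMIC_INT_LOCK_FREE */

/* Hypothetical stand-in for a configure-time macro such as
   _GLIBCXX_HAVE_LINUX_FUTEX; an assumption for this sketch only.  */
#if defined(__linux__)
# define HAVE_LINUX_FUTEX 1
#endif

/* Evaluate the guard once, so that the "header" and the "source"
   below cannot disagree about which implementation is selected.  */
#if defined(HAVE_LINUX_FUTEX) && ATOMIC_INT_LOCK_FREE > 1
# define USE_FUTEX_PATH 1
#else
# define USE_FUTEX_PATH 0
#endif

/* "Header" part: the declaration exists only on the futex path.  */
#if USE_FUTEX_PATH
int futex_backend_id (void);
#endif

/* "Source" part: the definition uses the same guard.  With a weaker
   guard here, the definition could be compiled without a matching
   declaration, which is the class of error reported above.  */
#if USE_FUTEX_PATH
int futex_backend_id (void) { return 1; }
#endif

/* Reports which waiting strategy this translation unit selected.  */
const char *wait_backend_name (void)
{
#if USE_FUTEX_PATH
  return futex_backend_id () == 1 ? "futex" : "unknown";
#else
  return "mutex+condvar";
#endif
}
```

Calling `wait_backend_name ()` returns "futex" only when both halves of the condition hold; every other configuration falls back to the mutex and condition variable path, mirroring the intent of the guards in the patch.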
[PATCH, MIPS] Target flag and build option to disable indexed memory OPs.
I recently bisected PR78176 to problems introduced with r216501.  Given
the short time until the release, we would like to provide a target flag
and build option to avoid the bug until we are able to resolve the
problem with the commit.

Note that as Matthew Fortune has mentioned in the PR:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176#c5

the problem could also be addressed by updates to the Linux kernel, since
the problem is only exposed by running MIPS 32-bit binaries on 64-bit
kernels.

Bootstrapped on X86_64, regression tested on X86_64 and MIPS.

OK to commit?

Thanks,

Doug

From 2a6b11b30ff335ea8e669ae8d3f1bd531ac5b8d3 Mon Sep 17 00:00:00 2001
From: Doug Gilmore
Date: Wed, 11 Jan 2017 16:49:27 -0800
Subject: [PATCH] [MIPS] PR target/78176 add -mindexed-load-store.

  PR target/78176
  * config.gcc (supported_defaults): Add indexed-load-store.
  (with_indexed_load_store): Add validation.
  (all_defaults): Add indexed-load-store.
  * config/mips/mips.opt (mindexed-load-store): New option.
  * gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
  mindexed-load-store.
  ISA_HAS_LXC1_SXC1 gate with mips_indexed_load_store.
  * gcc/doc/invoke.texi (-mindexed-load-store): Document the new
  option.
  * doc/install.texi (--with-indexed-load-store): Document the new
  option.
--- gcc/config.gcc | 19 +-- gcc/config/mips/mips.h | 6 -- gcc/config/mips/mips.opt | 4 gcc/doc/install.texi | 8 gcc/doc/invoke.texi | 6 ++ 5 files changed, 39 insertions(+), 4 deletions(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index 7c27546..e712599 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3940,7 +3940,7 @@ case "${target}" in ;; mips*-*-*) - supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci" + supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci indexed-load-store" case ${with_float} in "" | soft | hard) @@ -4063,6 +4063,21 @@ case "${target}" in exit 1 ;; esac + + case ${with_indexed_load_store} in + yes) + with_indexed_load_store=indexed-load-store + ;; + no) + with_indexed_load_store=no-indexed-load-store + ;; + "") + ;; + *) + echo "Unknown indexed-load-store type used in --with-indexed-load-store" 1>&2 + exit 1 + ;; + esac ;; nds32*-*-*) @@ -4496,7 +4511,7 @@ case ${target} in esac t= -all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls" +all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls indexed-load-store" for option in $all_defaults do eval "val=\$with_"`echo $option | sed s/-/_/g` diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index f91b43d..6d2aa9a 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -866,7 +866,8 @@ struct mips_cpu_info { {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \ {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \ {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \ - {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" } + {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" }, \ + {"indexed-load-store", 
"%{!mindexed-load-store:%{!mno-indexed-load-store:-m%(VALUE)}}" } \ /* A spec that infers the: -mnan=2008 setting from a -mips argument, @@ -1030,7 +1031,8 @@ struct mips_cpu_info { /* ISA has floating-point indexed load and store instructions (LWXC1, LDXC1, SWXC1 and SDXC1). */ -#define ISA_HAS_LXC1_SXC1 ISA_HAS_FP4 +#define ISA_HAS_LXC1_SXC1 (ISA_HAS_FP4\ + && mips_indexed_load_store) /* ISA has paired-single instructions. */ #define ISA_HAS_PAIRED_SINGLE ((ISA_MIPS64\ diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt index 2559649..ae1e4cf 100644 --- a/gcc/config/mips/mips.opt +++ b/gcc/config/mips/mips.opt @@ -388,6 +388,10 @@ mlra Target Report Var(mips_lra_flag) Init(1) Save Use LRA instead of reload. +mindexed-load-store +Target Report Var(mips_indexed_load_store) Init(1) +Use index memory Ops where applicable. + mtune= Target RejectNegative Joined Var(mips_tune_option) ToLower Enum(mips_arch_opt_value) -mtune=PROCESSOR Optimize the output for PROCESSOR. diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 4958773..ff91879 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -1371,6 +1371,14 @@ On MIPS targets, make @option{-msynci} the default when no On MIPS targets, make @option{-mno-synci} the default when no @option{-
Re: [PATCH, MIPS] Target flag and build option to disable indexed memory OPs.
On 01/17/2017 05:41 AM, Moore, Catherine wrote:
>
>
>> ...
>> Having thought further I agree we can safely ignore DSP indexed load
>> and micromips LWXS on the basis that DSP code will not run on a
>> MIPS64 processor anyway (at least none that I know of) so the issue
>> cannot occur and similarly for microMIPS, there are no 64-bit cores.
>>
>> Restricting to just LWXC1/SWXC1/LDXC1/SDXC1 is therefore fine but
>> we should reflect that in option names then.
>>
>> --with-lxc1-sxc1 --without-lxc1-sxc1
>> -mlxc1-sxc1
>>
>> These names reflect the internal macro that controls availability of
>> these instructions.
>>
>> Macro name: __mips_no_lxc1_sxc1
>> Defined when !ISA_HAS_LXC1_SXC1 so would be present even when
>> targeting a core that doesn't have the instructions anyway.
>>
>> Any refinements to this Catherine?
>>
> No.  This plan looks good.
>

Sounds good, I'll update the patch accordingly.

BTW, if we did guard all of the indexed memory OPs with a flag there
would be ~150 tests to clean up when configuring with indexed memory OPs
disabled.  When I tested with indexed memory OPs disabled with the
original patch, there were no additional regressions.

Also I'll be updating the bug report with my current take on what went
wrong with r216501.

Thanks,

Doug
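The proposed `__mips_no_lxc1_sxc1` macro would let user code (hand-written assembly wrappers, for instance) check whether the compiler has ruled out the indexed floating-point loads and stores. A minimal sketch of such a check follows; the function name `may_use_lxc1_sxc1` is hypothetical, and the sketch assumes only what the thread states, namely that the macro is defined when `!ISA_HAS_LXC1_SXC1`.

```c
/* Returns 0 when the target is not MIPS, or when the compiler has
   announced (through __mips_no_lxc1_sxc1) that LWXC1/LDXC1/SWXC1/SDXC1
   are unavailable.  Returns 1 otherwise.  A compiler that predates
   the macro never defines it, so a result of 1 means "not known to be
   disabled" rather than "guaranteed to be available".  */
int may_use_lxc1_sxc1 (void)
{
#if defined(__mips__) && !defined(__mips_no_lxc1_sxc1)
  return 1;
#else
  return 0;
#endif
}
```

Because the macro is phrased negatively, code that tests it keeps its existing behaviour with older compilers, which appears to be the reason for defining it "even when targeting a core that doesn't have the instructions anyway".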
Re: [PATCH, MIPS] Target flag and build option to disable indexed memory OPs.
On 01/17/2017 05:41 AM, Moore, Catherine wrote:
>
>
>> -Original Message-
>> From: Matthew Fortune [mailto:matthew.fort...@imgtec.com]
>> Sent: Tuesday, January 17, 2017 4:35 AM
>> ...
>> Thanks for the comments.
>>
>> Having thought further I agree we can safely ignore DSP indexed load
>> and micromips LWXS on the basis that DSP code will not run on a
>> MIPS64 processor anyway (at least none that I know of) so the issue
>> cannot occur and similarly for microMIPS, there are no 64-bit cores.
>>
>> Restricting to just LWXC1/SWXC1/LDXC1/SDXC1 is therefore fine but
>> we should reflect that in option names then.
>>
>> --with-lxc1-sxc1 --without-lxc1-sxc1
>> -mlxc1-sxc1
>>
>> These names reflect the internal macro that controls availability of
>> these instructions.
>>
>> Macro name: __mips_no_lxc1_sxc1
>> Defined when !ISA_HAS_LXC1_SXC1 so would be present even when
>> targeting a core that doesn't have the instructions anyway.
>>
>> Any refinements to this Catherine?
>>
> No.  This plan looks good.
>

Hi Everyone,

I updated the patch accordingly.

OK to commit?

Thanks,

Doug

From 5aa6e7b837a281651ac1c6c58291c96d6ff25c53 Mon Sep 17 00:00:00 2001
From: Doug Gilmore
Date: Wed, 11 Jan 2017 16:49:27 -0800
Subject: [PATCH] [MIPS] PR target/78176 add -mlxc1-sxc1.

  PR target/78176
  * config.gcc (supported_defaults): Add lxc1-sxc1.
  (with_lxc1_sxc1): Add validation.
  (all_defaults): Add lxc1-sxc1.
  * config/mips/mips.opt (mlxc1-sxc1): New option.
  * gcc/config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
  mlxc1-sxc1.
  (TARGET_CPU_CPP_BUILTINS): Add builtin_define for
  __mips_no_lxc1_sxc1.
  ISA_HAS_LXC1_SXC1 gate with mips_lxc1_sxc1.
  * gcc/doc/invoke.texi (-mlxc1-sxc1): Document the new option.
  * doc/install.texi (--with-lxc1-sxc1): Document the new option.
--- gcc/config.gcc | 19 +-- gcc/config/mips/mips.h | 8 ++-- gcc/config/mips/mips.opt | 4 gcc/doc/install.texi | 8 gcc/doc/invoke.texi | 6 ++ 5 files changed, 41 insertions(+), 4 deletions(-) diff --git a/gcc/config.gcc b/gcc/config.gcc index 7c27546..913e5c2 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -3940,7 +3940,7 @@ case "${target}" in ;; mips*-*-*) - supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci" + supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1" case ${with_float} in "" | soft | hard) @@ -4063,6 +4063,21 @@ case "${target}" in exit 1 ;; esac + + case ${with_lxc1_sxc1} in + yes) + with_lxc1_sxc1=lxc1-sxc1 + ;; + no) + with_lxc1_sxc1=no-lxc1-sxc1 + ;; + "") + ;; + *) + echo "Unknown lxc1-sxc1 type used in --with-lxc1-sxc1" 1>&2 + exit 1 + ;; + esac ;; nds32*-*-*) @@ -4496,7 +4511,7 @@ case ${target} in esac t= -all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls" +all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls lxc1-sxc1" for option in $all_defaults do eval "val=\$with_"`echo $option | sed s/-/_/g` diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h index f91b43d..6d9a7aa 100644 --- a/gcc/config/mips/mips.h +++ b/gcc/config/mips/mips.h @@ -637,6 +637,8 @@ struct mips_cpu_info { \ if (TARGET_CACHE_BUILTIN) \ builtin_define ("__GCC_HAVE_BUILTIN_MIPS_CACHE"); \ + if (!ISA_HAS_LXC1_SXC1) \ + builtin_define ("__mips_no_lxc1_sxc1");\ } \ while (0) @@ -866,7 +868,8 @@ struct mips_cpu_info { {"divide", "%{!mdivide-traps:%{!mdivide-breaks:-mdivide-%(VALUE)}}" }, \ {"llsc", "%{!mllsc:%{!mno-llsc:-m%(VALUE)}}" }, \ {"mips-plt", "%{!mplt:%{!mno-plt:-m%(VALUE)}}" }, \ - {"synci", 
"%{!msynci:%{!mno-synci:-m%(VALUE)}}" } + {"synci", "%{!msynci:%{!mno-synci:-m%(VALUE)}}" }, \ + {"lxc1-sxc1", "%{!mlxc1-sxc1:%{!mno-lxc1-sxc1:-m%(VALUE)}}" } \ /* A spec that infers the: -mnan=2008 setting from a -mips argument, @@ -1030,7 +1033,8 @@ struct mips_cpu_info { /* ISA has floating-point indexed load and store instructions (LWXC1, LDXC1, SWXC1 and SDXC1). */ -#define ISA_HAS_LXC1_SXC1 ISA_HAS_FP4 +#define ISA_HAS_LXC1_SXC1 (ISA_HAS_FP4\ + && mips_lxc1_sxc1) /* ISA has pair
RE: [PATCH] Fix PR tree-optimization/77654
Add missing attachment. Doug gcc/ PR tree-optimization/77654 * tree-ssa-alias.c (issue_prefetch_ref): Add call to duplicate_ssa_name_ptr_info. From ec9069b7b7b07d5fda9c04aaa9b385fba89a6e16 Mon Sep 17 00:00:00 2001 From: Doug Gilmore Date: Tue, 6 Sep 2016 10:18:42 -0700 Subject: [PATCH 2/2] Ensure points-to information is maintained for prefetch addresses. --- gcc/tree-ssa-loop-prefetch.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c index 26cf0a0..10ade186 100644 --- a/gcc/tree-ssa-loop-prefetch.c +++ b/gcc/tree-ssa-loop-prefetch.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa-loop-manip.h" #include "tree-ssa-loop-niter.h" #include "tree-ssa-loop.h" +#include "ssa.h" #include "tree-into-ssa.h" #include "cfgloop.h" #include "tree-scalar-evolution.h" @@ -1160,6 +1161,16 @@ issue_prefetch_ref (struct mem_ref *ref, unsigned unroll_factor, unsigned ahead) addr = force_gimple_operand_gsi (&bsi, unshare_expr (addr), true, NULL, true, GSI_SAME_STMT); } + + if (POINTER_TYPE_P (TREE_TYPE (addr_base))) + { + duplicate_ssa_name_ptr_info (addr, SSA_NAME_PTR_INFO (addr_base)); + /* As this isn't a plain copy we have to reset alignment + information. */ + if (SSA_NAME_PTR_INFO (addr)) + mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr)); + } + /* Create the prefetch instruction. */ prefetch = gimple_build_call (builtin_decl_explicit (BUILT_IN_PREFETCH), 3, addr, write_p, local); -- 1.7.9.5
RE: [PATCH] Fix PR tree-optimization/77654
It looks like the original message was dropped, resending. Doug From: Doug Gilmore Sent: Tuesday, September 20, 2016 2:12 PM To: gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org Subject: [PATCH] Fix PR tree-optimization/77654 From: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77654 Richard Biener wrote: > Looks good though addr_base should always be a pointer but it might > not be an SSA name so better check that... I took a look at other situations where duplicate_ssa_name_ptr_info() is called and found that there are no checks for the SSA name since that check is done in duplicate_ssa_name_ptr_info(). Do you still want the additional check added? Also does it make sense to make a test case for this? I was thinking of making the following change to: diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c index 8051a66..b799c43 100644 --- a/gcc/tree-ssa-alias.c +++ b/gcc/tree-ssa-alias.c @@ -296,7 +296,16 @@ ptr_derefs_may_alias_p (tree ptr1, tree ptr2) pi1 = SSA_NAME_PTR_INFO (ptr1); pi2 = SSA_NAME_PTR_INFO (ptr2); if (!pi1 || !pi2) -return true; +{ + if (dump_file) + { + if (! pi1) + fprintf (dump_file, "%s pi1 is NULL\n", __FUNCTION__); + if (! pi2) + fprintf (dump_file, "%s pi2 is NULL\n", __FUNCTION__); + } + return true; +} Then when compiling the test case, we could scan for the RE "pi. is NULL" in the dump file created by compiling with -fdump-rtl-sched2. I attached the original patch. Thanks, Doug gcc/ PR tree-optimization/77654 * tree-ssa-alias.c (issue_prefetch_ref): Add call to duplicate_ssa_name_ptr_info.
RE: [PATCH] Fix PR tree-optimization/77654
> From: Richard Biener [rguent...@suse.de]
> Sent: Wednesday, September 21, 2016 12:48 AM
> To: Doug Gilmore
> Cc: gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> Subject: RE: [PATCH] Fix PR tree-optimization/77654
>
> On Tue, 20 Sep 2016, Doug Gilmore wrote:
>
> > It looks like the original message was dropped, resending.
> >
> > Doug
> >
> > From: Doug Gilmore
> > Sent: Tuesday, September 20, 2016 2:12 PM
> > To: gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> > Subject: [PATCH] Fix PR tree-optimization/77654
> >
> > From:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77654
> >
> > Richard Biener wrote:
> > > Looks good though addr_base should always be a pointer but it might
> > > not be an SSA name so better check that...
> >
> > I took a look at other situations where duplicate_ssa_name_ptr_info()
> > is called and found that there are no checks for the SSA name since
> > that check is done in duplicate_ssa_name_ptr_info().  Do you still
> > want the additional check added?
>
> It checks for !ptr_info but it requires NAME to be an SSA name.
>
> From the attachment in bugzilla (the attachment didn't make it
> here)
>
> +
> +  if (POINTER_TYPE_P (TREE_TYPE (addr_base)))
> +    {
> +      duplicate_ssa_name_ptr_info (addr, SSA_NAME_PTR_INFO (addr_base));
> +      /* As this isn't a plain copy we have to reset alignment
> +         information.  */
> +      if (SSA_NAME_PTR_INFO (addr))
> +        mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr));
> +    }
> +
>
> I was talking about changing the if to
>
>   if (TREE_CODE (addr_base) == SSA_NAME
>       && TREE_CODE (addr) == SSA_NAME)

Sorry, I missed that point.  I glossed over your comment "addr_base
should always be a pointer", causing me to go off into the weeds.

New patch attached.

Thanks,

Doug

>
> because the addresses could be invariant as far as I can see.
>
> > Also does it make sense to make a test case for this?
>
> I'm not sure how to easily test this.
>
> Richard.
>
> ...
From 2d6cb0674ca66b4c5f6e335d73122e03413863e3 Mon Sep 17 00:00:00 2001 From: Doug Gilmore Date: Tue, 6 Sep 2016 10:18:42 -0700 Subject: [PATCH] Ensure points-to information is maintained for prefetch. gcc/ PR tree-optimization/77654 * tree-ssa-alias.c (issue_prefetch_ref): Add call to duplicate_ssa_name_ptr_info. --- gcc/tree-ssa-loop-prefetch.c | 12 1 file changed, 12 insertions(+) diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c index 26cf0a0..d0bd2d3 100644 --- a/gcc/tree-ssa-loop-prefetch.c +++ b/gcc/tree-ssa-loop-prefetch.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "tree-ssa-loop-manip.h" #include "tree-ssa-loop-niter.h" #include "tree-ssa-loop.h" +#include "ssa.h" #include "tree-into-ssa.h" #include "cfgloop.h" #include "tree-scalar-evolution.h" @@ -1160,6 +1161,17 @@ issue_prefetch_ref (struct mem_ref *ref, unsigned unroll_factor, unsigned ahead) addr = force_gimple_operand_gsi (&bsi, unshare_expr (addr), true, NULL, true, GSI_SAME_STMT); } + + if (TREE_CODE (addr_base) == SSA_NAME + && TREE_CODE (addr) == SSA_NAME) + { + duplicate_ssa_name_ptr_info (addr, SSA_NAME_PTR_INFO (addr_base)); + /* As this isn't a plain copy we have to reset alignment + information. */ + if (SSA_NAME_PTR_INFO (addr)) + mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr)); + } + /* Create the prefetch instruction. */ prefetch = gimple_build_call (builtin_decl_explicit (BUILT_IN_PREFETCH), 3, addr, write_p, local); -- 1.7.9.5
RE: [PATCH] Fix PR tree-optimization/77654
> From: Richard Biener [rguent...@suse.de]
> Sent: Thursday, September 22, 2016 12:43 AM
> To: Doug Gilmore
> Cc: gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> Subject: RE: [PATCH] Fix PR tree-optimization/77654
>
> On Wed, 21 Sep 2016, Doug Gilmore wrote:
>
> ...
> > Sorry, I missed that point.  I glossed over your comment "addr_base
> > should always be a pointer", causing me to go off into the weeds.
> >
> > New patch attached.
>
> Ok if successfully bootstrapped / tested.
>
> Thanks,
> Richard.

The change bootstrapped on X86_64, and the several "make check" errors
also appear in the latest archived mail message to gcc-testresults.

Thanks,

Doug
>
> > ...
RE: [PATCH] Fix PR tree-optimization/77654
> From: Christophe Lyon [christophe.l...@linaro.org]
> Sent: Thursday, September 29, 2016 12:17 PM
> To: Matthew Fortune
> Cc: Doug Gilmore; Richard Biener; gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> Subject: Re: [PATCH] Fix PR tree-optimization/77654
> ...
>
> Since this commit, I've noticed ICE on arm target:
> FAIL: gcc.dg/params/blocksort-part.c -O3 --param prefetch-latency=0
> (internal compiler error)
> FAIL: gcc.dg/params/blocksort-part.c -O3 --param prefetch-latency=0
> (test for excess errors)
> Excess errors:
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/params/blocksort-part.c:116:6:
> internal compiler error: in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630
> 0xd5a972 duplicate_ssa_name_ptr_info(tree_node*, ptr_info_def*)
>         /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-ssanames.c:630
> 0xcac0e0 issue_prefetch_ref
>         /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-ssa-loop-prefetch.c:1168
> 0xcad89f issue_prefetches
>         /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-ssa-loop-prefetch.c:1195
> 0xcad89f loop_prefetch_arrays
>         /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-ssa-loop-prefetch.c:1928
> 0xcae722 tree_ssa_prefetch_arrays()
>         /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/tree-ssa-loop-prefetch.c:1992
>
> --target arm-none-linux-gnueabihf --with-cpu=cortex-a9
> --with-fpu=neon-fp16 --with-mode=arm
>
> I'm not sure the cpu/fpu/mode settings are mandatory but at least in this case
> the compiler ICEs.
>
> Christophe

I'll look into this.

Doug
> >
> > (Fixed whitespace/tab issue in the code and incorrect file in changelog)
> >
> > I can't progress the bug status.  Who does that normally?
> >
> > Matthew
RE: [PATCH] Fix PR tree-optimization/77654
> From: Christophe Lyon [christophe.l...@linaro.org]
> Sent: Thursday, September 29, 2016 12:17 PM
> To: Matthew Fortune
> Cc: Doug Gilmore; Richard Biener; gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> Subject: Re: [PATCH] Fix PR tree-optimization/77654
>
> On 23 September 2016 at 17:55, Matthew Fortune wrote:
> > Doug Gilmore writes:
> >> > From: Richard Biener [rguent...@suse.de]
> >> > Sent: Thursday, September 22, 2016 12:43 AM
> >> > To: Doug Gilmore
> >> > Cc: gcc-patches@gcc.gnu.org; rgue...@gcc.gnu.org
> >> > Subject: RE: [PATCH] Fix PR tree-optimization/77654
> >> >
> >> > On Wed, 21 Sep 2016, Doug Gilmore wrote:
> >> >
> >> > ...
> >> > > Sorry, I missed that point.  I glossed over your comment "addr_base
> >> > > should always be a pointer", causing me to go off into the weeds.
> >> > >
> >> > > New patch attached.
> >> >
> >> > Ok if successfully bootstrapped / tested.
> >> >
> >> > Thanks,
> >> > Richard.
> >> The change bootstrapped on X86_64 and the several "make check" errors
> >> also appeared in latest archived mail message to gcc-testresults.
> >
> > Committed as r240439.
> >
>
> Since this commit, I've noticed ICE on arm target:
> FAIL: gcc.dg/params/blocksort-part.c -O3 --param prefetch-latency=0
> (internal compiler error)
> FAIL: gcc.dg/params/blocksort-part.c -O3 --param prefetch-latency=0
> (test for excess errors)
> Excess errors:
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/params/blocksort-part.c:116:6:
> internal compiler error: in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630
> ...

Hi Christophe,

I filed PR77808 and will send out a fix shortly.

BTW, I missed this in regression testing since -fprefetch-loop-arrays is
needed to expose the problem.  Are you setting this as the default in
your compiler build?

Thanks,

Doug
Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
My commit r240439 didn't handle the situation where setting --param prefetch-latency=0 can cause the prefetch address to be the same as the original address. In this case, no copying of points-to information should be done. Bootstrapped and regression tested on x86_64-linux, ok for trunk? Doug From d7a115e12856f2bcd4cefab38378f5d947c7d96a Mon Sep 17 00:00:00 2001 From: Doug Gilmore Date: Fri, 30 Sep 2016 11:28:20 -0700 Subject: [PATCH] Fix PR tree-optimization/77808 gcc/ PR tree-optimization/77808 * tree-ssa-loop-prefetch.c (issue_prefetch_ref): Fix problem exposed by specifying --param prefetch-latency=0. gcc/testsuite PR tree-optimization/77808 * gcc.dg/tree-ssa/pr77808.c: New testcase. --- gcc/testsuite/gcc.dg/tree-ssa/pr77808.c | 11 +++ gcc/tree-ssa-loop-prefetch.c| 3 ++- 2 files changed, 13 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr77808.c diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77808.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77808.c new file mode 100644 index 000..85393f4 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77808.c @@ -0,0 +1,11 @@ +/* PR tree-optimization/77808 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fprefetch-loop-arrays --param prefetch-latency=0" } */ + +void daxpy(int n, double da, double * __restrict dx, double * __restrict dy) +{ + int i; + + for (i = 0;i < n; i++) +dy[i] = dy[i] + da*dx[i]; +} diff --git a/gcc/tree-ssa-loop-prefetch.c b/gcc/tree-ssa-loop-prefetch.c index 056815d..43ee85a 100644 --- a/gcc/tree-ssa-loop-prefetch.c +++ b/gcc/tree-ssa-loop-prefetch.c @@ -1162,7 +1162,8 @@ issue_prefetch_ref (struct mem_ref *ref, unsigned unroll_factor, unsigned ahead) NULL, true, GSI_SAME_STMT); } - if (TREE_CODE (addr_base) == SSA_NAME + if (addr_base != addr + && TREE_CODE (addr_base) == SSA_NAME && TREE_CODE (addr) == SSA_NAME) { duplicate_ssa_name_ptr_info (addr, SSA_NAME_PTR_INFO (addr_base)); -- 1.9.1
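The effect of the added `addr_base != addr` test can be shown with a toy model. The sketch below does not use GCC's data structures: `toy_name`, `toy_ptr_info`, `toy_duplicate_ptr_info` and `toy_copy_for_prefetch` are hypothetical names, and the model assumes (based on the backtrace reported earlier in the thread) that the copy routine asserts that its destination does not yet carry points-to information.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for an SSA name and its points-to record.
   These are NOT the real GCC types.  */
struct toy_ptr_info { int align; };
struct toy_name
{
  int is_ssa_name;
  struct toy_ptr_info *ptr_info;
};

/* Models the assumed precondition: the destination must not already
   have points-to information.  STORAGE supplies memory for the copy
   so that the sketch needs no allocator.  */
void toy_duplicate_ptr_info (struct toy_name *dst,
                             const struct toy_ptr_info *src,
                             struct toy_ptr_info *storage)
{
  assert (dst->ptr_info == NULL);
  if (src == NULL)
    return;
  *storage = *src;
  dst->ptr_info = storage;
}

/* Mirrors the shape of the guarded block in the patch.  Returns 1 if
   the copy was attempted and 0 if it was skipped.  */
int toy_copy_for_prefetch (struct toy_name *addr_base,
                           struct toy_name *addr,
                           struct toy_ptr_info *storage)
{
  if (addr_base != addr
      && addr_base->is_ssa_name
      && addr->is_ssa_name)
    {
      toy_duplicate_ptr_info (addr, addr_base->ptr_info, storage);
      /* Not a plain copy, so alignment is reset to "unknown" (0).  */
      if (addr->ptr_info)
        addr->ptr_info->align = 0;
      return 1;
    }
  return 0;
}
```

When the prefetch distance is zero, the base and the prefetch address are the same name. Without the identity test, the model would ask for a copy onto a name that already owns its information and trip the assertion; with it, the block is skipped and the existing information is left untouched.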
Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
It looks like my original message was dropped, resending -- Doug

From: Doug Gilmore
Sent: Friday, September 30, 2016 6:35 PM
To: gcc-patches@gcc.gnu.org; Christophe Lyon
Subject: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

My commit r240439 didn't handle the situation where setting
--param prefetch-latency=0 can cause the prefetch address to be the
same as the original address.  In this case, no copying of points-to
information should be done.

Bootstrapped and regression tested on x86_64-linux, ok for trunk?

Doug
Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
It looks like email sent from my company account is being dropped, so I
am resending via my gmail account.

Doug

From: Doug Gilmore
Sent: Friday, September 30, 2016 6:35 PM
To: gcc-patches@gcc.gnu.org; Christophe Lyon
Subject: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439

My commit r240439 didn't handle the situation where setting
--param prefetch-latency=0 can cause the prefetch address to be the
same as the original address.  In this case, no copying of points-to
information should be done.

Bootstrapped and regression tested on x86_64-linux, ok for trunk?

Doug

Attachment: 0001-Fix-PR-tree-optimization-77808.patch
RE: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
Hi Christophe,

> From: Christophe Lyon [christophe.l...@linaro.org]
> Sent: Saturday, October 01, 2016 7:57 AM
> To: Doug Gilmore
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Fix PR tree-optimization/77808, ICE in
> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>
> Hi Doug,
>
> ...
> I can confirm that your patch fixes the ICE I was seeing.
>
> However, the new testcase does not pass on low end
> architectures:
> cc1: warning: -fprefetch-loop-arrays not supported for this target
> (try -march switches)
>
> Can you add a guard?
>
> Thanks,
>
> Christophe

I updated the test to only run on X86, MIPS and AARCH64.  Is that OK?

Thanks,

Doug

Attachment: 0001-Fix-PR-tree-optimization-77808.patch
RE: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>From: Christophe Lyon [christophe.l...@linaro.org] >Sent: Monday, October 03, 2016 12:05 AM >To: Doug Gilmore >Cc: gcc-patches@gcc.gnu.org >Subject: Re: Fix PR tree-optimization/77808, ICE in >duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439 > >On 2 October 2016 at 23:05, Doug Gilmore wrote: >> Hi Christophe, >> >>> From: Christophe Lyon [christophe.l...@linaro.org] >>> Sent: Saturday, October 01, 2016 7:57 AM >>> To: Doug Gilmore >>> Cc: gcc-patches@gcc.gnu.org >>> Subject: Re: Fix PR tree-optimization/77808, ICE in >>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439 >>> >>> Hi Doug, >>> >>> ... >>> I can confirm that your patch fixes the ICE I was seeing. >>> >>> However, the new testcase does not pass on low end >>> architectures: >>> cc1: warning: -fprefetch-loop-arrays not supported for this target >>> (try -march switches) >>> >>> Can you add a guard? >>> >>> Thanks, >>> >>> Christophe >> I updated the test to only run on X86, MIPS and AARCH64. Is that OK? >> > >I'm afraid not. > >The ICE occurred on some arm targets. By "low end" I meant armv5t for >example, as opposed to armv7t. >Is there a suitable effective target? I'll need to investigate that. BTW, gcc.dg/pr53550.c contains: /* PR tree-optimization/53550 */ /* { dg-do compile } */ /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */ int * foo (int *x) { int *a = x + 10, *b = x, *c = a; while (b != c) *--c = *b++; return x; } Is it also failing on armv5t? I suppose it would. Thanks, Doug > >Thanks, > >Christophe > >> Thanks, >> >> Doug
RE: Fix PR tree-optimization/77808, ICE in duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439
>From: Christophe Lyon [christophe.l...@linaro.org] >Sent: Monday, October 03, 2016 11:23 AM >To: Doug Gilmore >Cc: gcc-patches@gcc.gnu.org >Subject: Re: Fix PR tree-optimization/77808, ICE in >duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439 > >On 3 October 2016 at 18:07, Doug Gilmore wrote: >>>From: Christophe Lyon [christophe.l...@linaro.org] >>>Sent: Monday, October 03, 2016 12:05 AM >>>To: Doug Gilmore >>>Cc: gcc-patches@gcc.gnu.org >>>Subject: Re: Fix PR tree-optimization/77808, ICE in >>>duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439 >>> >>>On 2 October 2016 at 23:05, Doug Gilmore wrote: >>>> Hi Christophe, >>>> >>>>> From: Christophe Lyon [christophe.l...@linaro.org] >>>>> Sent: Saturday, October 01, 2016 7:57 AM >>>>> To: Doug Gilmore >>>>> Cc: gcc-patches@gcc.gnu.org >>>>> Subject: Re: Fix PR tree-optimization/77808, ICE in >>>>> duplicate_ssa_name_ptr_info, at tree-ssanames.c:630 starting with r240439 >>>>> >>>>> Hi Doug, >>>>> >>>>> ... >>>>> I can confirm that your patch fixes the ICE I was seeing. >>>>> >>>>> However, the new testcase does not pass on low end >>>>> architectures: >>>>> cc1: warning: -fprefetch-loop-arrays not supported for this target >>>>> (try -march switches) >>>>> >>>>> Can you add a guard? >>>>> >>>>> Thanks, >>>>> >>>>> Christophe >>>> I updated the test to only run on X86, MIPS and AARCH64. Is that OK? >>>> >>> >>>I'm afraid not. >>> >>>The ICE occurred on some arm targets. By "low end" I meant armv5t for >>>example, as opposed to armv7t. >>>Is there a suitable effective target? >> I'll need to investigate that. BTW, gcc.dg/pr53550.c contains: >> /* PR tree-optimization/53550 */ >> /* { dg-do compile } */ >> /* { dg-options "-O2 -fprefetch-loop-arrays -w" } */ >> >> int * >> foo (int *x) >> { >> int *a = x + 10, *b = x, *c = a; >> while (b != c) >> *--c = *b++; >> return x; >> } >> >> Is it also failing on armv5t? I suppose it would. 
>>
>It doesn't, but that's probably thanks to -w

Sounds like we don't need to add guards then; it is just a matter of
adding -w to the command line.  Does that work for you?

Thanks,

Doug
>
>Christophe
>
>> Thanks,
>>
>> Doug
>>>
>>>Thanks,
>>>
>>>Christophe
>>>
>>>> Thanks,
>>>>
>>>> Doug
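For reference, a sketch of what a test header with -w added could look like, modeled on gcc.dg/pr53550.c as quoted above.  The PR comment line is an assumption, and the loop body is borrowed from pr53550.c as a placeholder; it is not the actual PR77808 reproducer.

```c
/* PR tree-optimization/77808 */
/* { dg-do compile } */
/* -w inhibits all warnings.  Per the thread, that is probably why the
   "-fprefetch-loop-arrays not supported for this target" warning does
   not fail pr53550.c on armv5t, so no target guard would be needed.  */
/* { dg-options "-O2 -fprefetch-loop-arrays -w" } */

/* Placeholder body copied from gcc.dg/pr53550.c, not the real test.  */
int *
foo (int *x)
{
  int *a = x + 10, *b = x, *c = a;
  while (b != c)
    *--c = *b++;
  return x;
}
```

The point is only the dg-options line; whether -w is acceptable for the new test is the question still open in this thread.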
Re: [PATCH PR68030/PR69710][RFC]Introduce a simple local CSE interface and use it in vectorizer
I haven't seen any follow-ups to the discussion of Bin's patch for
PR68303 and PR69710.  Patch submission:
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02000.html

Discussion:
http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00761.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01551.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg00372.html
http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01550.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02162.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02155.html
http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02154.html

so I did some investigation to get a better understanding of the
issues involved.

On 07/13/2016 01:59 PM, Jeff Law wrote:
> On 05/25/2016 05:22 AM, Bin Cheng wrote:
>> Hi, As analyzed in PR68303 and PR69710, vectorizer generates
>> duplicated computations in loop's pre-header basic block when
>> creating base address for vector reference to the same memory object.
> Not a huge surprise.  Loop optimizations generally have a tendency
> to create and/or expose CSE opportunities.  Unrolling is a common
> culprit, there's certainly the possibility for header duplication,
> code motions and IV rewriting to also expose/create redundant code.
>
> ...
>
>> But, 1) It
>> doesn't fix all the problem on x86_64.  Root cause is computation for
>> base address of the first reference is somehow moved outside of
>> loop's pre-header, local CSE can't help in this case.
> That's a bid odd -- have you investigated why this is outside the loop header?
> ...

I didn't look at this issue per se, but I did try running DOM between
autovectorization and IVOPTs.  Just running DOM had little effect; what
was crucial was adding the change Bin mentioned in his original
message:

    Besides CSE issue, this patch also re-associates address
    expressions in vect_create_addr_base_for_vector_ref, specifically,
    it splits constant offset and adds it back near the expression
    root in IR.
    This is necessary because GCC only handles
    re-association for commutative operators in CSE.

I attached a patch for these changes only.  These are the important
modifications that address some of the IVOPTs-related issues exposed
by PR68303.  I found that adding the CSE change (or calling DOM between
autovectorization and IVOPTs) is not needed and, from what I have
seen, actually makes the code worse.

When only the modifications to vect_create_addr_base_for_vector_ref
are applied, additional simplifications are done when induction
variables are found (function find_induction_variables).  These
simplifications are indicated by the appearance of lines such as:

    Applying pattern match.pd:1056, generic-match.c:11865

in the IVOPTs dump file.  IVOPTs now transforms the code so that
constants appear in the computation of the effective addresses for
the memory OPs.  However, the code generated by IVOPTs still uses a
separate base register for each memory reference.  Later, DOM3
transforms the code to use just one base register, which is the form
the code needs to be in for the preliminary phase of IVOPTs, where
"IV uses" associated with memory OPs are placed into groups.  At the
time of this grouping, checks are done to ensure that for each member
of a group the constant offsets don't overflow the immediate fields in
actual machine instructions (for more on this, see * below).

Currently it appears that an IV is generated for each memory
reference.  Instead of generating a new IV for each memory reference,
we could try to detect that the value of the new IV is just a constant
offset from an existing IV and generate a new temp reflecting that.
I haven't worked through what needs to be done to implement that in
general, but the case in PR69710 (the saxpy example, where the same IV
should be used for a load and a store) is straightforward to
implement, since work has already been done during data dependence
analysis to detect this situation.
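To picture the PR69710 case, here is a small hand-written C sketch.  It is not compiler output and not taken from the PR; the function names are made up for illustration.  The first function is the usual saxpy source loop, where y[i] is loaded and stored at the same address.  The second shows, at source level, the shape one would like after IV selection: a single pointer serves both the load from and the store to y.

```c
#include <stddef.h>

/* Source form: y[i] is both loaded and stored at the same address.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}

/* Hand-written picture of the desired form (hypothetical name): one
   pointer, i.e. one IV, is used for the load and the store of y
   instead of a separate IV per memory reference.  */
void
saxpy_shared_iv (size_t n, float a, const float *x, float *y)
{
  const float *xend = x + n;
  for (; x != xend; x++, y++)
    *y = a * *x + *y;
}
```

Both functions compute the same result; the second only makes explicit that the load and the store can share an address computation.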
I attached a patch for PR69710 that was bootstrapped and tested on
X86_64 without errors.  It does appear to need more testing, since I
noticed that SPEC 2006 h264ref produces different results with the
patch applied, which I still need to investigate.

Doug

* Note that when IV uses are grouped, only constraints on positive
constant offsets are considered.  That negative offsets can be used is
reflected in the cost of using a different IV than the IV associated
with a particular group.  Thus, once the optimal IV set is found, a
different IV may be chosen, which causes negative constant offsets to
be used.

>From 3ea70edb3bf68057c955d2b22204f17bb670f65a Mon Sep 17 00:00:00 2001
From: Doug Gilmore
Date: Fri, 4 Nov 2016 18:49:58 -0700
Subject: [PATCH] [PR68030/PR69710] vect_create_addr_base_for_vector_ref changes only fix.

This patch include changes noted in Bin's first patch message:
https://gcc.gnu.org/ml/gcc-patches/2016-05/msg02000.html
Re: [PATCH PR68030/PR69710][RFC]Introduce a simple local CSE interface and use it in vectorizer
On 11/22/2016 08:07 AM, Bin.Cheng wrote: > On Mon, Nov 21, 2016 at 9:34 PM, Doug Gilmore wrote: >> I haven't seen any followups to this discussion of Bin's patch to >> PR68303 and PR69710, the patch submission: >> http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02000.html >> >> Discussion: >> http://gcc.gnu.org/ml/gcc-patches/2016-07/msg00761.html >> http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01551.html >> http://gcc.gnu.org/ml/gcc-patches/2016-06/msg00372.html >> http://gcc.gnu.org/ml/gcc-patches/2016-06/msg01550.html >> http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02162.html >> http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02155.html >> http://gcc.gnu.org/ml/gcc-patches/2016-05/msg02154.html >> >> >> so I did some investigation to get a better understanding of the >> issues involved. > Hi Doug, > Thanks for looking into this problem. >> >> On 07/13/2016 01:59 PM, Jeff Law wrote: >>> On 05/25/2016 05:22 AM, Bin Cheng wrote: >>>> Hi, As analyzed in PR68303 and PR69710, vectorizer generates >>>> duplicated computations in loop's pre-header basic block when >>>> creating base address for vector reference to the same memory object. >>> Not a huge surprise. Loop optimizations generally have a tendency >>> to create and/or expose CSE opportunities. Unrolling is a common >>> culprit, there's certainly the possibility for header duplication, >>> code motions and IV rewriting to also expose/create redundant code. >>> >>> ... >>> >>> But, 1) It >>>> doesn't fix all the problem on x86_64. Root cause is computation for >>>> base address of the first reference is somehow moved outside of >>>> loop's pre-header, local CSE can't help in this case. >>> That's a bid odd -- have you investigated why this is outside the loop >>> header? >>> ... >> I didn't look at this issue per se, but I did try running DOM between >> autovectorization and IVS. 
Just running DOM had little effect, what
>> was crucial was adding the change Bin mentioned in his original
>> message:
>>
>>     Besides CSE issue, this patch also re-associates address
>>     expressions in vect_create_addr_base_for_vector_ref, specifically,
>>     it splits constant offset and adds it back near the expression
>>     root in IR.  This is necessary because GCC only handles
>>     re-association for commutative operators in CSE.
>>
>> I attached a patch for these changes only.  These are the important
>> modifications that address the some of the IVS related issues exposed
>> by PR68303.  I found that adding the CSE change (or calling DOM between
>> autovectorization and IVOPTS) is not needed, and from what I have
> I checked the code again.  As you said, the re-association part is
> important to enable CSE opportunities, no matter when and which pass
> handles it.  After re-association, the computations of the base
> addresses look like:
>
>   //preheader
>   b_1 = g_Input + var_offset_1;
>   vectp_1 = b_1 + cst_offset_1;
>   b_2 = g_Input + var_offset_2;
>   vectp_2 = b_2 + cst_offset_2;
>   ...
>   b_n = g_Input + var_offset_n;
>   vectp_n = b_n + cst_offset_n;
>
>   //loop
>   MEM[vectp_1];
>   MEM[vectp_2];
>   ...
>   MEM[vectp_n];
>
> In fact, var_offset_1, var_offset_2, ..., var_offset_n are equal to
> each other.  So the addresses are in the form of
> "g_Input + var_offset + cst_offset_x", differing from each other
> w.r.t. the constant offset.  The purpose of CSE is to propagate all
> parts of this address to IVOPTs, otherwise IVOPTs only knows IVs as below:
>
>   iv_use_1: {b_1 + cst_offset_1, step}_loop
>   iv_use_2: {b_2 + cst_offset_2, step}_loop
>   ...
>   iv_use_n: {b_n + cst_offset_n, step}_loop
>
>> seen, actually makes the code worse.
>>
>> Applying only the modifications to
>> vect_create_addr_base_for_vector_ref, additional simplifications will
>> be done when induction variables are found (function
>> find_induction_variables).
These simplications are indicated by the >> appearance of lines: >> >> Applying pattern match.pd:1056, generic-match.c:11865 > This doesn't look related to this problem to me. The simplification of this > problem is CSE, it's not what match.pd does. > >> >> in the IVOPS dump file. Now IVOPTs transforms the code so that >> constants now appear in the computation of the effective addresses for >> the memory OPs. However the code generated by IVOPTS still uses a >> sepa
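To make the address shapes in Bin's preheader pseudocode concrete, here is a hand-written C sketch.  It is not vectorizer or IVOPTs output; the function names and the three-reference loop are made up for illustration.  The first function keeps one base pointer per reference, as in the b_x / vectp_x form.  The second uses the "g_Input + var_offset + cst_offset_x" form, with one common base and the constant offsets left in the individual accesses, which is the shape that lets the references be grouped.

```c
#include <stddef.h>

/* One base pointer per reference, as in the preheader pseudocode
   (here cst_offset_x is 0, 1 and 2 elements).  */
float
sum3_separate (const float *g_input, size_t var_offset, size_t n)
{
  const float *vectp_1 = g_input + var_offset + 0;
  const float *vectp_2 = g_input + var_offset + 1;
  const float *vectp_3 = g_input + var_offset + 2;
  float s = 0.0f;
  for (size_t i = 0; i < n; i++)
    s += vectp_1[i] + vectp_2[i] + vectp_3[i];
  return s;
}

/* Common base computed once; only the constant offsets differ, so they
   can stay in each access (e.g. as immediate displacements).  */
float
sum3_shared (const float *g_input, size_t var_offset, size_t n)
{
  const float *base = g_input + var_offset;
  float s = 0.0f;
  for (size_t i = 0; i < n; i++)
    s += base[i] + base[i + 1] + base[i + 2];
  return s;
}
```

Both functions read the same elements in the same order; the difference is only in how many base addresses have to be kept live across the loop.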
[patch] update for test ftrapv-1.c
With r213117 we are seeing additional failures while testing a
bare-iron build:

FAIL: gcc.dg/torture/ftrapv-1.c -O0 (test for excess errors)
...

This newly added test calls fork.  It has a guard to skip targets that
don't support fork, but the guard needs to be tweaked:

diff --git a/gcc/testsuite/gcc.dg/torture/ftrapv-1.c b/gcc/testsuite/gcc.dg/torture/ftrapv-1.c
index 4fdccd8..4fee1e1 100644
--- a/gcc/testsuite/gcc.dg/torture/ftrapv-1.c
+++ b/gcc/testsuite/gcc.dg/torture/ftrapv-1.c
@@ -1,7 +1,7 @@
 /* { dg-do run } */
 /* { dg-additional-options "-ftrapv" } */
 /* { dg-require-effective-target trapping } */
-/* { dg-require-fork } */
+/* { dg-require-fork unused } */

 #include
 #include
--- end of patch

OK to commit?

Thanks,

Doug
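For readers wondering why the test needs fork at all, here is a minimal sketch of the general pattern such a test can use.  It is not the contents of ftrapv-1.c; the helper and the two stand-in bodies are hypothetical.  The idea is to run the possibly trapping code in a child process and have the parent check that the child was killed by a signal.  With -ftrapv the trap would come from the overflow check; the stand-in below calls abort () directly so the sketch does not depend on compiler flags.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run FN in a forked child.  Return 1 if the child was killed by a
   signal (e.g. SIGABRT), 0 if it exited normally, and -1 if fork or
   waitpid failed.  Hypothetical helper, for illustration only.  */
int
child_killed_by_signal (void (*fn) (void))
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      fn ();
      _exit (0);
    }
  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFSIGNALED (status) ? 1 : 0;
}

/* Stand-ins for the code under test: one that traps, one that does not.  */
void
trapping_body (void)
{
  abort ();
}

void
clean_body (void)
{
}
```

On a target without fork this cannot even link, which is why a dg-require-fork guard that actually takes effect matters for bare-iron testing.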