[PATCH] Fix typos in graphite testcases

2017-10-05 Thread Richard Biener
Causing some UNRESOLVED.

Committed.

Richard.


p3-3
Description: Binary data


Re: [committed] Propagate attributes, including optimization and target node, to OMP outlined regions (PR tree-optimization/82374)

2017-10-05 Thread Rainer Orth
Hi Jakub,

> This patch propagates attributes, including opt and target nodes, from
> the containing function to the OMP/OACC outlined region functions.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
>
> 2017-10-04  Jakub Jelinek  
>
>   PR tree-optimization/82374
>   * omp-low.c (create_omp_child_function): Copy DECL_ATTRIBUTES,
>   DECL_FUNCTION_SPECIFIC_OPTIMIZATION,
>   DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_VERSIONED from
>   current_function_decl to the new decl.
>
>   * gcc.dg/gomp/pr82374.c: New test.

the new tests FAILs on 32-bit targets (seen on 32-bit Solaris and
Linux/x86_64 -m32):

FAIL: gcc.dg/gomp/pr82374.c (test for excess errors)
UNRESOLVED: gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 
loops" 2

Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:9:7: error: 
size of array 'a' is too large
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:10:7: error: 
size of array 'b' is too large
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:11:7: error: 
size of array 'c' is too large
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:12:7: error: 
size of array 'd' is too large

gcc.dg/gomp/pr82374.c: dump file does not exist

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [committed] Propagate attributes, including optimization and target node, to OMP outlined regions (PR tree-optimization/82374)

2017-10-05 Thread Jakub Jelinek
On Thu, Oct 05, 2017 at 09:59:35AM +0200, Rainer Orth wrote:
> Hi Jakub,
> 
> > 2017-10-04  Jakub Jelinek  
> >
> > PR tree-optimization/82374
> > * omp-low.c (create_omp_child_function): Copy DECL_ATTRIBUTES,
> > DECL_FUNCTION_SPECIFIC_OPTIMIZATION,
> > DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_VERSIONED from
> > current_function_decl to the new decl.
> >
> > * gcc.dg/gomp/pr82374.c: New test.
> 
> the new tests FAILs on 32-bit targets (seen on 32-bit Solaris and
> Linux/x86_64 -m32):
> 
> FAIL: gcc.dg/gomp/pr82374.c (test for excess errors)
> UNRESOLVED: gcc.dg/gomp/pr82374.c scan-tree-dump-times vect "vectorized 1 
> loops" 2
> 
> Excess errors:
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:9:7: error: 
> size of array 'a' is too large
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:10:7: error: 
> size of array 'b' is too large
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:11:7: error: 
> size of array 'c' is too large
> /vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/gomp/pr82374.c:12:7: error: 
> size of array 'd' is too large
> 
> gcc.dg/gomp/pr82374.c: dump file does not exist

Oops, sorry, somehow I've missed this and also the goacc regressions when
testing the above patch; I've noticed this FAIL last night, just didn't get
to that until this morning, fixed thusly, committed as obvious:

2017-10-05  Jakub Jelinek  

* gcc.dg/gomp/pr82374.c (SIZE): Change from 1G to 1M to make it ilp32
friendly.

--- gcc/testsuite/gcc.dg/gomp/pr82374.c.jj  2017-10-04 09:50:02.0 
+0200
+++ gcc/testsuite/gcc.dg/gomp/pr82374.c 2017-10-05 10:06:29.630860238 +0200
@@ -4,7 +4,7 @@
 /* { dg-additional-options "-mavx -mno-avx2" { target i?86-*-* x86_64-*-* } } 
*/
 /* { dg-additional-options "-mvsx" { target powerpc_vsx_ok } } */
 
-#define SIZE (1024 * 1024 * 1024)
+#define SIZE (1024 * 1024)
 
 float a[SIZE];
 float b[SIZE];


Jakub


[openacc, testsuite, committed] Fix libgomp.oacc-c-c++-common/{loop-red-g-1,routine-g-1}.c for non-nvidia devices

2017-10-05 Thread Tom de Vries
[ was : Re: [openacc, testsuite, committed] Fix 
libgomp.oacc-c-c++-common/loop-g-{1,2}.c for non-nvidia devices ]


On 09/28/2017 08:47 AM, Tom de Vries wrote:

Hi,

this patch makes the test-cases libgomp.oacc-c-c++-common/loop-g-{1,2}.c 
  work for non-nvidia devices.


For nvidia devices, a vector_length of 32 is required for the test to pass.

For devices with a non-32 forced vector_length, this test-case will fail 
the test for excess errors due to:

...
warning: using vector_length (x), ignoring 32
...

Fixed by removing the explicit vector_length setting. For nvidia 
devices, 32 is required, but that's also the forced default, so there's 
no need to be explicit about it.


Committed as obvious.


Committed this similar patch.

Thanks,
- Tom
Fix libgomp.oacc-c-c++-common/{loop-red-g-1,routine-g-1}.c for non-nvidia devices

2017-10-05  Tom de Vries  

	* testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c (main): Remove
	vector_length(32) clause from acc parallel directive.
	* testsuite/libgomp.oacc-c-c++-common/routine-g-1.c (main): Same.

---
 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c | 2 +-
 libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
index d241d41..929e01c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-red-g-1.c
@@ -11,7 +11,7 @@ int main ()
   int ondev = 0;
   int t = 0, h = 0;
   
-#pragma acc parallel num_gangs(32) vector_length(32) copy(ondev)
+#pragma acc parallel num_gangs(32) copy(ondev)
   {
 #pragma acc loop gang  reduction (+:t)
 for (unsigned ix = 0; ix < N; ix++)
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
index 9d14c3b..b6ab713 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/routine-g-1.c
@@ -36,7 +36,7 @@ int main ()
   for (ix = 0; ix < N;ix++)
 ary[ix] = -1;
   
-#pragma acc parallel num_gangs(32) vector_length(32) copy(ary) copy(ondev)
+#pragma acc parallel num_gangs(32) copy(ary) copy(ondev)
   {
 ondev = __builtin_acc_on_device (5);
 gang (ary);


Re: Transform (x >> cst) != 0 to x >= (1 << cst) and (x >> cst) == 0 to x < (1 << cst)

2017-10-05 Thread Richard Biener
On Tue, 3 Oct 2017, Jeff Law wrote:

> On 10/03/2017 03:00 PM, Marc Glisse wrote:
> > On Tue, 3 Oct 2017, Jakub Jelinek wrote:
> > 
> >> On Tue, Oct 03, 2017 at 12:54:39PM -0700, Prathamesh Kulkarni wrote:
> >>> Hi,
> >>> This follow-up patch implements the patterns mentioned in $subject.
> >>> Bootstrap+test in progress on x86_64-unknown-linux-gnu and
> >>> aarch64-linux-gnu.
> >>> OK to commit if passes ?
> >>>
> >>> Thanks,
> >>> Prathamesh
> >>
> >>> 2017-10-03  Prathamesh Kulkarni  
> >>>
> >>> * match.pd ((X >> CST) == 0 -> X < (1 << CST)): New pattern.
> >>> ((X >> CST) != 0 -> X >= (1 << CST)): Likewise.
> > 
> > build_int_cstu doesn't work for vectors, you want build_one_cst. I never
> > know if we should check single_use or not :-(
> > 
> >>> testsuite/
> >>> * gcc.dg/tree-ssa/cmpdiv.c: Add test-cases f3 and f4.
> >>
> >> Why this way and not the other way around?
> > 
> > For high level gimple optimizations, X < CST is more convenient (and
> > smaller, just one insn) than (X >> CST) == 0.
> Right.  One could easily argue that Prathamesh's form should be the
> preferred form because it is simpler at the gimple level -- and that
> x86-isms should be dealt with later in the pipeline.
> 
> 
> > 
> >> E.g. i?86/x86_64 and various other targets have shift instructions which
> >> set condition codes, so (X >> 51) == 0 is certainly smaller
> >> smaller and I believe cheaper than the latter.
> >> Try:
> >> void foo (void);
> >>
> >> void
> >> bar (unsigned long long x)
> >> {
> >>  if ((x >> 51) == 0)
> >>    foo ();
> >> }
> >>
> >> void
> >> baz (unsigned long long x)
> >> {
> >>  if (x < (1LL << 51))
> >>    foo ();
> >> }
> >> with -Os on x86_64, the first function is 4 insns, 12 bytes,
> >> the second one 5 insns, 21 bytes.
> >>
> >> I wonder if this kind of instruction selection stuff shouldn't be
> >> done in target.pd instead, with input from the target.
> Right, but I think that argues that Prathamesh's patch is the right
> direction and that to move forward what needs to happen is something
> needs to be fixed at the gimple/rtl border to ensure we get good x86 code.

For this case it should be even "easy" within the current RTL expansion
framework since all we need is to look at a single comparison and its
operands.

So I'd suggest to go forward with the canonicalization as proposed
and address the expansion issue as followup.  It's currently a missed
optimization for baz anyway.

Richard.

> > 
> > At a late stage, maybe during an RTL pass or expansion (or just before
> > expansion) it would indeed be good to generate a shift for such
> > comparisons, on targets where that sets a cc. The lack of this
> > transformation could be considered a blocker for the other one, to avoid
> > regressing on bar.
> Right.  In fact, I think Jakub's test ought to be added to this work as
> part of its basic testing.
> 
> jeff
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] simplify-rtx: Remove non-simplifying simplification (PR77729)

2017-10-05 Thread Richard Biener
On Tue, Oct 3, 2017 at 12:40 AM, Jeff Law  wrote:
> On 10/02/2017 01:35 PM, Segher Boessenkool wrote:
>> If we have (X&C1)|C2 simplify_binary_operation_1 makes C1 as small as
>> possible.  This makes worse code in common cases like when the AND with
>> C1 is from a zero-extension.  This patch fixes it by removing this
>> transformation (twice).
>>
>> I tested this on 31 targets, also some variations that do the
>> transformation only for some conditions (like, do not do it if the
>> resulting constant looks "worse", e.g. has more stretches of ones).
>> 22 of those targets show no difference; 8 are best with this patch
>> variant (never do the transformation); and 64-bit hppa is not best,
>> but the difference is only four insns in a million.
>>
>> Bootstrapped and tested on powerpc64-linux {-m32,-m64}.  Is this okay
>> for trunk?  And backports?
>>
>>
>> Segher
>>
>>
>> 2017-10-02  Segher Boessenkool  
>>
>>   PR rtl-optimization/77729
>>   * simplify-rtx.c (simplify_binary_operation_1): Delete the (X&C1)|C2
>>   to (X&(C1&~C2))|C2 transformations.
> OK for the trunk.  I'm not sure if the BZ in question qualifies this
> patch for backporting though.  Release managers have the final say on
> the backport question.

We usually avoid doing "optimization regression" backports unless very very
important and obvious.

Richard.

> And FWIW, the PA being an outlier on an optimization question should
> never IMHO be a blocker :-)
>
> jeff


Re: Allow non-wi wi

2017-10-05 Thread Richard Biener
On Tue, Oct 3, 2017 at 8:34 PM, Richard Sandiford
 wrote:
> This patch uses global rather than member operators for wide-int.h,
> so that the first operand can be a non-wide-int type.

Not sure why we had the in-class ones.  If we had some good arguments
they'd still stand.  Do you remember?

> The patch also removes the and_not and or_not member functions.
> It was already inconsistent to have member functions for these
> two operations (one of which was never used) and not other wi::
> ones like udiv.  After the operator change, we'd have the additional
> inconsistency that "non-wi & wi" would work but "non-wi.and_not (wi)"
> wouldn't.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
> Also tested by comparing the testsuite assembly output on at least one
> target per CPU directory.  OK to install?

Ok.

Thanks,
Richard.

> Richard
>
>
> 2017-10-03  Richard Sandiford  
>
> gcc/
> * wide-int.h (WI_BINARY_OPERATOR_RESULT): New macro.
> (WI_BINARY_PREDICATE_RESULT): Likewise.
> (wi::binary_traits::operator_result): New type.
> (wi::binary_traits::predicate_result): Likewise.
> (generic_wide_int::operator~, unary generic_wide_int::operator-)
> (generic_wide_int::operator==, generic_wide_int::operator!=)
> (generic_wide_int::operator&, generic_wide_int::and_not)
> (generic_wide_int::operator|, generic_wide_int::or_not)
> (generic_wide_int::operator^, generic_wide_int::operator+
> (binary generic_wide_int::operator-, generic_wide_int::operator*):
> Delete.
> (operator~, unary operator-, operator==, operator!=, operator&)
> (operator|, operator^, operator+, binary operator-, operator*): New
> functions.
> * expr.c (get_inner_reference): Use wi::bit_and_not.
> * fold-const.c (fold_binary_loc): Likewise.
> * ipa-prop.c (ipa_compute_jump_functions_for_edge): Likewise.
> * tree-ssa-ccp.c (get_value_from_alignment): Likewise.
> (bit_value_binop): Likewise.
> * tree-ssa-math-opts.c (find_bswap_or_nop_load): Likewise.
> * tree-vrp.c (zero_nonzero_bits_from_vr): Likewise.
> (extract_range_from_binary_expr_1): Likewise.
> (masked_increment): Likewise.
> (simplify_bit_ops_using_ranges): Likewise.
>
> Index: gcc/wide-int.h
> ===
> --- gcc/wide-int.h  2017-09-11 17:10:58.656085547 +0100
> +++ gcc/wide-int.h  2017-10-03 19:32:39.077055063 +0100
> @@ -262,11 +262,22 @@ #define OFFSET_INT_ELTS (ADDR_MAX_PRECIS
>  #define WI_BINARY_RESULT(T1, T2) \
>typename wi::binary_traits ::result_type
>
> +/* Likewise for binary operators, which excludes the case in which neither
> +   T1 nor T2 is a wide-int-based type.  */
> +#define WI_BINARY_OPERATOR_RESULT(T1, T2) \
> +  typename wi::binary_traits ::operator_result
> +
>  /* The type of result produced by T1 << T2.  Leads to substitution failure
> if the operation isn't supported.  Defined purely for brevity.  */
>  #define WI_SIGNED_SHIFT_RESULT(T1, T2) \
>typename wi::binary_traits ::signed_shift_result_type
>
> +/* The type of result produced by a sign-agnostic binary predicate on
> +   types T1 and T2.  This is bool if wide-int operations make sense for
> +   T1 and T2 and leads to substitution failure otherwise.  */
> +#define WI_BINARY_PREDICATE_RESULT(T1, T2) \
> +  typename wi::binary_traits ::predicate_result
> +
>  /* The type of result produced by a signed binary predicate on types T1 and 
> T2.
> This is bool if signed comparisons make sense for T1 and T2 and leads to
> substitution failure otherwise.  */
> @@ -382,12 +393,15 @@ #define WIDE_INT_REF_FOR(T) \
>struct binary_traits 
>{
>  typedef widest_int result_type;
> +/* Don't define operators for this combination.  */
>};
>
>template 
>struct binary_traits 
>{
>  typedef wide_int result_type;
> +typedef result_type operator_result;
> +typedef bool predicate_result;
>};
>
>template 
> @@ -397,6 +411,8 @@ #define WIDE_INT_REF_FOR(T) \
> so as not to confuse gengtype.  */
>  typedef generic_wide_int < fixed_wide_int_storage
>::precision> > result_type;
> +typedef result_type operator_result;
> +typedef bool predicate_result;
>  typedef bool signed_predicate_result;
>};
>
> @@ -404,6 +420,8 @@ #define WIDE_INT_REF_FOR(T) \
>struct binary_traits 
>{
>  typedef wide_int result_type;
> +typedef result_type operator_result;
> +typedef bool predicate_result;
>};
>
>template 
> @@ -413,6 +431,8 @@ #define WIDE_INT_REF_FOR(T) \
> so as not to confuse gengtype.  */
>  typedef generic_wide_int < fixed_wide_int_storage
>::precision> > result_type;
> +typedef result_type operator_result;
> +typedef bool predicate_result;
>  typedef re

Re: [patch] Fix wrong code with small structure return on PowerPC

2017-10-05 Thread Richard Biener
On Tue, Oct 3, 2017 at 8:39 PM, Eric Botcazou  wrote:
>> Reading the patch I think that it gets conservativeness wrong -- shouldn't
>> it be
>>
>>   if (is_definitely_initialized)
>>{
>>   SUBREG_PROMOTED_VAR_P (temp) = 1;
>>   SUBREG_PROMOTED_SET (temp, unsignedp);
>>}
>>
>> ?  Of course it's not easy to compute is_definitely_initialized
>> conservatively in an ad-hoc way at this place.  It should be relatively
>> straight-forward to do a conservative computation somewhere in cfgexpand.c
>> by propagating across SSA edges and recording a flag on SSA names though.
>> I assume we can take load destinations as fully initialized (should extend
>> properly) as well as call results (the ABI should extend, eventually we can
>> query the target if it does), likewise for function arguments.
>
> Yes, that's why the comment read "Try to detect if " and not "Detect if ".
>
>> On your patch:
>>
>> + /* Try to detect if the register contains uninitialized bits.
>> */ + if (SSA_NAME_IS_DEFAULT_DEF (ssa_name))
>> +   maybe_uninitialized = true;
>>
>> if you use ssa_undefined_value_p (ssa_name[, true]) you'd get function
>> paramters not undefined (which is probably desired?).  Likewise
>> partial initialized complex would get uninitialized (if passed , true).
>
> Ah, yes, I overlooked that.
>
>> Same inside the loop over PHI args though I wonder how pessimizing
>> it would be to simply do
>>
>>   if (ssa_undefined_value_p (ssa_name, true)
>>
>>   || gimple_code (SSA_NAME_DEF_STMT (ssa_name)) == GIMPLE_PHI)
>>
>> maybe_uninitialized = true;
>>
>> thus make all PHIs possibly uninitialized (your code wouldn't catch a
>> chained PHI with undef arg).
>
> Too big a hammer for such a very rare bug I think.
>
>> As said, a better solution would be to do a definitely-initialized
>> mini-propagation at RTL expansion time.
>
> I'm not sure if we really need to propagate.  What about the attached patch?
> It computes at expansion time whether the partition the SSA name is a member
> of contains an undefined value and, if so, doesn't set the promoted bit for
> the SUBREG.  My gut feeling is that it's sufficient in practice.

I'll take your word for it ;)

The patch is ok.

Thanks,
Richard.

>
> * tree-outof-ssa.h (ssaexpand): Add partitions_for_undefined_values.
> (always_initialized_rtx_for_ssa_name_p): New predicate.
> * tree-outof-ssa.c (remove_ssa_form): Initialize new field of SA.
> (finish_out_of_ssa): Free new field of SA.
> * tree-ssa-coalesce.h (get_undefined_value_partitions): Declare.
> * tree-ssa-coalesce.c: Include tree-ssa.h.
> (get_parm_default_def_partitions): Remove extern keyword.
> (get_undefined_value_partitions): New function.
> * expr.c (expand_expr_real_1) : For a SSA_NAME, do
> not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain
> uninitialized bits.
>
> --
> Eric Botcazou


Re: [PATCH] Improve alloca alignment

2017-10-05 Thread Richard Biener
 On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law  wrote:
> On 10/04/2017 08:53 AM, Eric Botcazou wrote:
>>> This seems like a SPARC target problem to me -- essentially it's
>>> claiming a higher STACK_BOUNDARY than it really has.
>>
>> No, it is not, I can guarantee you that the stack pointer is always aligned 
>> to
>> 64-bit boundaries on SPARC, otherwise all hell would break loose...
> Then something is inconsistent somehwere.  Either the stack is aligned
> prior to that code or it is not.  If it is aligned, then Wilco's patch
> ought to keep it aligned.  If is not properly aligned, then well, that's
> the problem ISTM.
>
> Am I missing something here?

What I got from the discussion and the PR is that the stack hardregister
is properly aligned but what GCC maps to it (virtual or frame or whatever)
might not be at all points.

allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not
sure STACK_BOUNDARY applies to it?

Not that I know anything about this here ;)

Richard.

> jeff


Re: [C++ PATCH] Move mangling alias out of ::

2017-10-05 Thread Bernhard Reutner-Fischer
On Wed, Oct 04, 2017 at 12:51:18PM -0400, Nathan Sidwell wrote:

> Applying to trunk.

+void
+record_mangling (tree decl, bool need_warning)
+{
+  if (!mangled_decls)
+mangled_decls = hash_map::create_ggc (499);
+
+  gcc_checking_assert (DECL_ASSEMBLER_NAME_SET_P (decl));
+  tree id = DECL_ASSEMBLER_NAME (decl);
+  bool existed;
+  tree *slot = &mangled_decls->get_or_insert (id, &existed);
+
+  /* If this is already an alias, remove the alias, because the real
+ decl takes presidence.  */

s/presidence/precedence/ ?

+  if (!existed)
+;
+  else if (DECL_ARTIFICIAL (*slot) && DECL_IGNORED_P (*slot))
+if (symtab_node *n = symtab_node::get (*slot))
+  if (n->cpp_implicit_alias)
+   {
+ n->remove ();
+ existed = false;
+   }

Wouldn't simply
  symtab_node *n;
  if (existed
  && DECL_ARTIFICIAL (*slot)
  && DECL_IGNORED_P (*slot)
  && (n = symtab_node::get (*slot))
  && n->cpp_implicit_alias)
{
  n->remove ();
  existed = false;
}
be more in line with the coding style?

thanks,
+
+  if (!existed)
+*slot = decl;
+  else if (need_warning)
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+   "mangling of %q#D as %qE conflicts with a previous mangle",
+   decl, id);
+  inform (DECL_SOURCE_LOCATION (*slot),
+ "previous mangling %q#D", *slot);
+  inform (DECL_SOURCE_LOCATION (decl),
+ "a later -fabi-version= (or =0)"
+ " avoids this error with a change in mangling");
+  *slot = decl;
+}
+}


[PATCH][ARM][GCC] Add a self test for FPU feature bits

2017-10-05 Thread Tamar Christina
Hi All,

This patch adds a more elaborate self-test to the arm back-end
which tests to make sure you don't have any bit specified in
ISA_ALL_FPU_INTERNAL that are not defined by any fpu configuration.

Catching this during the self-test prevents an ICE at runtime when
-mfpu=auto tries to detect the fpu to use.

Regtested on arm-none-eabi and no issues.

Ok for trunk?

gcc/
2017-10-05  Tamar Christina  

* config/arm/arm.c (arm_test_fpu_data): New.
(arm_run_selftests): Call arm_test_fpu_data.

-- 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 4cddf3b02e5e731507c2d326b80747524fd505bf..ae7e76f8b0c38f74c8d6479ff0a209df8b97d5b4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -31393,10 +31393,43 @@ arm_test_cpu_arch_data (void)
 }
 }
 
+/* Scan the static data tables generated by parsecpu.awk looking for
+   potential issues with the data.  Here we check for consistency between the
+   fpu bits, in particular we check that ALL_FPU does not contain a feature bit
+   that is not defined by any FPU flag.  */
+static void
+arm_test_fpu_data (void)
+{
+  auto_sbitmap isa_all_fpubits (isa_num_bits);
+  auto_sbitmap fpubits (isa_num_bits);
+  auto_sbitmap tmpset (isa_num_bits);
+
+  static const enum isa_feature fpu_bitlist[]
+= { ISA_ALL_FPU_INTERNAL, isa_nobit };
+  arm_initialize_isa (isa_all_fpubits, fpu_bitlist);
+
+  for (unsigned int i = 0; i < TARGET_FPU_auto; i++)
+  {
+arm_initialize_isa (fpubits, all_fpus[i].isa_bits);
+bitmap_and_compl (tmpset, isa_all_fpubits, fpubits);
+bitmap_clear (isa_all_fpubits);
+bitmap_copy (isa_all_fpubits, tmpset);
+  }
+
+  if (!bitmap_empty_p (isa_all_fpubits))
+{
+	fprintf (stderr, "Error: found feature bits in the ALL_FPU_INTERAL"
+			 " group that are not defined by any FPU.\n"
+			 "   Check your arm-cpus.in.\n");
+	ASSERT_TRUE (bitmap_empty_p (isa_all_fpubits));
+}
+}
+
 static void
 arm_run_selftests (void)
 {
   arm_test_cpu_arch_data ();
+  arm_test_fpu_data ();
 }
 } /* Namespace selftest.  */
 



Re: [C++ PATCH] Move mangling alias out of ::

2017-10-05 Thread Nathan Sidwell

On 10/05/2017 05:28 AM, Bernhard Reutner-Fischer wrote:

On Wed, Oct 04, 2017 at 12:51:18PM -0400, Nathan Sidwell wrote:



+  /* If this is already an alias, remove the alias, because the real
+ decl takes presidence.  */

s/presidence/precedence/ ?


thanks,


+  if (!existed)
+;
+  else if (DECL_ARTIFICIAL (*slot) && DECL_IGNORED_P (*slot))



Wouldn't simply

...

be more in line with the coding style?


Moving the 'existed' test, but I dislike assignments in the middle of 
conditionals.  I think an earlier version had a non-empty first block, 
but I failed to clean it up.


Fixing thusly.

nathan

--
Nathan Sidwell
2017-10-05  Nathan Sidwell  

	* decl2.c (record_mangling): Fix spello and formatting from
	previous patch.

Index: decl2.c
===
--- decl2.c	(revision 253421)
+++ decl2.c	(working copy)
@@ -4377,10 +4377,8 @@ record_mangling (tree decl, bool need_wa
   tree *slot = &mangled_decls->get_or_insert (id, &existed);
 
   /* If this is already an alias, remove the alias, because the real
- decl takes presidence.  */
-  if (!existed)
-;
-  else if (DECL_ARTIFICIAL (*slot) && DECL_IGNORED_P (*slot))
+ decl takes precedence.  */
+  if (existed && DECL_ARTIFICIAL (*slot) && DECL_IGNORED_P (*slot))
 if (symtab_node *n = symtab_node::get (*slot))
   if (n->cpp_implicit_alias)
 	{


Re: [PATCH][ARM][GCC] Add a self test for FPU feature bits

2017-10-05 Thread Richard Earnshaw (lists)
On 05/10/17 10:59, Tamar Christina wrote:
> Hi All,
> 
> This patch adds a more elaborate self-test to the arm back-end
> which tests to make sure you don't have any bit specified in
> ISA_ALL_FPU_INTERNAL that are not defined by any fpu configuration.
> 
> Catching this during the self-test prevents an ICE at runtime when
> -mfpu=auto tries to detect the fpu to use.
> 
> Regtested on arm-none-eabi and no issues.
> 
> Ok for trunk?
> 
> gcc/
> 2017-10-05  Tamar Christina  
> 
>   * config/arm/arm.c (arm_test_fpu_data): New.
>   (arm_run_selftests): Call arm_test_fpu_data.
> 

Can you fix the comments that refers to ALL_FPU to be ALL_FPU_INTERNAL.

OK with that change.

R.


RE: 0001-Part-1.-Add-generic-part-for-Intel-CET-enabling

2017-10-05 Thread Tsimbalist, Igor V
I would like to implement the patch in a bit different way depending on answers 
I will get for
my following proposals:

- I propose to make a type with 'nocf_check' attribute to be different from 
type w/o the attribute.
   The reason is that the type with 'nocf_check' attribute implies different 
code generation. It will be
   done by setting affects_type_identity field to true for the attribute. That 
in turn will trigger
   needed or expected type checking;

- I propose to ignore the 'nocf_check' attribute if 'fcf-protection' option is 
not specified and output
   the warning about this. If there is no instrumentation the type with 
attribute should not be treated
   differently from type w/o the attribute (see previous item) and should not 
be recorded into the
   type.

If it's ok, please ignore my previous patch (version#3) and I will post the 
updated patch shortly.

Igor


> -Original Message-
> From: Tsimbalist, Igor V
> Sent: Friday, September 29, 2017 6:04 PM
> To: Jeff Law ; gcc-patches@gcc.gnu.org
> Cc: richard.guent...@gmail.com; Tsimbalist, Igor V
> 
> Subject: RE: 0001-Part-1.-Add-generic-part-for-Intel-CET-enabling
> 
> Updated patch, version #3.
> 
> Igor
> 
> 
> > -Original Message-
> > From: Tsimbalist, Igor V
> > Sent: Friday, September 29, 2017 4:32 PM
> > To: Jeff Law ; gcc-patches@gcc.gnu.org
> > Cc: richard.guent...@gmail.com; Tsimbalist, Igor V
> > 
> > Subject: RE: 0001-Part-1.-Add-generic-part-for-Intel-CET-enabling
> >
> > > -Original Message-
> > > From: Jeff Law [mailto:l...@redhat.com]
> > > Sent: Friday, September 29, 2017 12:44 AM
> > > To: Tsimbalist, Igor V ; gcc-
> > > patc...@gcc.gnu.org
> > > Cc: richard.guent...@gmail.com
> > > Subject: Re: 0001-Part-1.-Add-generic-part-for-Intel-CET-enabling
> > >
> > > On 09/19/2017 07:39 AM, Tsimbalist, Igor V wrote:
> > > > Here is an updated patch (version #2). The main differences are:
> > > >
> > > > - Change attribute and option names;
> > > > - Add additional parameter to gimple_build_call_from_tree by
> > > > adding a
> > > type parameter and
> > > >   use it 'nocf_check' attribute propagation;
> > > > - Reimplement fixes in expand_call_stmt to propagate 'nocf_check'
> > > > attribute;
> > > > - Consider 'nocf_check' attribute in Identical Code Folding (ICF)
> > > > optimization;
> > > > - Add warning for type inconsistency regarding 'nocf_check'
> > > > attribute;
> > > > - Many small fixes;
> > > >
> > > > gcc/c-family/
> > > > * c-attribs.c (handle_nocf_check_attribute): New function.
> > > > (c_common_attribute_table): Add 'nocf_check' handling.
> > > > * c-common.c (check_missing_format_attribute): New function.
> > > > * c-common.h: Likewise.
> > > >
> > > > gcc/c/
> > > > * c-typeck.c (convert_for_assignment): Add check for nocf_check
> > > > attribute.
> > > > * gimple-parser.c: Add second argument NULL to
> > > > gimple_build_call_from_tree.
> > > >
> > > > gcc/cp/
> > > > * typeck.c (convert_for_assignment): Add check for nocf_check
> > > > attribute.
> > > >
> > > > gcc/
> > > > * cfgexpand.c (expand_call_stmt): Set REG_CALL_NOCF_CHECK for
> > > > call insn.
> > > > * combine.c (distribute_notes): Add REG_CALL_NOCF_CHECK
> > > handling.
> > > > * common.opt: Add fcf-protection flag.
> > > > * emit-rtl.c (try_split): Add REG_CALL_NOCF_CHECK handling.
> > > > * flag-types.h: Add enum cf_protection_level.
> > > > * gimple.c (gimple_build_call_from_tree): Add second parameter.
> > > > Add 'nocf_check' attribute propagation to gimple call.
> > > > * gimple.h (gf_mask): Add GF_CALL_NOCF_CHECK.
> > > > (gimple_call_nocf_check_p): New function.
> > > > (gimple_call_set_nocf_check): Likewise.
> > > > * gimplify.c: Add second argument to 
> > > > gimple_build_call_from_tree.
> > > > * ipa-icf.c: Add nocf_check attribute in statement hash.
> > > > * recog.c (peep2_attempt): Add REG_CALL_NOCF_CHECK handling.
> > > > * reg-notes.def: Add REG_NOTE (CALL_NOCF_CHECK).
> > > > * toplev.c (process_options): Add flag_cf_protection handling.
> > > >
> > > > Is it ok for trunk?
> > > >
> > > > Thanks,
> > > > Igor
> > > >
> > > >
> > >
> > >
> > > >
> > > > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
> > > > index
> > > > 0337537..77d1909 100644
> > > > --- a/gcc/c-family/c-attribs.c
> > > > +++ b/gcc/c-family/c-attribs.c
> > > > @@ -65,6 +65,7 @@ static tree handle_asan_odr_indicator_attribute
> > > > (tree *, tree, tree, int,  static tree
> > > > handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
> > > > static tree handle_noinline_attribute (tree *, tree, tree, int,
> > > > bool *);  static tree handle_noclone_attribute (tree *, tree,
> > > > tree, int, bool *);
> > > > +static tree handle_nocf_check_attribute (tree *, tree, tree, int,
> > > > +bool *);
> > > >  stat

[PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-10-05 Thread Martin Liška
Hello.

As discussed 2 days ago on IRC with Jakub and Jonathan, C++ standard says that 
having a non-return
function with missing return statement is undefined behavior. We've got 
-fsanitize=return check for
that and we can in such case instrument __builtin_unreachable(). This patch 
does that. It produces quite
some fallout in test-suite.

And there's still some fallout:

FAIL: g++.dg/cpp0x/constexpr-diag3.C  -std=c++11  (test for errors, line 7)
FAIL: g++.dg/cpp0x/constexpr-neg3.C  -std=c++11  (test for errors, line 12)
FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14  (test for errors, line 7)
FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14  (test for errors, line 9)
FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14 (test for excess errors)

1) there are causing:

./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:9:23:   in 
constexpr expansion of ‘foo(1)’
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:4:1: error: 
‘__builtin_unreachable()’ is not a constant expression
 foo (int i)
 ^~~

Where we probably should not emit the built-in call. Am I right?

FAIL: g++.dg/torture/pr34850.C   -O1   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O3 -g   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -Os   (test for warnings, line 15)

2) this test is somehow hard to fix as it requires some unreachability.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin

gcc/cp/ChangeLog:

2017-10-05  Martin Liska  

PR middle-end/82404
* cp-gimplify.c (cp_ubsan_maybe_instrument_return): Remove as
renamed to ...
(cp_maybe_instrument_return): ... this.
(cp_genericize): Update call of the function.

gcc/testsuite/ChangeLog:

2017-10-05  Martin Liska  

PR middle-end/82404
* c-c++-common/dfp/call-by-value.c (foo32): Change return value
to void.
(foo64): Likewise.
(foo128): Likewise.
* g++.dg/bprob/g++-bprob-1.C: Likewise.
* g++.dg/cpp0x/lambda/lambda-template.C (f): Likewise.
* g++.dg/cpp0x/range-for6.C (foo): Return a value.
* g++.dg/cpp0x/udlit-template.C: Change return value
to void.
* g++.dg/cpp1z/eval-order3.C (struct A): Return a value.
(operator>>): Likewise.
* g++.dg/expr/cond12.C (struct X): Return a value.
(X::operator=): Likewise.
* g++.dg/gcov/gcov-1.C: Change return value
to void.
* g++.dg/gcov/gcov-threads-1.C (ContentionNoDeadlock_thread):
Return a value.
* g++.dg/ipa/devirt-21.C: Likewise.
* g++.dg/ipa/devirt-23.C: Likewise.
* g++.dg/ipa/devirt-34.C (t): Likewise.
* g++.dg/opt/20050511-1.C (bar): Likewise.
* g++.dg/opt/const3.C (A::foo1): Likewise.
(A::foo2): Likewise.
* g++.dg/opt/pr23299.C (E::c): Likewise.
* g++.dg/other/copy2.C (A::operator=): Likewise.
* g++.dg/overload/addr1.C: Likewise.
* g++.dg/pr48484.C: Likewise.
* g++.dg/tls/thread_local3.C (thread_main): Likewise.
* g++.dg/tls/thread_local3g.C (thread_main): Likewise.
* g++.dg/tls/thread_local5.C (thread_main): Likewise.
* g++.dg/tls/thread_local5g.C (thread_main): Likewise.
* g++.dg/tls/thread_local6.C (thread_main): Likewise.
* g++.dg/tls/thread_local6g.C (thread_main): Likewise.
* g++.dg/torture/pr34850.C (OctetString::operator^=): Likewise.
* g++.dg/tree-prof/pr79259.C (fn2): Likewise.
* g++.dg/tree-ssa/pr33604.C (struct Value): Likewise.
* g++.dg/tree-ssa/pr81408.C (struct p): Likewise.
(av): Likewise.
* g++.dg/warn/string1.C (test): Likewise.
---
 gcc/cp/cp-gimplify.c| 18 --
 gcc/testsuite/c-c++-common/dfp/call-by-value.c  |  6 +++---
 gcc/testsuite/g++.dg/bprob/g++-bprob-1.C|  2 +-
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template.C |  2 +-
 gcc/testsuite/g++.dg/cpp0x/range-for6.C |  2 ++
 gcc/testsuite/g++.dg/cpp0x/udlit-template.C |  2 +-
 gcc/testsuite/g++.dg/cpp1z/eval-order3.C|  4 ++--
 gcc/testsuite/g++.dg/expr/cond12.C  |  8 +++-
 gcc/testsuite/g++.dg/gcov/gcov-1.C  |  2 +-
 gcc/testsuite/g++.dg/gcov/gcov-threads-1.C  |  2 ++
 gcc/testsuite/g++.dg/ipa/devirt-21.C|  2 +-
 gcc/testsuite/g++.dg/ipa/devirt-23.C|  2 +-
 gcc/testsuite/g++.dg/ipa/devirt-34.C|  2 ++

Re: [Libgomp, Fortran] Fix canadian cross build

2017-10-05 Thread Yvan Roux
On 5 September 2017 at 12:04, Jakub Jelinek  wrote:
> On Tue, Sep 05, 2017 at 10:58:22AM +0200, Yvan Roux wrote:
>> ping
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01784.html
>
> This really needs to be reviewed by a build machinery maintainer.

Thanks for the CC list update Jakub, ping again then...

https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01784.html

>> > config/ChangeLog
>> > 2017-06-23  Yvan Roux  
>> >
>> > * acx.m4 (NCN_STRICT_CHECK_TARGET_TOOLS): Renamed to ...
>> > (NCN_STRICT_PATH_TARGET_TOOLS): ... this.  It reflects the 
>> > replacement
>> > of AC_CHECK_PROG by AC_PATH_PROG to get the absolute name of 
>> > the
>> > program.
>> > (ACX_CHECK_INSTALLED_TARGET_TOOL): Use renamed function.
>> >
>> > ChangeLog
>> > 2017-06-23  Yvan Roux  
>> >
>> > * configure.ac: Use NCN_STRICT_PATH_TARGET_TOOLS instead of
>> > NCN_STRICT_CHECK_TARGET_TOOLS.
>> > * configure: Regenerate.
>
> Jakub


Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-05 Thread Iain Buclaw
On 3 October 2017 at 23:36, Joseph Myers  wrote:
> On Tue, 3 Oct 2017, Jeff Law wrote:
>
>> /* Copyright (c) 2010-2014 by Digital Mars
>>  * All Rights Reserved, written by Walter Bright
>>  * http://www.digitalmars.com
>>  * Distributed under the Boost Software License, Version 1.0.
>>  * (See accompanying file LICENSE or copy at
>> http://www.boost.org/LICENSE_1_0.txt)
>>
>> If the code was assigned to the FSF in 2011, then the FSF would have
>> ownership of the code.  And the FSF would be the only entity that could
>> change the license (which according to your message changed to Boost in
>> 2014).  So something seems wrong here.
>
> The standard FSF assignment would allow the contributor to distribute
> their own code under such terms as they see fit.
>

Walter, would you mind clarifying details of your assignment? Was it a
standard assignment? Did you request for any amendments?

Jeff, I'm no legal, so I can't comment on it.  Maybe there's someone
from the FSF who be able to confirm?

I'll cc in Andrei as well, so the D language foundation is in on this.

Regards,
Iain.


Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-10-05 Thread Jakub Jelinek
On Thu, Oct 05, 2017 at 12:31:23PM +0200, Martin Liška wrote:
> As discussed 2 days ago on IRC with Jakub and Jonathan, C++ standard says 
> that having a non-return
> function with missing return statement is undefined behavior. We've got 
> -fsanitize=return check for
> that and we can in such case instrument __builtin_unreachable(). This patch 
> does that. It produces quite
> some fallout in test-suite.
> 
> And there's still some fallout:
> 
> FAIL: g++.dg/cpp0x/constexpr-diag3.C  -std=c++11  (test for errors, line 7)
> FAIL: g++.dg/cpp0x/constexpr-neg3.C  -std=c++11  (test for errors, line 12)
> FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14  (test for errors, line 7)
> FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14  (test for errors, line 9)
> FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14 (test for excess errors)
> 
> 1) there are causing:
> 
> ./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:9:23:   in 
> constexpr expansion of ‘foo(1)’
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:4:1: error: 
> ‘__builtin_unreachable()’ is not a constant expression
>  foo (int i)
>  ^~~
> 
> Where we probably should not emit the built-in call. Am I right?

The problem is that I think we save the constexpr function bodies after
folding and genericization.  We shouldn't be doing that, if we save it
before, in addition to fixing quite a few bugs it would fix this as well.

If looking for a hack instead of proper fix here, allowing
__builtin_unreachable () in constant expressions would be probably
undesirable, because then people adding it themselves in their source code
would get it accepted.  So, either we need some way how to differentiate
between user written __builtin_unreachable and one we've added ourselves
(doesn't -fsanitize=return fail the similar way though?), either as
internal-fn instead, or just by using UNKNOWN_LOCATION or BUILTINS_LOCATION
or something similar to find out it is artificial; and of course if we
reach it during constexpr evaluation error some way.

Jakub


Re: [Libgomp, Fortran] Fix canadian cross build

2017-10-05 Thread Petr Ovtchenkov
On Thu, 5 Oct 2017 12:56:36 +0200
Yvan Roux  wrote:

> On 5 September 2017 at 12:04, Jakub Jelinek  wrote:
> > On Tue, Sep 05, 2017 at 10:58:22AM +0200, Yvan Roux wrote:
> >> ping
> >>
> >> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01784.html
> >
> > This really needs to be reviewed by a build machinery maintainer.
> 
> Thanks for the CC list update Jakub, ping again then...
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01784.html

BTW,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71212
https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01332.html
https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01334.html

> 
> >> > config/ChangeLog
> >> > 2017-06-23  Yvan Roux  
> >> >
> >> > * acx.m4 (NCN_STRICT_CHECK_TARGET_TOOLS): Renamed to ...
> >> > (NCN_STRICT_PATH_TARGET_TOOLS): ... this.  It reflects the 
> >> > replacement
> >> > of AC_CHECK_PROG by AC_PATH_PROG to get the absolute name of 
> >> > the
> >> > program.
> >> > (ACX_CHECK_INSTALLED_TARGET_TOOL): Use renamed function.
> >> >
> >> > ChangeLog
> >> > 2017-06-23  Yvan Roux  
> >> >
> >> > * configure.ac: Use NCN_STRICT_PATH_TARGET_TOOLS instead of
> >> > NCN_STRICT_CHECK_TARGET_TOOLS.
> >> > * configure: Regenerate.
> >
> > Jakub

--

   - ptr


Re: [PATCH][GRAPHITE] Rewrite PHI handling in code-gen

2017-10-05 Thread Richard Biener
On Wed, 4 Oct 2017, Richard Biener wrote:

> 
> The following patch completely re-does PHI handling during 
> code-generation.  PHI handling is currently responsible for 99% of
> all code-generation issues.  With the patch the number of code-generation
> issues in SPEC 2k6 decreases from 180 to 5, similar adjustments happen
> to the testsuite - only gfortran.dg/graphite has some expected code-gen
> issues left.

So I messed up the testsuite update and it turns out all code-gen
issues are fixed in all graphite.exp testsuites.  Yay.

> The current idea of categorizing PHIs and doing code-gen based on
> pattern matching with the original GIMPLE IL isn't feasible given
> ISL can do transforms like peeling, optimizing away conditions and
> creating arbitrary number of GBB copies.  The current code fences
> off a lot of cases by simply giving up.
> 
> To fix the current code one would need to basically replicate the
> update-SSA machinery we already have (and pointlessly exercise
> from the graphite code-gen at the moment).
> 
> Thus the patch rips out all manual handling of PHIs during code-generation
> and leaves all cross-BB scalar updates to update-SSA.
> 
> This means "going out-of-SSA" again, but instead of applying out-of-SSA
> on the original GIMPLE IL I'm just doing this on-the-fly during
> scalar dependence generation and code generation.
> 
>  bb3:
>   goto bb5;
> 
>  bb4:
> 
>  bb5:
>   _2 = PHI <_3(3), _4(4)>
> 
> becomes (for an identity rewrite) before update-SSA:
> 
>  bb3':
>   D.1234 = _3;
> 
>  bb4':
>   D.1234 = _4;
> 
>  bb5':
>   _5 = D.1234;
> 
> with _5 being a new def for _2.  update-SSA then re-writes the
> _3 and _4 uses with available new defs we have registered during
> code generation of the _3 and _4 def copies and rewrites D.1234
> into SSA, inserting PHIs where necessary.
> 
> This scheme of course relies on ISL outputting a correct schedule
> which of course relies on us setting proper dependence constraints.
> I've fixed quite a few issues there, for example missing constraints
> for the SESE liveout variables.
> 
> One awkward thing is that to not confuse ISL with PHI edge copies
> placed in latch blocks, like
> 
>   for (int c0 = 0; c0 < P_22; c0 += 1) {
> S_6(0, c0);
> if (P_22 >= c0 + 2)
>   S_7(0, c0);
>   }
> 
> and ISL then happilly peeling off the last iteration where the latch S_7
> containing only the out-of-SSA copy is not needed.  So I'm trying to
> detect empty latches and instead insert the out-of-SSA copy in its
> predecessor instead (I know it doesn't matter if we execute the stmt
> in the last iteration).
> 
> The patch as-is ends up with quite some useless SSA copies which
> is cleaned up by the copyprop pass inside the graphite pipeline
> but can also be improved by teaching the into-SSA rewrite to
> eliminate copies.
> 
> I do expect issues with the patch (I'm seeing CE 416.gamess, but not
> sure why and 403.gcc miscompare), but it's already become somewhat too big 
> to handle.

The CE was on purpose, std=legacy brings it back, now it seems to
miscompare.

> Currently re-bootstrapping and testing after some cosmetic changes,
> testsuites ran successfully, SPEC CPU 2006 built and run (with test data).
> Statistics (with all graphite params set to unlimited) are
> 
> loop nest optimized: 119
> loop nest not optimized, code generation error: 5
> loop nest not optimized, optimized schedule is identical to original 
> schedule: 110
> loop nest not optimized, optimization timed out: 31
> loop nest not optimized, ISL signalled an error: 6
> loop nest: 271
> 
> Ok for trunk?

Thus the following updated patch (only the testsuite part changed).

Bootstrapped with -fgraphite-identity -floop-nest-optimize and tested
on x86_64-unknown-linux-gnu.

Ok?

There are two parts worth working on after this - one is creating
a versioning condition to fend off alias-set compute fails, the
other one is working on the proximity constraints.

After this is in I'll extract testcases for the remaining
code-generation issues in SPEC.

Thanks,
Richard.

2017-10-05  Richard Biener  

* graphite-isl-ast-to-gimple.c: Include ssa.h and tree-ssa.h.
(translate_isl_ast_to_gimple::translate_pending_phi_nodes,
translate_isl_ast_to_gimple::is_valid_rename,
translate_isl_ast_to_gimple::get_rename,
translate_isl_ast_to_gimple::get_def_bb_for_const,
translate_isl_ast_to_gimple::get_new_name,
translate_isl_ast_to_gimple::collect_all_ssa_names,
translate_isl_ast_to_gimple::copy_loop_phi_args,
translate_isl_ast_to_gimple::collect_all_ssa_names,
translate_isl_ast_to_gimple::copy_loop_phi_args,
translate_isl_ast_to_gimple::copy_loop_phi_nodes,
translate_isl_ast_to_gimple::add_close_phis_to_merge_points,
translate_isl_ast_to_gimple::add_close_phis_to_outer_loops,
translate_isl_ast_to_gimple::copy_loop_close_phi_args,
translate_isl_ast_to_gimple::copy_loop_close_phi_nodes,
   

[C++ PATCH] Kill IDENTIFIER_GLOBAL_VALUE

2017-10-05 Thread Nathan Sidwell
As mentioned yesterday, there's now no need for set_global_binding to 
have separate name & decl parms.  This patch makes that change and kills 
IDENTIFIER_GLOBAL_VALUE and SET_IDENTIFIER_GLOBAL_VALUE, which are have 
been forwarding macros for a long time now.


Because get_namespace_binding has the namespace arg first, we can't 
simply use a default arg there -- and swapping them round is 
incompatible with the rest of the name-lookup argument conventions.  So 
get_global_binding is simple inline wrapper to that.  (I decided to keep 
the NULL == global_namespace idiom to avoid each call instance having to 
go get the global_node.)


I discovered we had an identifier_global_value function already, used 
directly by c-common.c.  I moved it to cp-objcp-common.c.  I declined 
the opportunity to hookize it :)


Applying to trunk.

nathan

--
Nathan Sidwell
2017-10-05  Nathan Sidwell  

	Kill IDENTIFIER_GLOBAL_VALUE, SET_IDENTIFIER_GLOBAL_VALUE
	* cp-tree.h (IDENTIFIER_GLOBAL_VALUE,
	SET_IDENTIFIER_GLOBAL_VALUE): Delete.
	* name-lookup.h (set_global_binding): Remove NAME parm.
	(get_global_binding): New inline fn.
	* name-lookup.c (set_global_binding): Remove NAME parm. Adjust.
	(identifier_global_value): Move to ...
	* cp-objcp-common.c (identifier_global_value): ... here.
	* class.c (build_ctor_vtbl_group, build_vtbl_initializer): Adjust.
	* decl.c (record_builtin_type, expand_static_init,
	grokdeclarator): Adjust.
	* decl2.c (get_guard, get_local_tls_init_fn, get_tls_init_fn,
	get_tls_wrapper_fn, maybe_warn_sized_delete): Adjust.
	* except.c (declare_library_fn, build_throw): Adjust.
	* init.c (throw_bad_array_length): Adjust.
	* rtti.c (throw_bad_cast, throw_bad_typeid, get_tinfo_decl): Adjust.

Index: class.c
===
--- class.c	(revision 253413)
+++ class.c	(working copy)
@@ -8912,7 +8912,7 @@ build_ctor_vtbl_group (tree binfo, tree
 
   /* See if we've already created this construction vtable group.  */
   id = mangle_ctor_vtbl_for_type (t, binfo);
-  if (IDENTIFIER_GLOBAL_VALUE (id))
+  if (get_global_binding (id))
 return;
 
   gcc_assert (!SAME_BINFO_TYPE_P (BINFO_TYPE (binfo), t));
@@ -9294,7 +9294,7 @@ build_vtbl_initializer (tree binfo,
 	  if (!dvirt_fn)
 		{
 		  tree name = get_identifier ("__cxa_deleted_virtual");
-		  dvirt_fn = IDENTIFIER_GLOBAL_VALUE (name);
+		  dvirt_fn = get_global_binding (name);
 		  if (!dvirt_fn)
 		dvirt_fn = push_library_fn
 		  (name,
Index: cp-objcp-common.c
===
--- cp-objcp-common.c	(revision 253413)
+++ cp-objcp-common.c	(working copy)
@@ -400,6 +400,15 @@ cp_pushdecl (tree decl)
   return pushdecl (decl);
 }
 
+/* Get the global value binding of NAME.  Called directly from
+   c-common.c, not via a hook. */
+
+tree
+identifier_global_value (tree name)
+{
+  return get_global_binding (name);
+}
+
 /* Register c++-specific dumps.  */
 
 void
Index: cp-tree.h
===
--- cp-tree.h	(revision 253421)
+++ cp-tree.h	(working copy)
@@ -613,11 +613,6 @@ struct GTY(()) ptrmem_cst {
 };
 typedef struct ptrmem_cst * ptrmem_cst_t;
 
-#define IDENTIFIER_GLOBAL_VALUE(NODE) \
-  get_namespace_binding (NULL_TREE, (NODE))
-#define SET_IDENTIFIER_GLOBAL_VALUE(NODE, VAL) \
-  set_global_binding ((NODE), (VAL))
-
 #define CLEANUP_P(NODE)		TREE_LANG_FLAG_0 (TRY_BLOCK_CHECK (NODE))
 
 #define BIND_EXPR_TRY_BLOCK(NODE) \
Index: decl.c
===
--- decl.c	(revision 253426)
+++ decl.c	(working copy)
@@ -3897,7 +3897,7 @@ make_unbound_class_template (tree contex
RID_POINTERS.  NAME is the name used when looking up the builtin
type.  TYPE is the _TYPE node for the builtin type.
 
-   The calls to SET_IDENTIFIER_GLOBAL_VALUE below should be
+   The calls to set_global_binding below should be
eliminated.  Built-in types should not be looked up name; their
names are keywords that the parser can recognize.  However, there
is code in c-common.c that uses identifier_global_value to look up
@@ -3915,7 +3915,7 @@ record_builtin_type (enum rid rid_index,
   tree tname = get_identifier (name);
   tree tdecl = build_decl (BUILTINS_LOCATION, TYPE_DECL, tname, type);
   DECL_ARTIFICIAL (tdecl) = 1;
-  SET_IDENTIFIER_GLOBAL_VALUE (tname, tdecl);
+  set_global_binding (tdecl);
   decl = tdecl;
 }
 
@@ -3925,7 +3925,7 @@ record_builtin_type (enum rid rid_index,
 	{
 	  tree rdecl = build_decl (BUILTINS_LOCATION, TYPE_DECL, rname, type);
 	  DECL_ARTIFICIAL (rdecl) = 1;
-	  SET_IDENTIFIER_GLOBAL_VALUE (rname, rdecl);
+	  set_global_binding (rdecl);
 	  if (!decl)
 	decl = rdecl;
 	}
@@ -4509,7 +4509,7 @@ build_cp_library_fn_ptr (const char* nam
 }
 
 /* Like build_library_fn, but also pushes the function so that we will
-   be able to find it via IDENTIFIER_GLOBAL_VALUE.  Also, the function
+   be a

Re: [PATCH] C++ warning on vexing parse

2017-10-05 Thread Nathan Sidwell

On 10/04/2017 08:17 PM, Eric Gallager wrote:

On Tue, Oct 3, 2017 at 1:08 PM, Nathan Sidwell  wrote:

[non-c++ people on CC, there's a reason ...]

This patch adds a new warning, concerning unnecessary parentheses on a
declaration.  For instance:
prettyprinter (pp);




Could you check and see if this fixes any of the preexisting
currently-open bugs related to most-vexing-parse (or similar) warnings
on Bugzilla? For example:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64679
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25814
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69855


Sadly no.  But, 69855 appears to already have been fixed -- perhaps by 
my namespace lookup changes earlier this year?


nathan

--
Nathan Sidwell


Re: [PATCH] simplify-rtx: Remove non-simplifying simplification (PR77729)

2017-10-05 Thread Segher Boessenkool
On Thu, Oct 05, 2017 at 10:46:13AM +0200, Richard Biener wrote:
> >>   PR rtl-optimization/77729
> >>   * simplify-rtx.c (simplify_binary_operation_1): Delete the (X&C1)|C2
> >>   to (X&(C1&~C2))|C2 transformations.
> > OK for the trunk.  I'm not sure if the BZ in question qualifies this
> > patch for backporting though.  Release managers have the final say on
> > the backport question.
> 
> We usually avoid doing "optimization regression" backports unless very very
> important and obvious.

It's very obvious, and quite important, but not very very important so I
won't champion backports :-)


Segher


[PATCH GCC][1/7]Delete unused field of struct partition in loop distribution

2017-10-05 Thread Bin Cheng
Hi,
This patch set implements distribution and builtin pattern distribution for
loop nest.  It consists of below patches:
  Patches [1~4]: Cleanup and (latent) bug fixes.
  Patch [5]: Loop nest distribution of two-level innermost loop nest.
  Patches [6,7]: Loop nest builtin pattern distribution.

This is the first simple patch deleting unused field of struct partition in loop
distribution.  It's an obvious change.
Bootstrap and test in patch set on x86_64 and AArch64.

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c (struct partition): Remove unused field
loops of the structure.
(partition_alloc, partition_free): Ditto.
(build_rdg_partition_for_vertex): Ditto.From 6f1b39f7bea2e77fc320bc70829b3e1445633d1b Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Tue, 26 Sep 2017 16:54:35 +0100
Subject: [PATCH 1/7] struct-partition-20170925.txt

---
 gcc/tree-loop-distribution.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 26b8b9a..3db3d6e 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -593,8 +593,6 @@ struct partition
 {
   /* Statements of the partition.  */
   bitmap stmts;
-  /* Loops of the partition.  */
-  bitmap loops;
   /* True if the partition defines variable which is used outside of loop.  */
   bool reduction_p;
   /* For builtin partition, true if it executes one iteration more than
@@ -619,7 +617,6 @@ partition_alloc (void)
 {
   partition *partition = XCNEW (struct partition);
   partition->stmts = BITMAP_ALLOC (NULL);
-  partition->loops = BITMAP_ALLOC (NULL);
   partition->reduction_p = false;
   partition->kind = PKIND_NORMAL;
   partition->datarefs = BITMAP_ALLOC (NULL);
@@ -632,7 +629,6 @@ static void
 partition_free (partition *partition)
 {
   BITMAP_FREE (partition->stmts);
-  BITMAP_FREE (partition->loops);
   BITMAP_FREE (partition->datarefs);
   free (partition);
 }
@@ -1279,8 +1275,6 @@ build_rdg_partition_for_vertex (struct graph *rdg, int v)
   FOR_EACH_VEC_ELT (nodes, i, x)
 {
   bitmap_set_bit (partition->stmts, x);
-  bitmap_set_bit (partition->loops,
-		  loop_containing_stmt (RDG_STMT (rdg, x))->num);
 
   for (j = 0; RDG_DATAREFS (rdg, x).iterate (j, &dr); ++j)
 	{
-- 
1.9.1



[PATCH GCC][2/7]Don't rename variables for deleted new preheader

2017-10-05 Thread Bin Cheng
Hi,
I noticed that new_preheader basic block could be deleted if the copied
loop is added at entry in function slpeel_tree_duplicate_loop_to_edge_cfg.
This simple patch skips new_preheader during variable renaming if it is
deleted.
Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Skip
renaming variables in new preheader if it's deleted.From 9c7719402c9528b517d8408419c2e9b930708772 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Fri, 22 Sep 2017 16:50:40 +0100
Subject: [PATCH 2/7] skip-new_preheader.txt

---
 gcc/tree-vect-loop-manip.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index f78e4b4..2c724a2 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -496,7 +496,8 @@ slpeel_tree_duplicate_loop_to_edge_cfg (struct loop *loop,
 			   loop_preheader_edge (new_loop)->src);
 }
 
-  for (unsigned i = 0; i < scalar_loop->num_nodes + 1; i++)
+  /* Skip new preheader since it's deleted if copy loop is added at entry.  */
+  for (unsigned i = (at_exit ? 0 : 1); i < scalar_loop->num_nodes + 1; i++)
 rename_variables_in_bb (new_bbs[i], duplicate_outer_loop);
 
   if (scalar_loop != loop)
-- 
1.9.1



[PATCH GCC][3/7]Don't skip renaming PHIs in loop nest with only one inner loop

2017-10-05 Thread Bin Cheng
Hi,
Function rename_variables_in_bb skips renaming PHI nodes in loop nest if the
outer loop has only one inner loop.  This breaks loop nest distribution when
inner loop has PHI node initialized from outer loop's variable.  Unfortunately,
I lost the original C code illustrating the issue.  Now it is only triggered
in building spec2006/416.gamess with loop nest distribution, but I failed to
reduce a test from it.
Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-vect-loop-manip.c (rename_variables_in_bb): Rename PHI nodes
when copying loop nest with only one inner loop.From fa3cc2014278d672db94bad5c6a606cb6888d79a Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Mon, 25 Sep 2017 16:40:11 +0100
Subject: [PATCH 3/7] rename-variables-in-loop-nest.txt

---
 gcc/tree-vect-loop-manip.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 2c724a2..9fd65a7 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -117,8 +117,6 @@ rename_variables_in_bb (basic_block bb, bool rename_from_outer_loop)
 		  || single_pred (e->src) != outer_loop->header)
 		continue;
 		}
-	  else
-		continue;
 	}
 	}
   for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi);
-- 
1.9.1



[PATCH GCC][4/7]Choose exit edge/path when removing inner loop's exit statement

2017-10-05 Thread Bin Cheng
Hi,
Function generate_loops_for_partition chooses arbitrary path when removing exit
condition not in partition.  This is fine for now because it's impossible to 
have
loop exit condition in case of innermost distribution.  After extending to loop
nest distribution, we must choose exit edge/path for inner loop's exit 
condition,
otherwise an infinite empty loop will be generated.  Test case added.

Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c (generate_loops_for_partition): Remove
inner loop's exit stmt by making it always exit the loop, otherwise
we would generate an infinite empty loop.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

* gcc.dg/tree-ssa/ldist-27.c: New test.From 29f15d5a166b139d8d2dad2ee798c4d0a338f820 Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Mon, 25 Sep 2017 16:52:42 +0100
Subject: [PATCH 4/7] loop_nest-exit-cond-distribution.txt

---
 gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c | 38 
 gcc/tree-loop-distribution.c | 16 +++---
 2 files changed, 51 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
new file mode 100644
index 000..3580c65
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-27.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -ftree-loop-distribute-patterns -fdump-tree-ldist-details" } */
+
+#define M (300)
+#define N (200)
+
+struct st
+{
+  double a[M];
+  double b[M];
+  double c[M][N];
+};
+
+int __attribute__ ((noinline)) foo (struct st *s)
+{
+  int i, j;
+  for (i = 0; i != M;)
+{
+  s->a[i] = 0.0;
+  s->b[i] = 1.0;
+  for (j = 0; 1; ++j)
+	{
+	  if (j == N) goto L2;
+	  s->c[i][j] = 0.0;
+	}
+L2:
+  ++i;
+}
+  return 0;
+}
+
+int main (void)
+{
+  struct st s;
+  return foo (&s);
+}
+
+/* { dg-final { scan-tree-dump "distributed: split to " "ldist" } } */
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 3db3d6e..999b32e 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -830,6 +830,10 @@ generate_loops_for_partition (struct loop *loop, partition *partition,
   for (i = 0; i < loop->num_nodes; i++)
 {
   basic_block bb = bbs[i];
+  edge inner_exit = NULL;
+
+  if (loop != bb->loop_father)
+	inner_exit = single_exit (bb->loop_father);
 
   for (gphi_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);)
 	{
@@ -848,11 +852,17 @@ generate_loops_for_partition (struct loop *loop, partition *partition,
 	  && !is_gimple_debug (stmt)
 	  && !bitmap_bit_p (partition->stmts, gimple_uid (stmt)))
 	{
-	  /* Choose an arbitrary path through the empty CFG part
-		 that this unnecessary control stmt controls.  */
+	  /* In distribution of loop nest, if bb is inner loop's exit_bb,
+		 we choose its exit edge/path in order to avoid generating
+		 infinite loop.  For all other cases, we choose an arbitrary
+		 path through the empty CFG part that this unnecessary
+		 control stmt controls.  */
 	  if (gcond *cond_stmt = dyn_cast  (stmt))
 		{
-		  gimple_cond_make_false (cond_stmt);
+		  if (inner_exit && inner_exit->flags & EDGE_TRUE_VALUE)
+		gimple_cond_make_true (cond_stmt);
+		  else
+		gimple_cond_make_false (cond_stmt);
 		  update_stmt (stmt);
 		}
 	  else if (gimple_code (stmt) == GIMPLE_SWITCH)
-- 
1.9.1



[PATCH GCC][5/7]Extend loop distribution for two-level innermost loop nest

2017-10-05 Thread Bin Cheng
Hi,
For now distribution pass only handles the innermost loop.  This patch extends 
the pass
to cover two-level innermost loop nest.  It also refactors code in 
pass_loop_distribution::execute
for better reading.  Note I restrict it to 2-level loop nest on purpose because 
of high
cost in data dependence computation.  Some compilation time optimizations like 
reusing
the data reference finding, data dependence computing, would require a rewrite 
of this
pass like the proposed loop interchange implementation.  But that's another 
task.

This patch introduces a temporary TODO for loop nest builtin partition which is 
covered
by next two patches.

With this patch, kernel loop in bwaves now can be distributed, thus exposed for 
further
interchange.  This patch adds new test for matrix multiplication, as well as 
adjusts
test strings of existing tests.
Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c: Adjust the general comment.
(NUM_PARTITION_THRESHOLD): New macro.
(ssa_name_has_uses_outside_loop_p): Support loop nest distribution.
(classify_partition): Skip builtin pattern of loop nest's inner loop.
(merge_dep_scc_partitions): New parameter ignore_alias_p and use it
in call to build_partition_graph.
(finalize_partitions): New parameter.  Make loop distribution more
conservative by fusing more partitions.
(distribute_loop): Don't do runtime alias check in case of loop nest
distribution.
(find_seed_stmts_for_distribution): New function.
(pass_loop_distribution::execute): Refactor code finding seed stmts
into above function.  Support loop nest distribution for two-level
innermost loop nest.  Adjust dump information.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

* gcc.dg/tree-ssa/ldist-7.c: Adjust test string.
* gcc.dg/tree-ssa/ldist-16.c: Ditto.
* gcc.dg/tree-ssa/ldist-25.c: Ditto.
* gcc.dg/tree-ssa/ldist-33.c: Ditto.diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-16.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-16.c
index f43b64e..f4f3a44 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-16.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-16.c
@@ -16,5 +16,5 @@ void foo (int n)
 
 /* We should not apply loop distribution and not generate a memset (0).  */
 
-/* { dg-final { scan-tree-dump "Loop 1 is the same" "ldist" } } */
+/* { dg-final { scan-tree-dump "Loop 1 not distributed" "ldist" } } */
 /* { dg-final { scan-tree-dump-times "generated memset zero" 0 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-25.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-25.c
index 699bf38..c0b95fc 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-25.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-25.c
@@ -22,4 +22,4 @@ foo (void)
 }
 }
 
-/* { dg-final { scan-tree-dump "Loop . is the same" "ldist" } } */
+/* { dg-final { scan-tree-dump "Loop . not distributed" "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-33.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-33.c
new file mode 100644
index 000..24d27fd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-33.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define N (1024)
+double a[N][N], b[N][N], c[N][N];
+
+void
+foo (void)
+{
+  unsigned i, j, k;
+
+  for (i = 0; i < N; ++i)
+for (j = 0; j < N; ++j)
+  {
+   c[i][j] = 0.0;
+   for (k = 0; k < N; ++k)
+ c[i][j] += a[i][k] * b[k][j];
+  }
+}
+
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 1 loops and 
1 library" "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-7.c
index f31d051..2eb1f74 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-7.c
@@ -28,4 +28,4 @@ int loop1 (int k)
   return a[1000-2] + b[1000-1] + c[1000-2] + d[1000-2];
 }
 
-/* { dg-final { scan-tree-dump-times "distributed" 0 "ldist" } } */
+/* { dg-final { scan-tree-dump-times "distributed: " 0 "ldist" } } */
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 999b32e..59a968c 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -83,8 +83,8 @@ along with GCC; see the file COPYING3.  If not see
loops and recover to the original one.
 
TODO:
- 1) We only distribute innermost loops now.  This pass should handle loop
-   nests in the future.
+ 1) We only distribute innermost two-level loop nest now.  We should
+   extend it for arbitrary loop nests in the future.
  2) We only fuse partitions in SCC now.  A better fusion algorithm is
desired to minimize loop overhead, maximize parallelism and maximize
data reuse.  */
@@ -118,6 +118,11 @@ along with GCC; see the file COPYIN

[PATCH GCC][6/7]Support loop nest distribution for builtin partition

2017-10-05 Thread Bin Cheng
Hi,
This patch rewrites classification part of builtin partition so that nested
builtin partitions are supported.  With this extension, below loop nest:
void
foo (void)
{
  for (unsigned i = 0; i < M; ++i)
for (unsigned j = 0; j < N; ++j)
  arr[i][j] = 0;

will be distributed into a single memset, rather than a loop of memset.
Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c (struct builtin_info): New struct.
(struct partition): Refactor fields into struct builtin_info.
(partition_free): Free struct builtin_info.
(build_size_arg_loc, build_addr_arg_loc): Delete.
(generate_memset_builtin, generate_memcpy_builtin): Get memory range
information from struct builtin_info.
(find_single_drs): New function refactored from classify_partition.
Also moved builtin validity checks to this function.
(compute_access_range, alloc_builtin): New functions.
(classify_builtin_1, classify_builtin_2): New functions.
(classify_partition): Refactor code into functions find_single_drs,
classify_builtin_1 and classify_builtin_2.
(distribute_loop): Don't do runtime alias check when distributing
loop nest.
(find_seed_stmts_for_distribution): New function.
(pass_loop_distribution::execute): Refactor code finding seed
stmts into above function.  Support distribution for the innermost
two-level loop nest.  Adjust dump information.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

* gcc.dg/tree-ssa/ldist-28.c: New test.
* gcc.dg/tree-ssa/ldist-29.c: New test.
* gcc.dg/tree-ssa/ldist-30.c: New test.
* gcc.dg/tree-ssa/ldist-31.c: New test.diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
new file mode 100644
index 000..4420139
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-28.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (1024)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+for (unsigned j = 0; j < N; ++j)
+  arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 0 loops and 
1 library" "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
new file mode 100644
index 000..9ce93e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-29.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int arr[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+for (unsigned j = 0; j < N - 1; ++j)
+  arr[i][j] = 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Loop nest . distributed: split to" "ldist" 
} } */
+/* { dg-final { scan-tree-dump-times "Loop . distributed: split to 0 loops and 
1 library" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
new file mode 100644
index 000..f31860a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-30.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int a[M][N], b[M][N];
+
+void
+foo (void)
+{
+  for (unsigned i = 0; i < M; ++i)
+for (unsigned j = N; j > 0; --j)
+  a[i][j - 1] = b[i][j - 1];
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to" 1 
"ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c
new file mode 100644
index 000..60a9f74
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-31.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+int a[M][N], b[M][N], c[M];
+
+void
+foo (void)
+{
+  for (int i = M - 1; i >= 0; --i)
+{
+  c[i] = 0;
+  for (unsigned j = N; j > 0; --j)
+   a[i][j - 1] = b[i][j - 1];
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 
loops and 2 library" 1 "ldist" } } */
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 59a968c..237474f 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -581,72 +581,82 @@ build_rdg (struct loop *loop, control_dependences *cd)
 
 
 /* Kind of distributed loop.  */
 enum partition_kind {
 PKIND_NORMAL, PKIND_MEMSET, PKIND_MEMCPY, PKIND_MEMMOVE
 };
 
 /* Type of distributed loop.  */
 enum partition_type {
 /* The distributed loop can be executed

[PATCH GCC][7/7]Merge adjacent memset builtin partitions

2017-10-05 Thread Bin Cheng
Hi,
This patch merges adjacent memset builtin partitions if possible.  It is
a useful special case optimization transforming below code:

#define M (256)
#define N (512)

struct st
{
  int a[M][N];
  int c[M];
  int b[M][N];
};

void
foo (struct st *p)
{
  for (unsigned i = 0; i < M; ++i)
{
  p->c[i] = 0;
  for (unsigned j = N; j > 0; --j)
{
  p->a[i][j - 1] = 0;
  p->b[i][j - 1] = 0;
}
}

into a single memset function call, rather than three calls initializing
the structure field by field.

Bootstrap and test in patch set on x86_64 and AArch64, is it OK?

Thanks,
bin
2017-10-04  Bin Cheng  

* tree-loop-distribution.c (tree-ssa-loop-ivopts.h): New header file.
(struct builtin_info): New fields.
(classify_builtin_1): Compute and record base and offset parts for
memset builtin partition by calling strip_offset.
(fuse_memset_builtins): New function.
(finalize_partitions): Fuse adjacent memset partitions by calling
above function.
* tree-ssa-loop-ivopts.c (strip_offset): Delete static declaration.
Expose the interface.
* tree-ssa-loop-ivopts.h (strip_offset): New declaration.

gcc/testsuite/ChangeLog
2017-10-04  Bin Cheng  

* gcc.dg/tree-ssa/ldist-17.c: Adjust test string.
* gcc.dg/tree-ssa/ldist-32.c: New test.diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
index 4efc0a4..b3617f6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c
@@ -45,5 +45,5 @@ mad_synth_mute (struct mad_synth *synth)
   return;
 }
 
-/* { dg-final { scan-tree-dump "distributed: split to 0 loops and 4 library 
calls" "ldist" } } */
-/* { dg-final { scan-tree-dump-times "generated memset zero" 4 "ldist" } } */
+/* { dg-final { scan-tree-dump "Loop nest . distributed: split to 0 loops and 
1 library calls" "ldist" } } */
+/* { dg-final { scan-tree-dump-times "generated memset zero" 1 "ldist" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c
new file mode 100644
index 000..477d222
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ldist-32.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns 
-fdump-tree-ldist-details" } */
+
+#define M (256)
+#define N (512)
+
+struct st
+{
+  int a[M][N];
+  int c[M];
+  int b[M][N];
+};
+
+void
+foo (struct st *p)
+{
+  for (unsigned i = 0; i < M; ++i)
+{
+  p->c[i] = 0;
+  for (unsigned j = N; j > 0; --j)
+   {
+ p->a[i][j - 1] = 0;
+ p->b[i][j - 1] = 0;
+   }
+}
+}
+
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 
loops and 1 library" 1 "ldist" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_memset \\(.*, 0, 1049600\\);" 
1 "ldist" } } */
diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index 237474f..ac1903d 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -106,6 +106,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stor-layout.h"
 #include "tree-cfg.h"
 #include "tree-ssa-loop-manip.h"
+#include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-loop.h"
 #include "tree-into-ssa.h"
 #include "tree-ssa.h"
@@ -604,6 +605,10 @@ struct builtin_info
   tree dst_base;
   tree src_base;
   tree size;
+  /* Base and offset part of dst_base after stripping constant offset.  This
+ is only used in memset builtin distribution for now.  */
+  tree dst_base_base;
+  unsigned HOST_WIDE_INT dst_base_offset;
 };
 
 /* Partition for loop distribution.  */
@@ -1500,7 +1505,11 @@ classify_builtin_1 (loop_p loop, partition *partition, 
data_reference_p dr)
   if (!compute_access_range (loop, dr, &base, &size))
 return;
 
-  partition->builtin = alloc_builtin (dr, NULL, base, NULL_TREE, size);
+  struct builtin_info *builtin;
+  builtin = alloc_builtin (dr, NULL, base, NULL_TREE, size);
+  builtin->dst_base_base = strip_offset (builtin->dst_base,
+&builtin->dst_base_offset);
+  partition->builtin = builtin;
   partition->kind = PKIND_MEMSET;
 }
 
@@ -2461,6 +2470,115 @@ version_for_distribution_p (vec 
*partitions,
   return (alias_ddrs->length () > 0);
 }
 
+/* Fuse adjacent memset builtin PARTITIONS if possible.  This is a special
+   case optimization transforming below code:
+
+ __builtin_memset (&obj, 0, 100);
+ _1 = &obj + 100;
+ __builtin_memset (_1, 0, 200);
+ _2 = &obj + 300;
+ __builtin_memset (_2, 0, 100);
+
+   into:
+
+ __builtin_memset (&obj, 0, 400);
+
+   Note we don't have dependence information between different partitions
+   at this point, as a result, we can't handle nonadjacent memset builtin
+   partitions since dependence might be broken.  */
+
+static void
+fuse_memset_builtins (vec *partitions)
+{
+  unsigned i, j, k, l;

Re: [PATCH] C++ warning on vexing parse

2017-10-05 Thread Nathan Sidwell

On 10/04/2017 10:44 AM, Jason Merrill wrote:


Hmm, I don't think we want to diagnose these; if a parameter-list
follows the parenthesized declarator, it isn't ambiguous.


True.  I went with that approach


3 and 4 seem like false positives.  The problematic cases are all ones
where the parenthesized name is at the end of the declarator; if the
declarator continues after the name, either within or without the
parens, this


I went with this approach.  For an array declarator of the form:
  T (ary...[X]);
we only warn when the '(' and ')' are on the same line.

Thus no triggers on GCC's source base (other than the earlier two I 
already fixed).


Applying the attached

nathan
--
Nathan Sidwell
2017-10-05  Nathan Sidwell  

	gcc/cp/
	Warn on MVP declarations
	* cp-tree.h (struct cp_declarator): Add parenthesized field.
	* decl.c (grokdeclarator): Warn about unnecessary parens.
	* parser.c (make_declarator): Init parenthesized field.
	(cp_parser_direct_declarator): Set parenthesized field.

	gcc/
	* doc/invoke.texi (Wparentheses): Document C++ MVP behaviour.

	gcc/testsuite/
	* g++.dg/warn/mvp.C: New.

Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h	(revision 253445)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -5671,6 +5671,10 @@ struct cp_declarator {
   /* Whether we parsed an ellipsis (`...') just before the declarator,
  to indicate this is a parameter pack.  */
   BOOL_BITFIELD parameter_pack_p : 1;
+  /* If this declarator is parenthesized, this the open-paren.  It is
+ UNKNOWN_LOCATION when not parenthesized.  */
+  location_t parenthesized;
+
   location_t id_loc; /* Currently only set for cdk_id, cdk_decomp and
 			cdk_function. */
   /* GNU Attributes that apply to this declarator.  If the declarator
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 253445)
+++ gcc/cp/decl.c	(working copy)
@@ -10807,6 +10807,13 @@ grokdeclarator (const cp_declarator *dec
 	attr_flags);
 	}
 
+  /* We don't want to warn in parmeter context because we don't
+	 yet know if the parse will succeed, and this might turn out
+	 to be a constructor call.  */
+  if (decl_context != PARM
+	  && declarator->parenthesized != UNKNOWN_LOCATION)
+	warning_at (declarator->parenthesized, OPT_Wparentheses,
+		"unnecessary parentheses in declaration of %qs", name);
   if (declarator->kind == cdk_id || declarator->kind == cdk_decomp)
 	break;
 
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 253445)
+++ gcc/cp/parser.c	(working copy)
@@ -1451,6 +1451,7 @@ make_declarator (cp_declarator_kind kind
 
   declarator = (cp_declarator *) alloc_declarator (sizeof (cp_declarator));
   declarator->kind = kind;
+  declarator->parenthesized = UNKNOWN_LOCATION;
   declarator->attributes = NULL_TREE;
   declarator->std_attributes = NULL_TREE;
   declarator->declarator = NULL;
@@ -19808,6 +19809,7 @@ cp_parser_direct_declarator (cp_parser*
   bool saved_in_declarator_p = parser->in_declarator_p;
   bool first = true;
   tree pushed_scope = NULL_TREE;
+  cp_token *open_paren = NULL, *close_paren = NULL;
 
   while (true)
 {
@@ -19858,6 +19860,8 @@ cp_parser_direct_declarator (cp_parser*
 	  tree params;
 	  bool is_declarator = false;
 
+	  open_paren = NULL;
+
 	  /* In a member-declarator, the only valid interpretation
 		 of a parenthesis is the start of a
 		 parameter-declaration-clause.  (It is invalid to
@@ -19979,6 +19983,7 @@ cp_parser_direct_declarator (cp_parser*
 	  parser->default_arg_ok_p = saved_default_arg_ok_p;
 	  parser->in_declarator_p = saved_in_declarator_p;
 
+	  open_paren = token;
 	  /* Consume the `('.  */
 	  matching_parens parens;
 	  parens.consume_open (parser);
@@ -19992,6 +19997,7 @@ cp_parser_direct_declarator (cp_parser*
 	  parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
 	  first = false;
 	  /* Expect a `)'.  */
+	  close_paren = cp_lexer_peek_token (parser->lexer);
 	  if (!parens.require_close (parser))
 		declarator = cp_error_declarator;
 	  if (declarator == cp_error_declarator)
@@ -20013,6 +20019,7 @@ cp_parser_direct_declarator (cp_parser*
 	  if (ctor_dtor_or_conv_p)
 	*ctor_dtor_or_conv_p = 0;
 
+	  open_paren = NULL;
 	  first = false;
 	  parser->default_arg_ok_p = false;
 	  parser->in_declarator_p = true;
@@ -20308,6 +20315,22 @@ cp_parser_direct_declarator (cp_parser*
  point.  That's an error; the declarator is not optional.  */
   if (!declarator)
 cp_parser_error (parser, "expected declarator");
+  else if (open_paren)
+{
+  /* Record overly parenthesized declarator so we can give a
+	 diagnostic about confusing decl/expr disambiguation.  */
+  if (declarator->kind == cdk_array)
+	{
+	  /* If the open and close parens are on different lines, this
+	 is probably a

Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Bernd Edlinger
On 10/05/17 02:24, Eric Gallager wrote:
> Sorry if this is a stupid question, but could you explain how this
> warning is different from -Wbad-function-cast? Something about direct
> calls to functions vs. passing them as function pointers?

No, it is not :)

-Wbad-function-cast is IMHO a strange legacy warning.

It is C-only, and it triggers only if the result of a function call
is cast to a type with a different TREE_CODE, so for instance
int <-> float <-> pointer.

It would trigger for perfectly valid code like this:

i = (int) floor (f);

while we have no warning for

i = floor (f);

What I want to diagnose is assigning a function pointer via an explicit
type cast to another function pointer, when there is no possible
implicit conversion between the two function pointer types.
Thus the cast was used to silence a warning/error in the first place.

The idea for this warning came up when someone spotted a place in
openssl, where a type cast was used to change the return value of a
callback function from long to int:

https://github.com/openssl/openssl/issues/4413

But due to the type cast there was never ever any warning from this
invalid type cast.


Bernd.


Zen tuning part 1 (reassociation width)

2017-10-05 Thread Jan Hubicka
Hi,
this patch enables reassociation of integer and vector operations for Zen.
While doing so I have noticed that the logic is split across three target hooks
(TARGET_VECTOR_PARALLEL_EXECUTION, TARGET_REASSOC_INT_TO_PARALLEL and
TARGET_REASSOC_FP_TO_PARALLEL) and function ix86_reassociation_width.

This makes it quite non-obious that all three places needs to be kept in sync
and in fact the comments in ix86_reassociation_width seems to suggest that
often this was not done properly.

This patch replaces it by similar scheme as arm backend - the reassociation
widths are split to int, fp, vector int and vector fp and present in cost
tables (where one looks while doing the tuning).

ix86_reassociation_width then just handles special cases which does not fit
into table scheme very well (like assymetry of integer vector operations on Zen
and the fact htat Zen splits 256 bit operations to 128bit parts and thus
reassociation width is smaller).

I have also kept existing logic of capping the value by 2 for 32bit compilation.
I did not benchmarked that thoroughly but I assume it sort of makes sense given
the register pressure costs.

I have tried my best to preserve existing settings for different target (some
of them I think does not make much sense).  One change I am aware of is that I
enabled reassociation for core2 which shares same cost table with later core
based CPUs.  I think that should be a good idea because core2 parallelizm was
similar to later variants.

For zen i have experiented with enabling reassociation in all four cases
(int/fp vector or not). Using the theroretical width needed by the CPU leads
to small regression in specFP2000 and thus I chose 4 instead of 6. We may want
to reduce the value if further regressions are tracked down to this patch.

Incrementally we may try to experient with sanitizing the values for other 
targets
and generic tuning.

Bootstrapped/regtested x86_64-linux.
Will commit it shortly.

Honza

* i386.c (ix86_size_cost, i386_cost, i486_cost, pentium_cost,
lakemont_cost, pentiumpro_cost, geode_cost, k6_cost,
athlon_cost, k8_cost, amdfam10_cost, btver1_cost, btver2_cost,
pentium4_cost, nocona_cost): Set reassociation width to 1.
(bdver1_cost, bdver2_cost, bdver3_cost, bdver4_cost): Set reassociation
width to 2 for fp operations and 1 otherwise.
(znver1_cost): Set scalar reassoc width to 4 and vector to 3 and 6
for int and fp.
(atom_cost): Set reassociation width to 2.
(slm_cost, generic_cost): Set fp reassociation width to 2 and 1 
otherwise.
(intel_cost): Set fp reassociation width to 4 and 1 otherwise.
(core_cost): Set fp reassociation width to 4 and vector to 2.
(ix86_reassociation_width): Rewrite using cost table; special case
plus/minus on Zen; honor X86_TUNE_SSE_SPLIT_REGS
and TARGET_AVX128_OPTIMAL.
* i386.h (processor_costs): Add
reassoc_int, reassoc_fp, reassoc_vec_int, reassoc_vec_fp.
(TARGET_VECTOR_PARALLEL_EXECUTION, TARGET_REASSOC_INT_TO_PARALLEL,
TARGET_REASSOC_FP_TO_PARALLEL): Remove.
* x86-tune.def (X86_TUNE_REASSOC_INT_TO_PARALLEL): Remove.
(X86_TUNE_REASSOC_FP_TO_PARALLEL): Remove.
(X86_TUNE_VECTOR_PARALLEL_EXECUTION):  Remove.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253443)
+++ config/i386/i386.c  (working copy)
@@ -177,6 +177,7 @@ struct processor_costs ix86_size_cost =
   COSTS_N_BYTES (2),   /* cost of FABS instruction.  */
   COSTS_N_BYTES (2),   /* cost of FCHS instruction.  */
   COSTS_N_BYTES (2),   /* cost of FSQRT instruction.  */
+  1, 1, 1, 1,  /* reassoc int, fp, vec_int, vec_fp.  */
   ix86_size_memcpy,
   ix86_size_memset,
   1,   /* scalar_stmt_cost.  */
@@ -253,6 +254,7 @@ struct processor_costs i386_cost = {/*
   COSTS_N_INSNS (22),  /* cost of FABS instruction.  */
   COSTS_N_INSNS (24),  /* cost of FCHS instruction.  */
   COSTS_N_INSNS (122), /* cost of FSQRT instruction.  */
+  1, 1, 1, 1,  /* reassoc int, fp, vec_int, vec_fp.  */
   i386_memcpy,
   i386_memset,
   1,   /* scalar_stmt_cost.  */
@@ -330,6 +332,7 @@ struct processor_costs i486_cost = {/*
   COSTS_N_INSNS (3),   /* cost of FABS instruction.  */
   COSTS_N_INSNS (3),   /* cost of FCHS instruction.  */
   COSTS_N_INSNS (83),  /* cost of FSQRT instruction.  */
+  1, 1, 1, 1,  /* reassoc int, fp, vec_int, vec_fp.  */
   i486_memcpy,
   i486_memset,
   1,   /* scalar_stmt_cost.  */
@@ -405,6 +408,7 @@ struct processor_costs pentium_cost = {
   COSTS_N_INSNS (1),   /* cost of FABS in

Re: [AArch64] Backport to gcc-7 PR71727 fix -mstrict-align

2017-10-05 Thread Richard Earnshaw (lists)
On 04/10/17 14:35, Christophe Lyon wrote:
> Hi,
> 
> I've recently committed a follow-up fix for PR71727 for -mstrict-align
> on aarch64 (r253242).
> I think it would be appropriate to apply it to gcc-7-branch. The patch
> from trunk applies cleanly to gcc-7-branch.
> 
> Although the original bug was reported against 4.9.4, 5.3.1, 6.1.0,
> Naveen's patch was not backported to these branches, so it's not
> appropriate to backport my patch there.
> 
> OK?
> 
> Thanks,
> 
> Christophe
> 
> 
> aarch64-strict-align3.chlog.txt
> 
> 
> 2017-09-20  Christophe Lyon  
> 
>   PR target/71727
>   gcc/
>   * config/aarch64/aarch64.c
>   (aarch64_builtin_support_vector_misalignment): Always return false
>   when misalignment is unknown.
> 
>   gcc/testsuite/
>   * gcc.target/aarch64/pr71727-2.c: New test.
> 

OK.

R.

> 
> aarch64-strict-align3.patch.txt
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 799989a..7cc67ec 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11757,19 +11757,9 @@ aarch64_builtin_support_vector_misalignment 
> (machine_mode mode,
>if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
>  return false;
>  
> +  /* Misalignment factor is unknown at compile time.  */
>if (misalignment == -1)
> - {
> -   /* Misalignment factor is unknown at compile time but we know
> -  it's word aligned.  */
> -   if (aarch64_simd_vector_alignment_reachable (type, is_packed))
> -{
> -  int element_size = TREE_INT_CST_LOW (TYPE_SIZE (type));
> -
> -  if (element_size != 64)
> -return true;
> -}
> -   return false;
> - }
> + return false;
>  }
>return default_builtin_support_vector_misalignment (mode, type, 
> misalignment,
> is_packed);
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727-2.c 
> b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
> new file mode 100644
> index 000..2bc803a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr71727-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mstrict-align -O3" } */
> +
> +unsigned char foo(const unsigned char *buffer, unsigned int length)
> +{
> +  unsigned char sum;
> +  unsigned int  count;
> +
> +  for (sum = 0, count = 0; count < length; count++) {
> +sum = (unsigned char) (sum + *(buffer + count));
> +  }
> +
> +  return sum;
> +}
> +
> +/* { dg-final { scan-assembler-times "and\tw\[0-9\]+, w\[0-9\]+, 15" 1 } } */
> 



Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-05 Thread Martin Liška
On 10/04/2017 08:06 PM, Jakub Jelinek wrote:
> On Wed, Oct 04, 2017 at 11:05:23AM +0200, Martin Liška wrote:
>> Following patch adds support for optimizing out unnecessary UBSAN_PTR checks.
>> It handles separately positive and negative offsets, zero offset is ignored.
>> Apart from that, we utilize get_inner_reference for local and global 
>> variables,
>> that also helps to reduce some.
> 
> Thanks for working on it.
> 
>> @@ -148,6 +149,61 @@ struct sanopt_tree_triplet_hash : typed_noop_remove 
>> 
>>}
>>  };
>>  
>> +/* Tree couple for ptr_check_map.  */
>> +struct sanopt_tree_couple
>> +{
>> +  tree ptr;
>> +  bool positive;
> 
> Maybe pos_p instead, so that it isn't that long?

Hi.

Yes.

> 
>> +  static inline bool
>> +  equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)
>> +  {
>> +return operand_equal_p (ref1.ptr, ref2.ptr, 0)
>> +   && ref1.positive == ref2.positive;
> 
> Or this would need to use ()s around the return expression for emacs
> users.

Done.

> 
>> +/* Return true when pointer PTR for a given OFFSET is already sanitized
>> +   in a given sanitization context CTX.  */
>> +
>> +static bool
>> +has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr, tree cur_offset)
> 
> Any reason why you pass cur_offset as a tree rather than wide_int & or
> offset_int &?  If the caller makes sure it is sign extended from
> pointer/sizetype's precision, then:
> 
>> +{
>> +  int r = get_range_pos_neg (cur_offset);
> 
> here you can just wi::neg_p or similar.
> 
>> +  /* We already have recorded a UBSAN_PTR check for this pointer.  Perhaps 
>> we
>> + can drop this one.  But only if this check doesn't specify larger 
>> offset.
>> + */
>> +  tree offset = gimple_call_arg (g, 1);
>> +  gcc_assert (TREE_CODE (offset) == INTEGER_CST);
>> +
>> +  if (positive && tree_int_cst_le (cur_offset, offset))
>> +return true;
>> +  else if (!positive && tree_int_cst_le (offset, cur_offset))
>> +return true;
> 
> And the comparisons here in wide_int would be cheaper too.
> 
>> +
>> +static void
>> +record_ubsan_ptr_check_stmt (sanopt_ctx *ctx, gimple *stmt, tree ptr,
>> + tree offset)
>> +{
>> +  int r = get_range_pos_neg (offset);
> 
> See above.

Good point! I will use offset_int at all places.

> 
>> +  gcc_assert (r != 3);
>> +  bool positive = r == 1;
>> +
>> +  sanopt_tree_couple couple;
>> +  couple.ptr = ptr;
>> +  couple.positive = positive;
>> +
>> +  auto_vec &v = ctx->ptr_check_map.get_or_insert (couple);
>> +  v.safe_push (stmt);
>> +}
>> +
>> +/* Optimize away redundant UBSAN_PTR calls.  */
>> +
>> +static bool
>> +maybe_optimize_ubsan_ptr_ifn (sanopt_ctx *ctx, gimple *stmt)
>> +{
>> +  gcc_assert (gimple_call_num_args (stmt) == 2);
>> +  tree ptr = gimple_call_arg (stmt, 0);
>> +  tree cur_offset = gimple_call_arg (stmt, 1);
>> +
>> +  if (TREE_CODE (cur_offset) != INTEGER_CST)
>> +return false;
>> +
>> +  if (integer_zerop (cur_offset))
>> +return true;
>> +
>> +  if (has_dominating_ubsan_ptr_check (ctx, ptr, cur_offset))
>> +return true;
>> +
>> +  tree base = ptr;
>> +  if (TREE_CODE (base) == ADDR_EXPR)
>> +{
>> +  base = TREE_OPERAND (base, 0);
>> +
>> +  HOST_WIDE_INT bitsize, bitpos;
>> +  machine_mode mode;
>> +  int volatilep = 0, reversep, unsignedp = 0;
>> +  tree offset;
>> +  base = get_inner_reference (base, &bitsize, &bitpos, &offset, &mode,
>> +  &unsignedp, &reversep, &volatilep);
>> +  if (DECL_P (base))
>> +{
>> +  gcc_assert (!DECL_REGISTER (base));
>> +  /* If BASE is a fixed size automatic variable or
>> + global variable defined in the current TU and bitpos
>> + fits, don't instrument anything.  */
>> +  if ((offset == NULL_TREE || TREE_CODE (offset) == INTEGER_CST)
> 
> Do you really need to handle offset != NULL_TREE?
> If the bit offset is representable in shwi, then it will just be
> in bitpos and offset will be NULL.

For this:
UBSAN_PTR (&MEM[(void *)&b + 9223372036854775807B], 1);

I have offset:
 
constant 9223372036854775807>

which is a valid offset.

> 
>> +  && (VAR_P (base)
>> +  || TREE_CODE (base) == PARM_DECL
>> +  || TREE_CODE (base) == RESULT_DECL)
>> +  && DECL_SIZE (base)
>> +  && TREE_CODE (DECL_SIZE (base)) == INTEGER_CST
> 
> Why are you testing here DECL_SIZE when you actually then use
> DECL_SIZE_UNIT?

Yep, will fix that.

> 
>> +  && (!is_global_var (base) || decl_binds_to_current_def_p (base)))
>> +{
>> +  HOST_WIDE_INT bytepos = bitpos / BITS_PER_UNIT;
>> +  offset_int total_offset = bytepos;
>> +  if (offset != NULL_TREE)
>> +total_offset += wi::sext (wi::to_offset (offset), POINTER_SIZE);
>> +  total_offset += wi::sext (wi::to_offset (cur_offset),
>> +POINTER_SIZE);
> 
> Why are you sign extending it each time?  Can't it be sign exte

Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Bernd Edlinger
On 10/03/17 23:34, Joseph Myers wrote:
> On Tue, 3 Oct 2017, Bernd Edlinger wrote:
> 
>> invalid, also if both function types have a non-null TYPE_ARG_TYPES
>> I would say this deserves a warning.  As an exception I have
> 
> I'm not convinced by the TYPE_ARG_TYPES check, at least for C.
> 

I will drop that, for C and C++, and try to get it working without
that kludge.

> In C, unprototyped function types are not compatible with variadic
> function types or functions with argument types changed by default
> argument promotions (that is, int () and int (char) and int (int, ...) are
> all incompatible).  I'd think it appropriate to warn about such
> conversions, given that they are cases where calling the converted
> function has undefined behavior.
> 

Right, interesting is that this does not produce a warning,

int test(int);
int foo()
{

   int (*x)();
   x = test;
   return x(1);
}


while the following example produces a warning:

int test(int,...);
int foo()
{

   int (*x)(int);
   x = test;
   return x(1);
}

gcc -Wall -W -S t1.c
t1.c: In function 'foo':
t1.c:6:5: warning: assignment to 'int (*)(int)' from incompatible 
pointer type 'int (*)(int)' [-Wincompatible-pointer-types]
x = test;
  ^

I will send a patch that adds ", ..." to the parameter list
in the diagnostics...

But why is int(*)(int) compatible to (int)(*)() but not
to int(*)(int,...) ?


> There may well be cases of interfaces where void (*) (void) is used as a
> generic function pointer type (always converted to / from the actual type
> of the function in question), for which this warning would not be
> suitable.
> 

Yes, those would get a warning, or have to add a cast to uintptr_t, if
it is not possible to fix the code otherwise.  libffi does this...


Bernd.


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Andreas Schwab
On Okt 05 2017, Bernd Edlinger  wrote:

> The idea for this warning came up when someone spotted a place in
> openssl, where a type cast was used to change the return value of a
> callback function from long to int:
>
> https://github.com/openssl/openssl/issues/4413
>
> But due to the type cast there was never ever any warning from this
> invalid type cast.

Note that the type cast itself is perfectly valid (all function pointers
are alike), but it's the call site that is problematic: unless it casts
back to the real type of the function before the call it is causing
undefined behviour.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] PR target/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64

2017-10-05 Thread Richard Earnshaw (lists)
On 04/10/17 00:50, vladimir.mezent...@oracle.com wrote:
> From: Vladimir Mezentsev 
> 
> Tested on aarch64-linux-gnu.
> No regression.
> No bootstrap failure.
> 
> gcc/ChangeLog:
> 2017-09-26  Vladimir Mezentsev  
> 
> * gcc/config/aarch64/aarch64.c: restore fix in 
> aarch64_use_blocks_for_constant_p

Vladimir,

Thanks for the patch, but I can't accept this as it stands.  The patch
lacks a description of what is motivating the change, so I can't really
review it sensibly.  What's the problem you are trying to address?  Why
do you believe this is the right fix?  Do you have a test case?

The ChangeLog entry doesn't help me either: restore what fix?

Is the patch associated with a bugzilla report?  If so, then please
include that in the ChangeLog in the form "PR /.

Incidentally, ChangeLog entries should be complete sentences (begin with
a capital letter and end with a full stop).

R.


> ---
>  gcc/config/aarch64/aarch64.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 1c14008..b377bc7 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -6193,11 +6193,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
>  static bool
>  aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
>  {
> -  /* Fixme:: In an ideal world this would work similar
> - to the logic in aarch64_select_rtx_section but this
> - breaks bootstrap in gcc go.  For now we workaround
> - this by returning false here.  */
> -  return false;
> +   /* We can't use blocks for constants when we're using a per-function
> +  constant pool.  */
> +  return !aarch64_can_use_per_function_literal_pools_p ();
>  }
>  
>  /* Select appropriate section for constants depending
> 



Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Joseph Myers
On Thu, 5 Oct 2017, Bernd Edlinger wrote:

> But why is int(*)(int) compatible to (int)(*)() but not
> to int(*)(int,...) ?

I think it's a matter of what function types were possible in K&R C.  
 variadic types weren't (there was an older ), and 
neither were types with arguments of type float or narrower-than-int 
integers (because those always got promoted to a wider type when passed as 
function arguments to an unprototyped function).  And those types that 
were impossible in K&R C always require function prototypes.  (The 
possibility of function types without a prototype is a legacy feature.  
There was some suggestion of removing it for C11, but no-one ever produced 
a paper for WG14 proposing the actual wording changes that would have been 
needed.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][GRAPHITE] Rewrite PHI handling in code-gen

2017-10-05 Thread Sebastian Pop
On Thu, Oct 5, 2017 at 6:43 AM, Richard Biener  wrote:

> On Wed, 4 Oct 2017, Richard Biener wrote:
>
> >
> > The following patch completely re-does PHI handling during
> > code-generation.  PHI handling is currently responsible for 99% of
> > all code-generation issues.  With the patch the number of code-generation
> > issues in SPEC 2k6 decreases from 180 to 5, similar adjustments happen
> > to the testsuite - only gfortran.dg/graphite has some expected code-gen
> > issues left.
>
> So I messed up the testsuite update and it turns out all code-gen
> issues are fixed in all graphite.exp testsuites.  Yay.
>
> > The current idea of categorizing PHIs and doing code-gen based on
> > pattern matching with the original GIMPLE IL isn't feasible given
> > ISL can do transforms like peeling, optimizing away conditions and
> > creating arbitrary number of GBB copies.  The current code fences
> > off a lot of cases by simply giving up.
> >
> > To fix the current code one would need to basically replicate the
> > update-SSA machinery we already have (and pointlessly exercise
> > from the graphite code-gen at the moment).
> >
> > Thus the patch rips out all manual handling of PHIs during
> code-generation
> > and leaves all cross-BB scalar updates to update-SSA.
> >
> > This means "going out-of-SSA" again, but instead of applying out-of-SSA
> > on the original GIMPLE IL I'm just doing this on-the-fly during
> > scalar dependence generation and code generation.
>

Sounds good!


> >
> >  bb3:
> >   goto bb5;
> >
> >  bb4:
> >
> >  bb5:
> >   _2 = PHI <_3(3), _4(4)>
> >
> > becomes (for an identity rewrite) before update-SSA:
> >
> >  bb3':
> >   D.1234 = _3;
> >
> >  bb4':
> >   D.1234 = _4;
> >
> >  bb5':
> >   _5 = D.1234;
> >
> > with _5 being a new def for _2.  update-SSA then re-writes the
> > _3 and _4 uses with available new defs we have registered during
> > code generation of the _3 and _4 def copies and rewrites D.1234
> > into SSA, inserting PHIs where necessary.
> >
> > This scheme of course relies on ISL outputting a correct schedule
> > which of course relies on us setting proper dependence constraints.
> > I've fixed quite a few issues there, for example missing constraints
> > for the SESE liveout variables.
> >
> > One awkward thing is that to not confuse ISL with PHI edge copies
> > placed in latch blocks, like
> >
> >   for (int c0 = 0; c0 < P_22; c0 += 1) {
> > S_6(0, c0);
> > if (P_22 >= c0 + 2)
> >   S_7(0, c0);
> >   }
> >
> > and ISL then happilly peeling off the last iteration where the latch S_7
> > containing only the out-of-SSA copy is not needed.  So I'm trying to
> > detect empty latches and instead insert the out-of-SSA copy in its
> > predecessor instead (I know it doesn't matter if we execute the stmt
> > in the last iteration).
> >
> > The patch as-is ends up with quite some useless SSA copies which
> > is cleaned up by the copyprop pass inside the graphite pipeline
> > but can also be improved by teaching the into-SSA rewrite to
> > eliminate copies.
> >
> > I do expect issues with the patch (I'm seeing CE 416.gamess, but not
> > sure why and 403.gcc miscompare), but it's already become somewhat too
> big
> > to handle.
>
> The CE was on purpose, std=legacy brings it back, now it seems to
> miscompare.
>
> > Currently re-bootstrapping and testing after some cosmetic changes,
> > testsuites ran successfully, SPEC CPU 2006 built and run (with test
> data).
> > Statistics (with all graphite params set to unlimited) are
> >
> > loop nest optimized: 119
> > loop nest not optimized, code generation error: 5
> > loop nest not optimized, optimized schedule is identical to original
> > schedule: 110
> > loop nest not optimized, optimization timed out: 31
> > loop nest not optimized, ISL signalled an error: 6
> > loop nest: 271
> >
> > Ok for trunk?
>
> Thus the following updated patch (only the testsuite part changed).
>
> Bootstrapped with -fgraphite-identity -floop-nest-optimize and tested
> on x86_64-unknown-linux-gnu.
>
> Ok?
>

Yes, the patch looks good.  Thanks!


>
> There are two parts worth working on after this - one is creating
> a versioning condition to fend off alias-set compute fails, the
> other one is working on the proximity constraints.
>

Right.
We also need to tag commutative and associative reductions
in the dependence graph.  Now that the code generation will
nicely handle scalar dependences, we may want to add back
some of the code from this commit:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=228255


Re: [PATCH][GRAPHITE] Adjust CASE_CONVERT in extract_affine

2017-10-05 Thread Sebastian Pop
On Wed, Oct 4, 2017 at 2:45 AM, Richard Biener  wrote:

>
> While my last change involving signed types was correct it wasn't optimal.
> We can avoid the modulo constraints if the conversion is widening
> (thus all values fit in the new type).
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok?


> Thanks,
> Richard.
>
> 2017-10-04  Richard Biener  
>
> * graphite-sese-to-poly.c (extract_affine): For casts increasing
> precision do not perform modulo reduction.
>

Looks good.
Thanks!


Re: [PATCH] Fix typos in graphite testcases

2017-10-05 Thread Sebastian Pop
On Thu, Oct 5, 2017 at 2:44 AM, Richard Biener 
wrote:

> Causing some UNRESOLVED.
>
> Committed.
>

Looks good.  Thanks!


Re: [PATCH] [graphite] translate reads and writes in a single traversal of memory ops

2017-10-05 Thread Sebastian Pop
On Mon, Oct 2, 2017 at 4:18 AM, Richard Biener 
wrote:

> On Mon, Oct 2, 2017 at 6:53 AM, Sebastian Pop 
> wrote:
> > The patch moves the code that translates reads and writes to isl
> representation
> > in a same loop.  This is to avoid traversing the scop blocks and arrays
> with
> > memory operations 3 times.
>
> LGTM.
>

Richard, could you please commit this patch, as I will need to figure out
why my
ssh keys don't let me to commit the code.  I will probably need to update
the key.

Thanks,
Sebastian


Re: [PATCH][GRAPHITE] Rewrite PHI handling in code-gen

2017-10-05 Thread Sebastian Pop
On Thu, Oct 5, 2017 at 9:20 AM, Sebastian Pop  wrote:
>
> We also need to tag commutative and associative reductions
> in the dependence graph.  Now that the code generation will
> nicely handle scalar dependences, we may want to add back
> some of the code from this commit:
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=228255
>
>
The above patch is for tagging the assoc/comm reductions.
Here is the patch that removes reduction dependences
that can be ignored to compute a new schedule:
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=228530


tree-complex.c: fix some_nonzerop test over reals (and a bug fix)

2017-10-05 Thread Laurent Thevenoux
Hello, 

This patch improves the some_nonzerop(tree t) function from tree-complex.c file 
(the function is only used there). 

This function returns true if a tree as a parameter is not the constant 0 (or 
+0.0 only for reals when !flag_signed_zeros ). The former result is then used 
to determine if some simplifications are possible for complex expansions 
(addition, multiplication, and division). 

Unfortunately, if the tree is a real constant, the function always return true, 
even for +0.0 because of the explicit test on flag_signed_zeros (so if your 
system enables signed zeros you cannot benefit from those simplifications). To 
avoid this behavior and allow complex expansion simplifications, I propose the 
following patch, which test for the sign of the real constant 0.0 instead of 
checking the flag. 

This first fix reveals a bug (thanks to c-c++-common/torture/complex-sign-add.c 
) in the simplification section of expand_complex_addition (also fixed in the 
patch). 

The patch has passed bootstrap and testing on x86_64-pc-linux-gnu . 

Best regards, 
Laurent Thévenoux 
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2191d62..a6124ce 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2017-10-05  Laurent Thévenoux  
+
+	* tree-complex.c (some_nonzerop): Adjust the way of testing real
+	constants. Allows existing simplifications of complex expansions
+	to be well done.
+	* tree-complex.c (expand_complex_addition): Bug fixing for complex
+	addition expansion simplification.
+
 2017-10-02  Bill Schmidt  
 
 	Backport from mainline
diff --git a/gcc/tree-complex.c b/gcc/tree-complex.c
index e0dd3d9..413b601 100644
--- a/gcc/tree-complex.c
+++ b/gcc/tree-complex.c
@@ -110,8 +110,11 @@ some_nonzerop (tree t)
   /* Operations with real or imaginary part of a complex number zero
  cannot be treated the same as operations with a real or imaginary
  operand if we care about the signs of zeros in the result.  */
-  if (TREE_CODE (t) == REAL_CST && !flag_signed_zeros)
-zerop = real_identical (&TREE_REAL_CST (t), &dconst0);
+  if (TREE_CODE (t) == REAL_CST)
+{
+  if (real_identical (&TREE_REAL_CST (t), &dconst0))
+	zerop = !real_isneg (&TREE_REAL_CST (t));
+}
   else if (TREE_CODE (t) == FIXED_CST)
 zerop = fixed_zerop (t);
   else if (TREE_CODE (t) == INTEGER_CST)
@@ -937,7 +940,7 @@ expand_complex_addition (gimple_stmt_iterator *gsi, tree inner_type,
 
 case PAIR (VARYING, ONLY_REAL):
   rr = gimplify_build2 (gsi, code, inner_type, ar, br);
-  ri = ai;
+  ri = bi;
   break;
 
 case PAIR (VARYING, ONLY_IMAG):


Re: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing vec align tests.

2017-10-05 Thread Richard Earnshaw (lists)
On 02/10/17 14:13, Tamar Christina wrote:
> Hi All,
> 
> Previously I had corrected the vect_hw_misalign check which prompted these
> three test to start failing because the condition needs to be inverted in the
> testcases.
> 
> Regtested on aarch64-none-elf, arm-none-linux-gnueabihf and 
> x86_64-pc-linux-gnu.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar.
> 
> gcc/testsuite/
> 2017-10-02  Tamar Christina  
> 
>   * gcc.dg/vect/vect-align-1.c: Fix vect_hw_misalign condition.
>   * gcc.dg/vect/vect-align-2.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-1.c: Likewise.
> 

OK.

R.

> 
> 8256-diff.patch
> 
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-align-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-align-1.c
> index 
> ea5ac0444fab4b6139e0e4b1019a98d92291ed4a..d56898c4d23406b4c8cc53fa1409974b6ab05485
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-align-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-align-1.c
> @@ -47,6 +47,6 @@ int main (void)
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 
> "vect" { target vect_hw_misalign } } } */
> -/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> versioning" 1 "vect" { xfail vect_hw_misalign} } } */
> +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 1 
> "vect" { target { vect_hw_misalign && { arm_vect_no_misalign } } } } } */
> +/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> versioning" 1 "vect" { target { vect_hw_misalign && arm_vect_no_misalign } } 
> } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-align-2.c 
> b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
> index 
> 200d556927446cd706f454a388d3a83a004004e2..39708648703357e9360e0b63ca7070c4c21def03
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-align-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-align-2.c
> @@ -43,5 +43,5 @@ int main (void)
>  
>  
>  /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> peeling" 0 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> versioning" 1 "vect" { xfail vect_hw_misalign} } } */
> +/* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> versioning" 1 "vect" { target { vect_hw_misalign && arm_vect_no_misalign } } 
> } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
> index 
> fd7cacb483d9cfbea6d909ba12e67544fa32a190..836fa76d7887c8f67cb634021ab3c01f08da7a70
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
> @@ -83,5 +83,5 @@ int main (void)
>  
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail { 
> vect_no_align && { ! vect_hw_misalign } } } } } */
>  /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> peeling" 2 "vect" { xfail {{ vect_no_align && { ! vect_hw_misalign } } || 
> {vect_sizes_32B_16B }}} } } */
> -/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 
> "vect" { xfail {{ vect_no_align && { ! vect_hw_misalign } } || 
> {vect_sizes_32B_16B }}} } } */
> +/* { dg-final { scan-tree-dump-times "Vectorizing an unaligned access" 4 
> "vect" { target {{ vect_no_align && { ! vect_hw_misalign } } || 
> {vect_sizes_32B_16B }}} } } */
>  
> 



Re: [GCC][PATCH][testsuite][mid-end] Fix failing slp test on aarch64 and arm.

2017-10-05 Thread Richard Earnshaw (lists)
On 02/10/17 14:20, Tamar Christina wrote:
> Hi All,
> 
> The slp vectorization test currently fails on AArch32 and AArch64
> due to it not taking into account that we do have 128 bit vectors in
> NEON. This means that two of the loops get vectorized instead of just 1.
> 
> So update the conditions to include a check for neon.
> 
> Regtested on aarch64-none-elf.
> 
> Respin of patch https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01805.html
> 
> Ok for trunk?
> 
> Thanks,
> Tamar.
> 
> gcc/testsuite/
> 2017-10-02  Tamar Christina  
> 
>   * gcc.dg/vect/slp-perm-9.c: Use vect_sizes_16B_8B.
>   * lib/target-supports.exp (vect_sizes_16B_8B): New.
> 
> gcc/doc
> 
>   * sourcebuild.texi (vect_sizes_16B_8B, vect_sizes_32B_16B): New.
> 

OK.

R.

> 
> 8221-diff.patch
> 
> 
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 
> 56e1b4eb103ab412b29d6dcd9b556515ebc2ac63..98cebf7b58798abdcaa108daadcd6273667dc785
>  100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1507,6 +1507,12 @@ Target supports conversion from @code{float} to 
> @code{unsigned int}.
>  
>  @item vect_max_reduc
>  Target supports max reduction for vectors.
> +
> +@item vect_sizes_16B_8B
> +Target supports 16- and 8-bytes vectors.
> +
> +@item vect_sizes_32B_16B
> +Target supports 32- and 16-bytes vectors.
>  @end table
>  
>  @subsubsection Thread Local Storage attributes
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c 
> b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> index 
> 4d9c11dcc476a8023b3eaac2ae76cc01bd0db182..b9b5a3b87ad031a5ab7421efce2c2b0fdf9145f3
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> @@ -54,8 +54,8 @@ int main (int argc, const char* argv[])
>return 0;
>  }
>  
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
> { {! vect_perm } || {! vect_sizes_32B_16B } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect"  { target 
> { { vect_perm } && { vect_sizes_32B_16B } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
> { {! vect_perm } || {! vect_sizes_16B_8B } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect"  { target 
> { { vect_perm } && { vect_sizes_16B_8B } } } } } */
>  /* { dg-final { scan-tree-dump-times "permutation requires at least three 
> vectors" 1 "vect" { target vect_perm_short } } } */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" 
> { target { {! vect_perm } || {! vect_sizes_32B_16B } } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" 
> { target { { vect_perm } && { vect_sizes_32B_16B } } } } } */
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 
> 57f646ce2df5bcd5619870403242e73f6e91ff77..8ad9b602d277c28a6e34942a564d2ce05da7857f
>  100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7561,6 +7561,19 @@ proc check_effective_target_vect_sizes_32B_16B { } {
>}
>  }
>  
> +# Return true if 16- and 8-bytes vectors are available.
> +
> +proc check_effective_target_vect_sizes_16B_8B { } {
> +  if { [check_avx_available]
> +   || [is-effective-target arm_neon]
> +   || [istarget aarch64*-*-*] } {
> + return 1;
> +  } else {
> +return 0;
> +  }
> +}
> +
> +
>  # Return true if 128-bits vectors are preferred even if 256-bits vectors
>  # are available.
>  
> 



Re: [PATCH] Fix PR81422[aarch64] internal compiler error: in update_equiv_regs, at ira.c:3425

2017-10-05 Thread Richard Earnshaw (lists)
On 19/09/17 20:22, Qing Zhao wrote:
> Hi,
> 
> This patch fixes the aarch64 bug 81422 
> https://gcc.gnu.org/PR81422
> 
> Before adding REG_EQUIV notes in the TLS symbol handling code,
> we should check whether the "dest" is a REG or NOT (sometimes,
> it's a SUBREG as in this bug). Only when the “dest” is a REG, the note
> will be added.
> 
> a new small testing case is added. 
> 
> this change has been tested on aarch64-unknown-linux-gnu with no regression. 
> 
> Thanks!
> 
> Qing
> 
> 
> ===
> 
> gcc/ChangeLog:
> 
>* config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
>check whether the dest is REG before adding REG_EQUIV note.
> 
> gcc/testsuite/ChangeLog:
> 
>PR target/81422
>* gcc.target/aarch64/pr81422.C: New test.
> 

Two minor nits to fix:

- ChangeLog entries should start with a capital letter (s/check/Check/).
- Please use hard tabs rather than 8 consecutive spaces (each of your
new 'if' statements).

OK with those changes.

R.

> ---
>  gcc/config/aarch64/aarch64.c   | 12 
>  gcc/testsuite/gcc.target/aarch64/pr81422.C | 15 +++
>  2 files changed, 23 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr81422.C
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a2ecd7a..6c3ef76 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -1480,7 +1480,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
> tp = gen_lowpart (mode, tp);
>  
>   emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, tp, x0)));
> - set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
> +if (REG_P (dest)) 
> +   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
>   return;
>}
>  
> @@ -1514,7 +1515,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
> }
>  
>   emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, tp, tmp_reg)));
> - set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
> +if (REG_P (dest)) 
> +   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
>   return;
>}
>  
> @@ -1555,7 +1557,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
>   gcc_unreachable ();
> }
>  
> - set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
> +if (REG_P (dest)) 
> +   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
>   return;
>}
>  
> @@ -1584,7 +1587,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
>   emit_insn (gen_tlsie_tiny_sidi (dest, imm, tp));
> }
>  
> - set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
> +if (REG_P (dest))
> +   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
>   return;
>}
>  
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr81422.C 
> b/gcc/testsuite/gcc.target/aarch64/pr81422.C
> new file mode 100644
> index 000..5bcc948
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr81422.C
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O0" } */
> +
> +struct DArray
> +{
> +__SIZE_TYPE__ length;
> +int* ptr;
> +};
> +
> +void foo35(DArray)
> +{
> +static __thread int x[5];
> +foo35({5, (int*)&x});
> +}
> +
> 



Re: [PATCH v3 7/14] D: Add D language support to GCC targets.

2017-10-05 Thread Richard Earnshaw (lists)
On 02/10/17 10:00, Iain Buclaw wrote:
> This patch add D language support to targets of GCC itself.
> 
> Changes since previous are just removing patches for untested target
> configurations, these can be re-added later on an as-per basis.
> 
> ---
> 
> 
> 07-v3-d-gcc-target.patch
> 
> 
> gcc/ChangeLog
> 
>   * gcc/Makefile.in (tm_d_file_list, tm_d_include_list,
>   TM_D_H, D_TARGET_DEF, D_TARGET_H, D_TARGET_OBJS): New variables.
>   (tm_d.h, cs-tm_d.h, default-d.o, d/d-target-hooks-def.h,
>   s-d-target-hooks-def-h): New rules.
>   (s-tm-texi): Also check timestamp on d-target.def.
>   (generated_files): Add TM_D_H and d-target-hooks-def.h.
>   (build/genhooks.o): Also depend on D_TARGET_DEF.
>   * gcc/config.gcc (tm_d_file, d_target_objs, target_has_targetdm):
>   New variables.
>   * config/aarch64/aarch64-d.c: New file.
>   * config/aarch64/aarch64-linux.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
>   Define.
>   * config/aarch64/aarch64-protos.h (aarch64_d_target_versions): New
>   prototype.
>   * config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/aarch64/t-aarch64 (aarch64-d.o): New rule.
>   * config/arm/arm-d.c: New file.
>   * config/arm/arm-protos.h (arm_d_target_versions): New prototype.
>   * config/arm/arm.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/arm/linux-eabi.h (EXTRA_TARGET_D_OS_VERSIONS): Define.
>   * config/arm/t-arm (arm-d.o): New rule.
>   * config/default-d.c: New file.
>   * config/glibc-d.c: New file.
>   * config/gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/i386/i386-d.c: New file.
>   * config/i386/i386-protos.h (ix86_d_target_versions): New prototype.
>   * config/i386/i386.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/i386/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS,
>   GNU_USER_TARGET_D_CRITSEC_SIZE): Define.
>   * config/i386/t-i386 (i386-d.o): New rule.
>   * config/kfreebsd-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/kopensolaris-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/linux-android.h (ANDROID_TARGET_D_OS_VERSIONS): Define.
>   * config/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/mips/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS): Define.
>   * config/mips/mips-d.c: New file.
>   * config/mips/mips-protos.h (mips_d_target_versions): New prototype.
>   * config/mips/mips.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/mips/t-mips (mips-d.o): New rule.
>   * config/powerpcspe/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/powerpcspe/linux64.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/powerpcspe/powerpcspe-d.c: New file.
>   * config/powerpcspe/powerpcspe-protos.h (rs6000_d_target_versions):
>   New prototype.
>   * config/powerpcspe/powerpcspe.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/powerpcspe/t-powerpcspe (powerpcspe-d.o): New rule.
>   * config/rs6000/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/rs6000/linux64.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>   * config/rs6000/rs6000-d.c: New file.
>   * config/rs6000/rs6000-protos.h (rs6000_d_target_versions): New
>   prototype.
>   * config/rs6000/t-rs6000 (rs6000-d.o): New rule.
>   * config/s390/s390-d.c: New file.
>   * config/s390/s390-protos.h (s390_d_target_versions): New prototype.
>   * config/s390/s390.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/s390/t-s390 (s390-d.o): New rule.
>   * config/sparc/sparc-d.c: New file.
>   * config/sparc/sparc-protos.h (sparc_d_target_versions): New prototype.
>   * config/sparc/sparc.h (TARGET_D_CPU_VERSIONS): Define.
>   * config/sparc/t-sparc (sparc-d.o): New rule.
>   * config/t-glibc (glibc-d.o): New rule.
>   * gcc/configure.ac (tm_d_file): New variable.
>   (tm_d_file_list, tm_d_include_list, d_target_objs): Add substitute.
>   * gcc/configure: Regenerated. 
>   * doc/tm.texi.in: Add @node for D language, and @hook for
>   TARGET_D_CPU_VERSIONS, TARGET_D_OS_VERSIONS, and TARGET_D_CRITSEC_SIZE.
>   * doc/tm.texi: Regenerated.
>   * gcc/genhooks.c: Include d/d-target.def.

I've not reviewed all of this, but a couple of observations.

1) We don't use the ARM64 in gcc.  Please change them to AArch64.
2) The documentation for TARGET_D_CRITSEC_SIZE should probably be
explicit that the size is in bytes (it's implied by the examples).  Some
hooks/macros in GCC return a size in bits, so it's best to be explicit.

I can't see any other obvious issues with the ARM and AArch64 code.

R.

> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 0bde7acf914..c8b58bb8d23 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -546,6 +546,8 @@ tm_include_list=@tm_include_list@
>  tm_defines=@tm_defines@
>  tm_p_file_list=@tm_p_file_list@
>  tm_p_include_list=@tm_p_include_list@
>

Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-10-05 Thread Jason Merrill
On Thu, Oct 5, 2017 at 6:31 AM, Martin Liška  wrote:
> As discussed 2 days ago on IRC with Jakub and Jonathan, C++ standard says 
> that having a non-return
> function with missing return statement is undefined behavior. We've got 
> -fsanitize=return check for
> that and we can in such case instrument __builtin_unreachable(). This patch 
> does that.

Great.

> And there's still some fallout:
>
> FAIL: g++.dg/cpp0x/constexpr-diag3.C  -std=c++11  (test for errors, line 7)
> FAIL: g++.dg/cpp0x/constexpr-neg3.C  -std=c++11  (test for errors, line 12)
> FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14  (test for errors, line 7)
> FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14  (test for errors, line 9)
> FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14 (test for excess errors)
>
> 1) there are causing:
>
> ./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:9:23:   in 
> constexpr expansion of ‘foo(1)’
> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:4:1: error: 
> ‘__builtin_unreachable()’ is not a constant expression
>  foo (int i)
>  ^~~
>
> Where we probably should not emit the built-in call. Am I right?

Or constexpr.c could give a more friendly diagnostic for
__builtin_unreachable.  It's correct to give a diagnostic here for
this testcase.

> FAIL: g++.dg/torture/pr34850.C   -O1   (test for warnings, line 15)
> FAIL: g++.dg/torture/pr34850.C   -O2   (test for warnings, line 15)
> FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fno-use-linker-plugin 
> -flto-partition=none   (test for warnings, line 15)
> FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fuse-linker-plugin 
> -fno-fat-lto-objects   (test for warnings, line 15)
> FAIL: g++.dg/torture/pr34850.C   -O3 -g   (test for warnings, line 15)
> FAIL: g++.dg/torture/pr34850.C   -Os   (test for warnings, line 15)
>
> 2) this test is somehow hard to fix as it requires some unreachability.

Can't we add a return statement to memset?

Also, this testcase seems to indicate a danger from this patch, like
we've seen before with bounds checking: if we have a checking path
that complains about invalid arguments and then has undefined
behavior, we optimize away the diagnostic.  Can we warn in such cases?
 We probably want to provide a way to turn off this behavior.

If we're going to enable this by default, we probably also want
-Wreturn-type on by default.

Jason


Zen tuning part 2: Increase branch_cost to 3

2017-10-05 Thread Jan Hubicka
Hi,
this patch increases branch_cost to 3.  Constant 2 is apparently copied from
bdver4 costs while core and generic use 3.  3 seems to give best results for
spec2000 and also significantly improve monte carlo benchmark from scimark.

Bootstrapped/regtested x86_64-linux, comitted.

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253448)
+++ config/i386/i386.c  (working copy)
@@ -1421,7 +1421,7 @@ struct processor_costs znver1_cost = {
  to limit number of prefetches at all, as their execution also takes some
  time).  */
   100, /* number of parallel prefetches.  */
-  2,   /* Branch cost.  */
+  3,   /* Branch cost.  */
   COSTS_N_INSNS (6),   /* cost of FADD and FSUB insns.  */
   COSTS_N_INSNS (6),   /* cost of FMUL instruction.  */
   COSTS_N_INSNS (42),  /* cost of FDIV instruction.  */


Re: [PATCH v3 7/14] D: Add D language support to GCC targets.

2017-10-05 Thread Iain Buclaw
On 5 October 2017 at 16:56, Richard Earnshaw (lists)
 wrote:
> On 02/10/17 10:00, Iain Buclaw wrote:
>> This patch add D language support to targets of GCC itself.
>>
>> Changes since previous are just removing patches for untested target
>> configurations, these can be re-added later on an as-per basis.
>>
>> ---
>>
>>
>> 07-v3-d-gcc-target.patch
>>
>>
>> gcc/ChangeLog
>>
>>   * gcc/Makefile.in (tm_d_file_list, tm_d_include_list,
>>   TM_D_H, D_TARGET_DEF, D_TARGET_H, D_TARGET_OBJS): New variables.
>>   (tm_d.h, cs-tm_d.h, default-d.o, d/d-target-hooks-def.h,
>>   s-d-target-hooks-def-h): New rules.
>>   (s-tm-texi): Also check timestamp on d-target.def.
>>   (generated_files): Add TM_D_H and d-target-hooks-def.h.
>>   (build/genhooks.o): Also depend on D_TARGET_DEF.
>>   * gcc/config.gcc (tm_d_file, d_target_objs, target_has_targetdm):
>>   New variables.
>>   * config/aarch64/aarch64-d.c: New file.
>>   * config/aarch64/aarch64-linux.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
>>   Define.
>>   * config/aarch64/aarch64-protos.h (aarch64_d_target_versions): New
>>   prototype.
>>   * config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/aarch64/t-aarch64 (aarch64-d.o): New rule.
>>   * config/arm/arm-d.c: New file.
>>   * config/arm/arm-protos.h (arm_d_target_versions): New prototype.
>>   * config/arm/arm.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/arm/linux-eabi.h (EXTRA_TARGET_D_OS_VERSIONS): Define.
>>   * config/arm/t-arm (arm-d.o): New rule.
>>   * config/default-d.c: New file.
>>   * config/glibc-d.c: New file.
>>   * config/gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/i386/i386-d.c: New file.
>>   * config/i386/i386-protos.h (ix86_d_target_versions): New prototype.
>>   * config/i386/i386.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/i386/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS,
>>   GNU_USER_TARGET_D_CRITSEC_SIZE): Define.
>>   * config/i386/t-i386 (i386-d.o): New rule.
>>   * config/kfreebsd-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/kopensolaris-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/linux-android.h (ANDROID_TARGET_D_OS_VERSIONS): Define.
>>   * config/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/mips/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS): Define.
>>   * config/mips/mips-d.c: New file.
>>   * config/mips/mips-protos.h (mips_d_target_versions): New prototype.
>>   * config/mips/mips.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/mips/t-mips (mips-d.o): New rule.
>>   * config/powerpcspe/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/powerpcspe/linux64.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/powerpcspe/powerpcspe-d.c: New file.
>>   * config/powerpcspe/powerpcspe-protos.h (rs6000_d_target_versions):
>>   New prototype.
>>   * config/powerpcspe/powerpcspe.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/powerpcspe/t-powerpcspe (powerpcspe-d.o): New rule.
>>   * config/rs6000/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/rs6000/linux64.h (GNU_USER_TARGET_D_OS_VERSIONS): Define.
>>   * config/rs6000/rs6000-d.c: New file.
>>   * config/rs6000/rs6000-protos.h (rs6000_d_target_versions): New
>>   prototype.
>>   * config/rs6000/t-rs6000 (rs6000-d.o): New rule.
>>   * config/s390/s390-d.c: New file.
>>   * config/s390/s390-protos.h (s390_d_target_versions): New prototype.
>>   * config/s390/s390.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/s390/t-s390 (s390-d.o): New rule.
>>   * config/sparc/sparc-d.c: New file.
>>   * config/sparc/sparc-protos.h (sparc_d_target_versions): New prototype.
>>   * config/sparc/sparc.h (TARGET_D_CPU_VERSIONS): Define.
>>   * config/sparc/t-sparc (sparc-d.o): New rule.
>>   * config/t-glibc (glibc-d.o): New rule.
>>   * gcc/configure.ac (tm_d_file): New variable.
>>   (tm_d_file_list, tm_d_include_list, d_target_objs): Add substitute.
>>   * gcc/configure: Regenerated.
>>   * doc/tm.texi.in: Add @node for D language, and @hook for
>>   TARGET_D_CPU_VERSIONS, TARGET_D_OS_VERSIONS, and TARGET_D_CRITSEC_SIZE.
>>   * doc/tm.texi: Regenerated.
>>   * gcc/genhooks.c: Include d/d-target.def.
>
> I've not reviewed all of this, but a couple of observations.
>
> 1) We don't use the ARM64 in gcc.  Please change them to AArch64.

I did a grep, and these are only mentioned as ARM64 in comments.

> 2) The documentation for TARGET_D_CRITSEC_SIZE should probably be
> explicit that the size is in bytes (it's implied by the examples).  Some
> hooks/macros in GCC return a size in bits, so it's best to be explicit.
>
> I can't see any other obvious issues with the ARM and AArch64 code.
>

OK, thanks.

Iain.


[PATCH] ira-color: fix allocno_priority_compare_func for qsort (PR 82395)

2017-10-05 Thread Alexander Monakov
Hello,

In ira-color.c, qsort comparator allocno_priority_compare_func lacks anti-
commutativity and can indicate A < B < A if boths allocnos satisfy
non_spilled_static_chain_regno_p.  It should fall down to following
sub-comparisons in that case.

There is another issue: the comment doesn't match the code.  We are sorting
allocnos by priority, higher first, and the comment says that allocnos
corresponding to static chain pointer register should go first. However,
the code implements the opposite ordering.

I think the comment documents the intended behavior and the code is wrong.
Thus, the patch changes comparison value to match the comment.

A similar issue was present in lra-assigns.c and was fixed by this patch:
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00893.html

Bootstrapped and regtested on x86-64, OK for trunk?

Thanks.
Alexander

PR rtl-optimization/82395
* ira-color.c (allocno_priority_compare_func): Fix comparison step
based on non_spilled_static_chain_regno_p.

diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index 22fdb88274d..31a4a8074d1 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -3005,14 +3005,13 @@ allocno_priority_compare_func (const void *v1p, const 
void *v2p)
 {
   ira_allocno_t a1 = *(const ira_allocno_t *) v1p;
   ira_allocno_t a2 = *(const ira_allocno_t *) v2p;
-  int pri1, pri2;
+  int pri1, pri2, diff;
 
   /* Assign hard reg to static chain pointer pseudo first when
  non-local goto is used.  */
-  if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a1)))
-return 1;
-  else if (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a2)))
-return -1;
+  if ((diff = (non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a2))
+  - non_spilled_static_chain_regno_p (ALLOCNO_REGNO (a1 != 0)
+return diff;
   pri1 = allocno_priorities[ALLOCNO_NUM (a1)];
   pri2 = allocno_priorities[ALLOCNO_NUM (a2)];
   if (pri2 != pri1)


Zen tuning part 3: Simplify ia32_multipass_dfa_lookahead

2017-10-05 Thread Jan Hubicka
Hi,
ia32_multipass_dfa_lookahead currently defaults to issue rate for all modern
CPUs.  It return 1 for pentiumpro and k6 but I believe it is only because no
one was interested to benchmark it and enable it.

I do not think these two exceptions are worth of the maintenance burden
especially because both of these was quite decoder bound and decoders was
not always symmetric, so DFA is most likely helping there as well.

This also fixes problem that function should return 4 for znver1.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

* i386.c (ia32_multipass_dfa_lookahead): Default to issue rate
for post-reload scheduling.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253449)
+++ config/i386/i386.c  (working copy)
@@ -30761,44 +30761,13 @@ ix86_adjust_cost (rtx_insn *insn, int de
 static int
 ia32_multipass_dfa_lookahead (void)
 {
-  switch (ix86_tune)
-{
-case PROCESSOR_PENTIUM:
-case PROCESSOR_LAKEMONT:
-  return 2;
-
-case PROCESSOR_PENTIUMPRO:
-case PROCESSOR_K6:
-  return 1;
-
-case PROCESSOR_BDVER1:
-case PROCESSOR_BDVER2:
-case PROCESSOR_BDVER3:
-case PROCESSOR_BDVER4:
-  /* We use lookahead value 4 for BD both before and after reload
-schedules. Plan is to have value 8 included for O3. */
-return 4;
-
-case PROCESSOR_CORE2:
-case PROCESSOR_NEHALEM:
-case PROCESSOR_SANDYBRIDGE:
-case PROCESSOR_HASWELL:
-case PROCESSOR_BONNELL:
-case PROCESSOR_SILVERMONT:
-case PROCESSOR_KNL:
-case PROCESSOR_KNM:
-case PROCESSOR_INTEL:
-  /* Generally, we want haifa-sched:max_issue() to look ahead as far
-as many instructions can be executed on a cycle, i.e.,
-issue_rate.  I wonder why tuning for many CPUs does not do this.  */
-  if (reload_completed)
-return ix86_issue_rate ();
-  /* Don't use lookahead for pre-reload schedule to save compile time.  */
-  return 0;
-
-default:
-  return 0;
-}
+  /* Generally, we want haifa-sched:max_issue() to look ahead as far
+ as many instructions can be executed on a cycle, i.e.,
+ issue_rate.  */
+  if (reload_completed)
+return ix86_issue_rate ();
+  /* Don't use lookahead for pre-reload schedule to save compile time.  */
+  return 0;
 }
 
 /* Return true if target platform supports macro-fusion.  */


Re: [PATCH][AArch64] Add BIC-imm and ORR-imm SIMD pattern

2017-10-05 Thread Sudi Das

Hi Steve

Sorry about this.
I am on it. I have a fix and I am running tests on it right now.

Sudi


From: Steve Ellcey 
Sent: Thursday, October 5, 2017 12:05 AM
To: Richard Earnshaw; Sudi Das; James Greenhalgh
Cc: gcc-patches@gcc.gnu.org; nd; Marcus Shawcroft
Subject: Re: [PATCH][AArch64] Add BIC-imm and ORR-imm SIMD pattern
    
On Wed, 2017-10-04 at 16:41 +0100, Richard Earnshaw (lists) wrote:
> On 02/10/17 10:05, Sudi Das wrote:
> > 
> > 2017-10-02 Sudakshina Das  
> > 
> >  * config/aarch64/aarch64-protos.h (enum simd_immediate_check): New 
> >check type
> >  for aarch64_simd_valid_immediate.
> >  (aarch64_output_simd_mov_immediate): Update prototype.
> >  (aarch64_simd_valid_immediate): Update prototype.
> > 
> >  * config/aarch64/aarch64-simd.md (orr3): modified pattern to add
> >  support for ORR-immediate.
> >  (and3): modified pattern to add support for BIC-immediate.
> > 
> >  * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): Function 
> >now checks
> >  for valid immediate for BIC and ORR based on new enum argument.
> >  (aarch64_output_simd_mov_immediate): Function now used to output 
> >BIC/ORR imm
> >  as well based on new enum argument.
> >  
> >  * config/aarch64/constraints.md (Do): New vector immediate constraint.
> >  (Db) : Likewise.
> > 
> >  * config/aarch64/predicates.md (aarch64_reg_or_orr_imm): New
> > predicate.
> >  (aarch64_reg_or_bic_imm): Likewise.

I think this patch is causing a bunch of test failures on aarch64.  I
had to apply the patch for PR82396 (that was reverted) in order to
build ToT GCC, but when I did that and ran the testsuite I got a bunch
of failures like:

/home/sellcey/cavium-pr-27386/src/gcc/gcc/testsuite/gcc.c-
torture/compile/pr54713-1.c:45:18: internal compiler error: in
aarch64_simd_valid_immediate, at config/aarch64/aarch64.c:11539
0xf2227b aarch64_simd_valid_immediate(rtx_def*, machine_mode, bool,
simd_immediate_info*, simd_immediate_check)
../../../src/gcc/gcc/config/aarch64/aarch64.c:11539
0x11047b3 aarch64_reg_or_bic_imm(rtx_def*, machine_mode)
../../../src/gcc/gcc/config/aarch64/predicates.md:79
0xab29ab insn_operand_matches(insn_code, unsigned int, rtx_def*)
../../../src/gcc/gcc/optabs.c:6891
0xab29ab maybe_legitimize_operand_same_code
../../../src/gcc/gcc/optabs.c:6919
0xab545f maybe_legitimize_operand
../../../src/gcc/gcc/optabs.c:6990
0xab545f maybe_legitimize_operands(insn_code, unsigned int, unsigned
int, expand_operand*)
../../../src/gcc/gcc/optabs.c:7055
0xab5a8f maybe_gen_insn(insn_code, unsigned int, expand_operand*)
../../../src/gcc/gcc/optabs.c:7073
0xab8503 expand_binop_directly
../../../src/gcc/gcc/optabs.c:1075
0xab87af expand_binop(machine_mode, optab_tag, rtx_def*, rtx_def*,
rtx_def*, int, optab_methods)
../../../src/gcc/gcc/optabs.c:1156
0x8736d7 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
../../../src/gcc/gcc/expr.c:9582
0x749027 expand_gimple_stmt_1
../../../src/gcc/gcc/cfgexpand.c:3691
0x749027 expand_gimple_stmt
../../../src/gcc/gcc/cfgexpand.c:3751
0x750387 expand_gimple_basic_block
../../../src/gcc/gcc/cfgexpand.c:5750
0x751ef7 execute
../../../src/gcc/gcc/cfgexpand.c:6357


[PATCH 2/2] C/C++: add fix-it hints for various missing symbols (v3)

2017-10-05 Thread David Malcolm
Here's a slight update to this patch, since v2 was made invalid by 
r253411 ("C: underline parameters in mismatching function calls").

Both v2 and r253411 added code to c-parser.c/h to track the location_t
of the last consumed token (although I somehow managed to name the new
field in c_parser differently between the two versions...)

This version (v3) is the same as v2, but removes the copy of the
above code, updating the usage sites to use the field name from
r253411.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu
in conjunction with patch 1 of the kit:
  https://gcc.gnu.org/ml/gcc-patches/2017-09/msg01745.html

OK for trunk?

Blurb from v2 follows:

The patch improves our C/C++ frontends' handling of missing
symbols, by making c_parser_require and cp_parser_require use
"better" locations for the diagnostic, and insert fix-it hints,
under certain circumstances (see the comments in the patch for
full details).

For example, for this code with a missing semicolon:

  $ cat test.c
  int missing_semicolon (void)
  {
return 42
  }

  trunk currently emits:

  test.c:4:1: error: expected ';' before '}' token
   }
   ^

This patch adds a fix-it hint for the missing semicolon, and puts
the error at the location of the missing semicolon, printing the
followup token as a secondary location:

  test.c:3:12: error: expected ';' before '}' token
 return 42
  ^
  ;
   }
   ~

More examples can be seen in the test cases.

This is a revised version of the patch I posted here:
  https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00135.html

Some of the changes in that patch landed in trunk in r251026
(aka 3fe34694f0990d1d649711ede0326497f8a849dc
"C/C++: show pertinent open token when missing a close token"),
so this patch contains the remaining part, updated also for the
previous patch that reunifies the cloned copies of cp_parser_error
introduced in r251026.

It also:
- fixes the typo seen by Jeff
- eliminated some unnecessary changes to c-c++-common/missing-symbol.c
- fixes some bugs

r250133, r250134, and r251026 already incorporated the suggestion from
Richard Sandiford to consolidate note-printing when the matching location
is near the primary location of the diagnostic.

This patch doesn't address Joseph's requests to tackle PR 7356 and
PR 18248, but he said that it was OK to leave these for followups.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu
in conjunction with patch 1 of the kit.

OK for trunk?

gcc/c-family/ChangeLog:
* c-common.c (enum missing_token_insertion_kind): New enum.
(get_missing_token_insertion_kind): New function.
(maybe_suggest_missing_token_insertion): New function.
* c-common.h (maybe_suggest_missing_token_insertion): New decl.

gcc/c/ChangeLog:
* c-parser.c (c_parser_require): Add "type_is_unique" param and
use it to guard calls to maybe_suggest_missing_token_insertion.
(c_parser_parms_list_declarator): Override default value of new
"type_is_unique" param to c_parser_require.
(c_parser_asm_statement): Likewise.
* c-parser.h (c_parser_require): Add "type_is_unique" param,
defaulting to true.

gcc/cp/ChangeLog:
* parser.c (get_required_cpp_ttype): New function.
(cp_parser_error_1): Call it, using the result to call
maybe_suggest_missing_token_insertion.

gcc/testsuite/ChangeLog:
* c-c++-common/cilk-plus/AN/parser_errors.c: Update expected
output to reflect changes to reported locations of missing
symbols.
* c-c++-common/cilk-plus/AN/parser_errors2.c: Likewise.
* c-c++-common/cilk-plus/AN/parser_errors3.c: Likewise.
* c-c++-common/cilk-plus/AN/pr61191.c: Likewise.
* c-c++-common/gomp/pr63326.c: Likewise.
* c-c++-common/missing-close-symbol.c: Likewise, also update for
new fix-it hints.
* c-c++-common/missing-symbol.c: Likewise, also add test coverage
for missing colon in ternary operator.
* g++.dg/cpp1y/digit-sep-neg.C: Likewise.
* g++.dg/cpp1y/pr65202.C: Likewise.
* g++.dg/missing-symbol-2.C: New test case.
* g++.dg/other/do1.C: Update expected output to reflect
changes to reported locations of missing symbols.
* g++.dg/parse/error11.C: Likewise.
* g++.dg/template/error11.C: Likewise.
* gcc.dg/missing-symbol-2.c: New test case.
* gcc.dg/missing-symbol-3.c: New test case.
* gcc.dg/noncompile/940112-1.c: Update expected output to reflect
changes to reported locations of missing symbols.
* gcc.dg/noncompile/971104-1.c: Likewise.
* obj-c++.dg/exceptions-6.mm: Likewise.
* obj-c++.dg/pr48187.mm: Likewise.
* objc.dg/exceptions-6.m: Likewise.
---
 gcc/c-family/c-common.c| 158 +
 gcc/c-family/c-common.h|   3 +
 gcc/c/c-parser.c   

[PATCH] Print variadic C-functions properly in diagnostics

2017-10-05 Thread Bernd Edlinger
Hi!

This fixes the c-pretty-printing of variadic function types by
adding ", ..." to the output of the function parameters.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
c-family:
2017-10-05  Bernd Edlinger  

	* c-pretty-print.c (pp_c_parameter_type_list): Print ... for variadic
	functions.

testsuite:
2017-10-05  Bernd Edlinger  

	* gcc.dg/Wincompatible-pointer-types-1.c: New test.


--- gcc/c-family/c-pretty-print.c	2017-08-07 10:37:07.0 +0200
+++ gcc/c-family/c-pretty-print.c	2017-10-04 12:38:09.412750910 +0200
@@ -521,6 +521,11 @@ pp_c_parameter_type_list (c_pretty_print
 	  else
 	pp->abstract_declarator (TREE_VALUE (parms));
 	}
+  if (!first && !parms)
+	{
+	  pp_separate_with (pp, ',');
+	  pp_c_ws_string (pp, "...");
+	}
 }
   pp_c_right_paren (pp);
 }
--- /dev/null	2017-09-18 13:50:08.343098047 +0200
+++ gcc/testsuite/gcc.dg/Wincompatible-pointer-types-1.c	2017-10-04 18:05:24.172717734 +0200
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-pedantic-errors" } */
+
+void f (int, ...);
+
+int
+f1 (void)
+{
+  int (*x) ();
+  x = f; /* { dg-error "assignment to 'int \\(\\*\\)\\(\\)' from incompatible pointer type 'void \\(\\*\\)\\(int,  \.\.\.\\)'" } */
+  return x (1);
+}


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Martin Sebor

On 10/03/2017 01:33 PM, Bernd Edlinger wrote:

Hi!

I have implemented a warning -Wcast-function-type that analyzes
type casts which change the function signatures.

I would consider function pointers with different result type
invalid, also if both function types have a non-null TYPE_ARG_TYPES
I would say this deserves a warning.  As an exception I have
used for instance in recog.h, the warning allows casting
a function with the type typedef rtx (*stored_funcptr) (...);
to any function with the same result type.

I would think a warning like that should be enabled with -Wextra.

Attached is a first version of the patch and as you can see
the warning found already lots of suspicious type casts.  The worst
is the splay-tree which always calls functions with uintptr_t
instead of the correct parameter type.  I was unable to find
a solution for this, and just silenced the warning with a
second type-cast.

Note that I also changed one line in libgo, but that is only
a quick hack which I only did to make the boot-strap with
all languages succeed.

I'm not sure if this warning may be a bit too strict, but I think
so far it just triggered on rather questionable code.

Thoughts?


My initial thought is that although casts between incompatible
function types undoubtedly mask bugs, the purpose of such casts
is to make such conversions possible.  Indiscriminately diagnosing
them would essentially eliminate this feature of the type system.
Having to add another cast to a different type(*) to suppress
the warning when such a conversion is intended doesn't seem like
a good solution (the next logical step to find bugs in those
conversions would then be to add another warning to see through
those additional casts, and so on).

With that, my question is: under what circumstances does the
warning not trigger on a cast to a function of an incompatible
type?

In my (very quick) tests the warning appears to trigger on all
strictly incompatible conversions, even if they are otherwise
benign, such as:

  int f (const int*);
  typedef int F (int*);

  F* pf1 = f;// -Wincompatible-pointer-types
  F* pf2 = (F*)f;// -Wcast-function-type

Likewise by:

  int f (signed char);
  typedef int F (unsigned char);

  F* pf = (F*)f;// -Wcast-function-type

I'd expect these conversions to be useful and would view warning
for them with -Wextra as excessive.  In fact, I'm not sure I see
the benefit of warning on these casts under any circumstances.

Similarly, for casts between pointers to the same integer type
with a different sign, or those involving ILP32/LP64 portability
issues I'd expect the warning not to trigger unless requested
(e.g., by some other option targeting those issues).

So based on these initial observations and despite the bugs it
may have uncovered, I share your concern that the warning in
its present form is too strict to be suitable for -Wextra, or
possibly even too noisy to be of practical use on its own and
outside of -Wextra.

Out of curiosity, have you done any tests on other code bases
besides GCC to see how many issues (true and false positives)
it finds?

Martin

[*] Strictly speaking, the semantics of casting a function
pointer to intptr_t aren't necessarily well-defined.  Only void*
can be portably converted to intptr_t and back, and only object
pointers are required to be convertible to void* and back.  And
although GCC defines the semantics of these conversions, forcing
programs to abandon a well defined language feature in favor of
one that's not as cleanly specified would undermine the goal
the warning is meant to achieve.


Fix PR ada/82393

2017-10-05 Thread Eric Botcazou
It's a build failure on Cygwin64 and is very similar to PR ada/64640, which 
was a build failure on Cygwin32.  Instead of defining non-existent constants, 
let's just reuse the DJGPP version of __gnat_set_mode here.

Applied on the mainline and 7 branch.


2017-10-05  Eric Botcazou  

PR ada/82393
* mingw32.h (_O_U8TEXT, _O_U16TEXT, _O_WTEXT): Delete.
* sysdep.c (__gnat_set_mode ): Use DJGPP version for Cygwin.

-- 
Eric BotcazouIndex: mingw32.h
===
--- mingw32.h	(revision 253398)
+++ mingw32.h	(working copy)
@@ -59,16 +59,6 @@
 #endif
 #include 
 
-#ifndef _O_U8TEXT
-#define _O_U8TEXT _O_TEXT
-#endif
-#ifndef _O_U16TEXT
-#define _O_U16TEXT _O_TEXT
-#endif
-#ifndef _O_WTEXT
-#define _O_WTEXT _O_TEXT
-#endif
-
 /* After including this file it is possible to use the character t as prefix
to routines. If GNAT_UNICODE_SUPPORT is defined then the unicode enabled
versions will be used.  */
Index: sysdep.c
===
--- sysdep.c	(revision 253398)
+++ sysdep.c	(working copy)
@@ -126,7 +126,7 @@ extern struct tm *localtime_r(const time
 
 */
 
-#if defined (WINNT) || defined (__CYGWIN__) || defined(__DJGPP__)
+#if defined (WINNT) || defined (__CYGWIN__) || defined (__DJGPP__)
 
 const char __gnat_text_translation_required = 1;
 
@@ -137,7 +137,7 @@ const char __gnat_text_translation_requi
 #define WIN_SETMODE _setmode
 #endif
 
-#if defined(__DJGPP__)
+#if defined (__DJGPP__)
 #include 
 #define _setmode setmode
 #endif /* __DJGPP__ */
@@ -154,7 +154,7 @@ __gnat_set_text_mode (int handle)
   WIN_SETMODE (handle, O_TEXT);
 }
 
-#ifdef __DJGPP__
+#if defined (__CYGWIN__) || defined (__DJGPP__)
 void
 __gnat_set_mode (int handle, int mode)
 {
@@ -826,7 +826,7 @@ __gnat_localtime_tzoff (const time_t *ti
 
 #elif defined (__APPLE__) || defined (__FreeBSD__) || defined (__linux__) \
   || defined (__GLIBC__) || defined (__DragonFly__) || defined (__OpenBSD__) \
-  || defined(__DJGPP__)
+  || defined (__DJGPP__)
 {
   localtime_r (timer, &tp);
   *off = tp.tm_gmtoff;


libbacktrace patch committed: Minor decompression improvement

2017-10-05 Thread Ian Lance Taylor
I've committed a patch to libbacktrace to speed up decompression a few
percent by loading 32-bit values rather than 8-bit bytes.
Bootstrapped and ran libbacktrace and Go tests on x86_64-pc-linux-gnu.
Committed to mainline.

Ian

2017-10-05  Ian Lance Taylor  

* elf.c (elf_zlib_fetch): Change pval argument to uint64_t *.
Read a four byte integer.
(elf_zlib_inflate): Change val to uint64_t.  Align pin to a 32-bit
boundary before ever calling elf_zlib_fetch.
* ztest.c (test_large): Simplify print statements a bit.
Index: elf.c
===
--- elf.c   (revision 253376)
+++ elf.c   (working copy)
@@ -1031,11 +1031,12 @@ elf_zlib_failed(void)
 
 static int
 elf_zlib_fetch (const unsigned char **ppin, const unsigned char *pinend,
-   uint32_t *pval, unsigned int *pbits)
+   uint64_t *pval, unsigned int *pbits)
 {
   unsigned int bits;
   const unsigned char *pin;
-  uint32_t val;
+  uint64_t val;
+  uint32_t next;
 
   bits = *pbits;
   if (bits >= 15)
@@ -1043,20 +1044,25 @@ elf_zlib_fetch (const unsigned char **pp
   pin = *ppin;
   val = *pval;
 
-  if (unlikely (pinend - pin < 2))
+  if (unlikely (pinend - pin < 4))
 {
   elf_zlib_failed ();
   return 0;
 }
-  val |= pin[0] << bits;
-  val |= pin[1] << (bits + 8);
-  bits += 16;
-  pin += 2;
-
-  /* We will need the next two bytes soon.  We ask for high temporal
- locality because we will need the whole cache line soon.  */
-  __builtin_prefetch (pin, 0, 3);
-  __builtin_prefetch (pin + 1, 0, 3);
+
+  /* We've ensured that PIN is aligned.  */
+  next = *(const uint32_t *)pin;
+
+#if __BYTE_ORDER == __ORDER_BIG_ENDIAN
+  next = __builtin_bswap32 (next);
+#endif
+
+  val |= (uint64_t)next << bits;
+  bits += 32;
+  pin += 4;
+
+  /* We will need the next four bytes soon.  */
+  __builtin_prefetch (pin, 0, 0);
 
   *ppin = pin;
   *pval = val;
@@ -1566,7 +1572,7 @@ elf_zlib_inflate (const unsigned char *p
   poutend = pout + sout;
   while ((pinend - pin) > 4)
 {
-  uint32_t val;
+  uint64_t val;
   unsigned int bits;
   int last;
 
@@ -1601,10 +1607,19 @@ elf_zlib_inflate (const unsigned char *p
}
   pin += 2;
 
-  /* Read blocks until one is marked last.  */
+  /* Align PIN to a 32-bit boundary.  */
 
   val = 0;
   bits = 0;
+  while uintptr_t) pin) & 3) != 0)
+   {
+ val |= (uint64_t)*pin << bits;
+ bits += 8;
+ ++pin;
+   }
+
+  /* Read blocks until one is marked last.  */
+
   last = 0;
 
   while (!last)
@@ -1671,6 +1686,14 @@ elf_zlib_inflate (const unsigned char *p
  pout += len;
  pin += len;
 
+ /* Align PIN.  */
+ while uintptr_t) pin) & 3) != 0)
+   {
+ val |= (uint64_t)*pin << bits;
+ bits += 8;
+ ++pin;
+   }
+
  /* Go around to read the next block.  */
  continue;
}
Index: ztest.c
===
--- ztest.c (revision 253377)
+++ ztest.c (working copy)
@@ -432,9 +432,9 @@ test_large (struct backtrace_state *stat
   ctime = average_time (ctimes, trials);
   ztime = average_time (ztimes, trials);
 
-  printf ("backtrace time: %zu ns\n", ctime);
-  printf ("zlib time:: %zu ns\n", ztime);
-  printf ("percentage: %g\n", (double) ztime / (double) ctime);
+  printf ("backtrace: %zu ns\n", ctime);
+  printf ("zlib : %zu ns\n", ztime);
+  printf ("ratio: %g\n", (double) ztime / (double) ctime);
 
   return;
 


Re: Zen tuning part 2: Increase branch_cost to 3

2017-10-05 Thread Joseph Myers
Should mention PR target/80313 (don't know if this is a complete fix, but 
it's at least an issue mentioned in that bug).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Print variadic C-functions properly in diagnostics

2017-10-05 Thread Joseph Myers
On Thu, 5 Oct 2017, Bernd Edlinger wrote:

> Hi!
> 
> This fixes the c-pretty-printing of variadic function types by
> adding ", ..." to the output of the function parameters.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix PR81422[aarch64] internal compiler error: in update_equiv_regs, at ira.c:3425

2017-10-05 Thread Qing Zhao
HI, Richard,

> On Oct 5, 2017, at 9:53 AM, Richard Earnshaw (lists) 
>  wrote:
> 
> Two minor nits to fix:
> 
> - ChangeLog entries should start with a capital letter (s/check/Check/).
> - Please use hard tabs rather than 8 consecutive spaces (each of your
> new 'if' statements).
> 
> OK with those changes.
> 
> R.

Per your comments, my updated patch is following:

since I don’t have the SVN write permission, if the change is good, could you 
help me to check them in?

thanks a lot.

Qing

===

gcc/ChangeLog:

  * config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
  Check whether the dest is REG before adding REG_EQUIV note.

gcc/testsuite/ChangeLog:

  PR target/81422
  * gcc.target/aarch64/pr81422.C: New test.

---
 gcc/config/aarch64/aarch64.c   | 12 
 gcc/testsuite/gcc.target/aarch64/pr81422.C | 15 +++
 2 files changed, 23 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr81422.C

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ee98a1f..181f66a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1490,7 +1490,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
  tp = gen_lowpart (mode, tp);
 
emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, tp, x0)));
-   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
+   if (REG_P (dest))
+ set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
return;
   }
 
@@ -1524,7 +1525,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
  }
 
emit_insn (gen_rtx_SET (dest, gen_rtx_PLUS (mode, tp, tmp_reg)));
-   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
+   if (REG_P (dest))
+ set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
return;
   }
 
@@ -1565,7 +1567,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
gcc_unreachable ();
  }
 
-   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
+   if (REG_P (dest))
+ set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
return;
   }
 
@@ -1594,7 +1597,8 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
emit_insn (gen_tlsie_tiny_sidi (dest, imm, tp));
  }
 
-   set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
+   if (REG_P (dest))
+ set_unique_reg_note (get_last_insn (), REG_EQUIV, imm);
return;
   }
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr81422.C 
b/gcc/testsuite/gcc.target/aarch64/pr81422.C
new file mode 100644
index 000..5bcc948
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr81422.C
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O0" } */
+
+struct DArray
+{
+__SIZE_TYPE__ length;
+int* ptr;
+};
+
+void foo35(DArray)
+{
+static __thread int x[5];
+foo35({5, (int*)&x});
+}
+
-- 
1.9.1



Re: [PATCH] Fix PR80295[aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-10-05 Thread Richard Earnshaw (lists)
On 25/09/17 17:35, Qing Zhao wrote:
> Hi, Andreas,
> 
> thanks for the comment. 
> 
>> GNU style is line break before the operator, not after.
> 
> updated per your comment.
> 
> Qing.
> 
> ---
>  gcc/config/aarch64/aarch64.c   | 12 +---
>  gcc/config/aarch64/aarch64.h   |  2 +-
>  gcc/config/aarch64/aarch64.md  |  6 +++---
>  gcc/testsuite/gcc.target/aarch64/pr80295.c |  8 
>  4 files changed, 21 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr80295.c
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 6c3ef76..ff0890d 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3693,7 +3693,9 @@ aarch64_expand_prologue (void)
>  stack_pointer_rtx,
>  GEN_INT (callee_offset)));
>RTX_FRAME_RELATED_P (insn) = 1;
> -  emit_insn (gen_stack_tie (stack_pointer_rtx, hard_frame_pointer_rtx));
> +  emit_insn (TARGET_ILP32 
> + ? gen_stack_tiesi (stack_pointer_rtx, 
> hard_frame_pointer_rtx)
> + : gen_stack_tiedi (stack_pointer_rtx, 
> hard_frame_pointer_rtx));
>  }
>  
>aarch64_save_callee_saves (DImode, callee_offset, R0_REGNUM, R30_REGNUM,
> @@ -3750,7 +3752,9 @@ aarch64_expand_epilogue (bool for_sibcall)
>if (final_adjust > crtl->outgoing_args_size || cfun->calls_alloca
>|| crtl->calls_eh_return)
>  {
> -  emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
> +  emit_insn (TARGET_ILP32 
> + ? gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx)
> + : gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
>need_barrier_p = false;
>  }
>  
> @@ -3774,7 +3778,9 @@ aarch64_expand_epilogue (bool for_sibcall)
>   callee_adjust != 0, &cfi_ops);
>  
>if (need_barrier_p)
> -emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
> +emit_insn (TARGET_ILP32 
> +   ? gen_stack_tiesi (stack_pointer_rtx, stack_pointer_rtx)
> +   : gen_stack_tiedi (stack_pointer_rtx, stack_pointer_rtx));
>  
>if (callee_adjust != 0)
>  aarch64_pop_regs (reg1, reg2, callee_adjust, &cfi_ops);
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 8fada9e..df58442 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -782,7 +782,7 @@ typedef struct
>  /* Specify the machine mode that the hardware addresses have.
> After generation of rtl, the compiler makes no further distinction
> between pointers and any other objects of this machine mode.  */
> -#define PmodeDImode
> +#define Pmode(TARGET_ILP32 ? SImode : DImode)

This is wrong.  AArch64 has both ptr_mode and Pmode.  Pmode must always
be DImode as (internally) all addresses must be 64-bit.  ptr_mode
reflects the ABI choice of 32/64-bit language level addresses.  The
register SP must always be a 64-bit value, even when all the top 32 bits
are zero.

R.

>  
>  /* A C expression whose value is zero if pointers that need to be extended
> from being `POINTER_SIZE' bits wide to `Pmode' are sign-extended and
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index bb7f2c0..30853b2 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -5533,10 +5533,10 @@
>[(set_attr "type" "call")
> (set_attr "length" "16")])
>  
> -(define_insn "stack_tie"
> +(define_insn "stack_tie"
>[(set (mem:BLK (scratch))
> - (unspec:BLK [(match_operand:DI 0 "register_operand" "rk")
> -  (match_operand:DI 1 "register_operand" "rk")]
> + (unspec:BLK [(match_operand:GPI 0 "register_operand" "rk")
> +  (match_operand:GPI 1 "register_operand" "rk")]
>   UNSPEC_PRLG_STK))]
>""
>""
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr80295.c 
> b/gcc/testsuite/gcc.target/aarch64/pr80295.c
> new file mode 100644
> index 000..b3866d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr80295.c
> @@ -0,0 +1,8 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mabi=ilp32" } */
> +
> +void f (void *b) 
> +{ 
> +  __builtin_update_setjmp_buf (b); 
> +}
> +
> 



Re: [PATCH][RFC] Instrument function exit with __builtin_unreachable in C++.

2017-10-05 Thread Martin Liška

On 10/05/2017 05:07 PM, Jason Merrill wrote:

On Thu, Oct 5, 2017 at 6:31 AM, Martin Liška  wrote:

As discussed 2 days ago on IRC with Jakub and Jonathan, C++ standard says that 
having a non-return
function with missing return statement is undefined behavior. We've got 
-fsanitize=return check for
that and we can in such case instrument __builtin_unreachable(). This patch 
does that.


Great.


And there's still some fallout:

FAIL: g++.dg/cpp0x/constexpr-diag3.C  -std=c++11  (test for errors, line 7)
FAIL: g++.dg/cpp0x/constexpr-neg3.C  -std=c++11  (test for errors, line 12)
FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14  (test for errors, line 7)
FAIL: g++.dg/cpp1y/constexpr-return2.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14  (test for errors, line 9)
FAIL: g++.dg/cpp1y/pr63996.C  -std=c++14 (test for excess errors)

1) there are causing:

./xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:9:23:   in 
constexpr expansion of ‘foo(1)’
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp1y/pr63996.C:4:1: error: 
‘__builtin_unreachable()’ is not a constant expression
  foo (int i)
  ^~~

Where we probably should not emit the built-in call. Am I right?


Or constexpr.c could give a more friendly diagnostic for
__builtin_unreachable.  It's correct to give a diagnostic here for
this testcase.


Hi.

That's good idea, any suggestion different from:

./xg++ -B. /home/marxin/Programming/gcc2/gcc/testsuite/g++.dg/cpp1y/pr63996.C
/home/marxin/Programming/gcc2/gcc/testsuite/g++.dg/cpp1y/pr63996.C:9:23:   in 
constexpr expansion of ‘foo(1)’
: error: constexpr can't contain call of a non-return function 
‘__builtin_unreachable’

which is probably misleading as it points to a function call that is actually 
missing in source code.
Should we distinguish between implicit and explicit __builtin_unreachable?




FAIL: g++.dg/torture/pr34850.C   -O1   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -O3 -g   (test for warnings, line 15)
FAIL: g++.dg/torture/pr34850.C   -Os   (test for warnings, line 15)

2) this test is somehow hard to fix as it requires some unreachability.


Can't we add a return statement to memset?


Does not help :)



Also, this testcase seems to indicate a danger from this patch, like
we've seen before with bounds checking: if we have a checking path
that complains about invalid arguments and then has undefined
behavior, we optimize away the diagnostic.  Can we warn in such cases?


Good question, can you please provide more info how that happens? Do we have an 
example for that?


  We probably want to provide a way to turn off this behavior.

If we're going to enable this by default, we probably also want
-Wreturn-type on by default.


Agree.

Thanks,
Martin



Jason





Re: [PATCH] Add sanopt support for UBSAN_PTR.

2017-10-05 Thread Martin Sebor

On 10/04/2017 03:05 AM, Martin Liška wrote:

Hello.

Following patch adds support for optimizing out unnecessary UBSAN_PTR checks.
It handles separately positive and negative offsets, zero offset is ignored.
Apart from that, we utilize get_inner_reference for local and global variables,
that also helps to reduce some.

Some numbers:

1) postgres:

bloaty /tmp/after2 -- /tmp/before2
 VM SIZE  FILE SIZE
 ++ GROWING++
  [ = ]   0 .debug_abbrev  +1.84Ki  +0.3%

 -- SHRINKING  --
 -30.3% -3.98Mi .text  -3.98Mi -30.3%
  [ = ]   0 .debug_info-3.69Mi -17.2%
  [ = ]   0 .debug_loc -2.02Mi -13.4%
 -43.1% -1.37Mi .data  -1.37Mi -43.1%
  [ = ]   0 .debug_ranges   -390Ki -14.3%
  [ = ]   0 .debug_line -295Ki -11.6%
  -4.0% -26.3Ki .eh_frame  -26.3Ki  -4.0%
  [ = ]   0 [Unmapped] -1.61Ki -62.3%
  [ = ]   0 .strtab-1.15Ki  -0.4%
  [ = ]   0 .symtab-1.08Ki  -0.3%
  -0.4%-368 .eh_frame_hdr -368  -0.4%
  [ = ]   0 .debug_aranges-256  -0.7%
  [DEL] -16 [None]   0  [ = ]

 -28.0% -5.37Mi TOTAL  -11.8Mi -18.8%


This looks like an impressive improvement!  FWIW, I've been
meaning to look into similar opportunities mentioned in bug
79265.

Would making use of get_range_info() make sense here to detect
and eliminate even more cases?

Just a few minor mostly stylistic suggestions.

+/* Traits class for tree triplet hash maps below.  */
+
+struct sanopt_tree_couple_hash : typed_noop_remove 
+{
...
+  static inline bool
+  equal (const sanopt_tree_couple &ref1, const sanopt_tree_couple &ref2)

Member functions defined within the body of the class are implicitly
inline so while not wrong, there is no need to declare them inline
explicitly.

Also, since mark_deleted uses reinterpret_cast (as suggested by
GCC coding conventions) it seems that is_deleted should do the
same for consistency.  Alternatively, if there isn't enough
interest/consensus to follow this guideline perhaps it should
be removed from the GCC coding conventions.  (Very few GCC code
seems to use reinterpret_cast.)


+/* Return true when pointer PTR for a given OFFSET is already sanitized
+   in a given sanitization context CTX.  */

Shouldn't the comment read CUR_OFFSET?  I ask because the function
also declares a local variable OFFSET.

+static bool
+has_dominating_ubsan_ptr_check (sanopt_ctx *ctx, tree ptr, tree cur_offset)
+{
...
+  /* We already have recorded a UBSAN_PTR check for this pointer. 
Perhaps we
+ can drop this one.  But only if this check doesn't specify larger 
offset.

+ */
+  tree offset = gimple_call_arg (g, 1);

Martin

PS It seems to me that the test could be enabled for all targets
where UBSan is supported by making use of SIZE_MAX to compute
the values of the constants instead of hardwiring LP64 values.
I noticed the test doesn't exercise member arrays.  Are those
not handled by the patch?


Left checks:
261039
Optimized out:
85643

2) tramp3d:

bloaty after -- before
 VM SIZE FILE SIZE
 ++ GROWING   ++
  +167% +30 [Unmapped]+1.01Ki   +39%

 -- SHRINKING --
 -58.5% -2.52Mi .text -2.52Mi -58.5%
 -64.2%  -574Ki .data  -574Ki -64.2%
  -5.7% -4.27Ki .eh_frame -4.27Ki  -5.7%
  -6.4% -1.06Ki .gcc_except_table -1.06Ki  -6.4%
  -7.2%-192 .bss0  [ = ]
  -0.1% -32 .rodata   -32  -0.1%
  [ = ]   0 .strtab   -29  -0.0%
  -1.1% -24 .dynsym   -24  -1.1%
  -1.5% -24 .rela.plt -24  -1.5%
  [ = ]   0 .symtab   -24  -0.1%
  -0.6% -16 .dynstr   -16  -0.6%
  -1.5% -16 .plt  -16  -1.5%
  -1.4%  -8 .got.plt   -8  -1.4%
  -0.6%  -4 .hash  -4  -0.6%
  -1.1%  -2 .gnu.version   -2  -1.1%

 -58.0% -3.09Mi TOTAL -3.08Mi -55.0%

Left checks:
31131
Optimized out:
36752

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-10-04  Martin Liska  

* sanopt.c (struct sanopt_tree_couple): New struct.
(struct sanopt_tree_couple_hash): Likewise.
(struct sanopt_ctx): Add ptr_check_map.
(has_dominating_ubsan_ptr_check): New function.
(record_ubsan_ptr_check_stmt): Likewise.
(maybe_optimize_ubsan_ptr_ifn): Likewise.
(sanopt_optimize_walker): Handle IFN_UBSAN_PTR.  Dump info
inline and newly print stmts that are left in code stream.
(pass_sanopt::execute): Handle also SANITIZE_POINTER_OVERFLOW.

gcc/testsuite/ChangeLog:

2017-10-04  Martin Liska  

* c-c++-common/ubsan/ptr-overflow-sanitization-1.c: New test.
---
 gcc/sanopt.c   | 212 -

Go patch committed: Consolidate all names in one place

2017-10-05 Thread Ian Lance Taylor
This patch to the Go frontend consolidates all external name handling
in one place: a new file names.cc.  This handles names that might
appear in the assembly code, not necessarily names that are only used
for local variables and labels and the like.

This patch does not change any of the names, it just puts all the name
code in one place.  I checked that none of the symbol names in libgo
changed when this patch was used.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

2017-10-05  Ian Lance Taylor  

* Make-lang.in (GO_OBJS): Add go/names.o.
Index: gcc/go/Make-lang.in
===
--- gcc/go/Make-lang.in (revision 253311)
+++ gcc/go/Make-lang.in (working copy)
@@ -68,6 +68,7 @@ GO_OBJS = \
go/import.o \
go/import-archive.o \
go/lex.o \
+   go/names.o \
go/parse.o \
go/runtime.o \
go/statements.o \
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 253311)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-5989ef1cd0add98f107839759a5bc57f34354d39
+048914caa26b34eebabd0423ed48ee3ac34c919c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/escape.cc
===
--- gcc/go/gofrontend/escape.cc (revision 253311)
+++ gcc/go/gofrontend/escape.cc (working copy)
@@ -575,7 +575,7 @@ debug_function_name(Named_object* fn)
 
   // Extract #.
   std::string name = Gogo::unpack_hidden_name(fn->name());
-  int closure_num = (int)strtol(name.substr(6).c_str(), NULL, 0);
+  int closure_num = Gogo::nested_function_num(fn->name());
   closure_num++;
 
   name = Gogo::unpack_hidden_name(enclosing->name());
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 253311)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1299,28 +1299,12 @@ Func_descriptor_expression::do_get_backe
 return context->backend()->var_expression(this->dvar_, VE_rvalue, loc);
 
   Gogo* gogo = context->gogo();
-  std::string var_name;
+  std::string var_name(gogo->function_descriptor_name(no));
   bool is_descriptor = false;
   if (no->is_function_declaration()
   && !no->func_declaration_value()->asm_name().empty()
   && Linemap::is_predeclared_location(no->location()))
-{
-  if (no->func_declaration_value()->asm_name().substr(0, 8) != "runtime.")
-   var_name = no->func_declaration_value()->asm_name() + "_descriptor";
-  else
-   var_name = no->func_declaration_value()->asm_name() + "$descriptor";
-  is_descriptor = true;
-}
-  else
-{
-  if (no->package() == NULL)
-   var_name = gogo->pkgpath_symbol();
-  else
-   var_name = no->package()->pkgpath_symbol();
-  var_name.push_back('.');
-  var_name.append(Gogo::unpack_hidden_name(no->name()));
-  var_name.append("$descriptor");
-}
+is_descriptor = true;
 
   Btype* btype = this->type()->get_backend(gogo);
 
@@ -4348,27 +4332,23 @@ Unary_expression::do_get_backend(Transla
}
}
 
-  static unsigned int counter;
-  char buf[100];
   if (this->is_gc_root_ || this->is_slice_init_)
{
+ std::string var_name;
  bool copy_to_heap = false;
  if (this->is_gc_root_)
{
  // Build a decl for a GC root variable.  GC roots are mutable, so
  // they cannot be represented as an immutable_struct in the
  // backend.
- static unsigned int root_counter;
- snprintf(buf, sizeof buf, "gc%u", root_counter);
- ++root_counter;
+ var_name = gogo->gc_root_name();
}
  else
{
  // Build a decl for a slice value initializer.  An immutable slice
  // value initializer may have to be copied to the heap if it
  // contains pointers in a non-constant context.
- snprintf(buf, sizeof buf, "C%u", counter);
- ++counter;
+ var_name = gogo->initializer_name();
 
  Array_type* at = this->expr_->type()->array_type();
  go_assert(at != NULL);
@@ -4379,12 +4359,12 @@ Unary_expression::do_get_backend(Transla
  // read-only, because the program is permitted to change it.
  copy_to_heap = context->function() != NULL;
}
-  std::string asm_name(go_selectively_encode_id(buf));
+ std::string asm_name(go_selectively_encode_id(var_name));
  Bvariable* implicit =
-  gogo->backend()->implicit_variable(buf, asm_name,
+  gogo->backend()->implicit_variable(var_name, asm_name,

Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-05 Thread Steve Ellcey
On Wed, 2017-10-04 at 13:24 +, Wilco Dijkstra wrote:
> Richard Sandiford wrote:
> > 
> > 
> > I don't think it's reasonable to commit this as obvious.  You said
> > yourself in the covering message that "it doesn't at all restore
> > the original behaviour since we no longer compare the base address".
> > So even with the bootstrap failure, I think the patch needed review
> > before going in.
> > 
> > Christophe's message doesn't change anything because you knew when you
> > posted the patch that it fixed the failure.
> Well my understanding was that it is OK to fix a bootstrap failure. I believe 
> my
> patch is trivial since it mostly removes redundant code. Also I took Jakub's
> review as an OK as there were no technical objections. However since you
> seem to disagree, I will revert it.
> 
> We have now had 5 days of no builds for a major target, which is a huge
> inconvenience. So I don't think it is reasonable to wait any longer.
> The alternative is to revert the original patch that caused the bootstrap 
> failure
> plus the patch(es) that unexpectedly changed the behaviour of the scheduler
> (I don't think there was any testing as to what effect those had on the 
> schedule).
> 
> So the question is who will do that and when?
> 
> Wilco

The aarch64 bootstrap is still broken.  I am adding the scheduler
maintainers to the CC list on this email to see if one on them can
review/approve Wilco's patch which was applied and then reverted.  If
not, can one of the global maintainers revert the original patch that
broke the bootstrap?

Steve Ellcey
sell...@cavium.com


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-05 Thread Maxim Kuvyrkov
> On Oct 5, 2017, at 8:20 PM, Steve Ellcey  wrote:
> 
> On Wed, 2017-10-04 at 13:24 +, Wilco Dijkstra wrote:
>> Richard Sandiford wrote:
>>> 
>>> 
>>> I don't think it's reasonable to commit this as obvious.  You said
>>> yourself in the covering message that "it doesn't at all restore
>>> the original behaviour since we no longer compare the base address".
>>> So even with the bootstrap failure, I think the patch needed review
>>> before going in.
>>> 
>>> Christophe's message doesn't change anything because you knew when you
>>> posted the patch that it fixed the failure.
>> Well my understanding was that it is OK to fix a bootstrap failure. I 
>> believe my
>> patch is trivial since it mostly removes redundant code. Also I took Jakub's
>> review as an OK as there were no technical objections. However since you
>> seem to disagree, I will revert it.
>> 
>> We have now had 5 days of no builds for a major target, which is a huge
>> inconvenience. So I don't think it is reasonable to wait any longer.
>> The alternative is to revert the original patch that caused the bootstrap 
>> failure
>> plus the patch(es) that unexpectedly changed the behaviour of the scheduler
>> (I don't think there was any testing as to what effect those had on the 
>> schedule).
>> 
>> So the question is who will do that and when?
>> 
>> Wilco
> 
> The aarch64 bootstrap is still broken.  I am adding the scheduler
> maintainers to the CC list on this email to see if one on them can
> review/approve Wilco's patch which was applied and then reverted.  If
> not, can one of the global maintainers revert the original patch that
> broke the bootstrap?

What the heck, I'll jump in as well.

I'm still working on analysis, but it appears to me that Alexander's patch 
(current state of trunk) fails qsort check due to not being symmetric for 
load/store analysis (write == 0 or write == 1) in comparisons with "irrelevant" 
instructions.  Wilco's patch does not seem to address that, and, possibly, 
makes the failure latent (I may be wrong here, it's late and I didn't finish 
analysis yet).

I'm currently bootstrapping the following patch (on aarch64-linux-gnu, 
arm-linux-gnueabihf will follow tomorrow), which (like Wilco's patch) seems to 
unbreak bootstrap, but is less invasive and preserves handling of 
multi_mem_insns.  Would please interested parties help me test it?

Comments?

--
Maxim Kuvyrkov
www.linaro.org



0001-Fix-autopref_rank_for_schedule.patch
Description: Binary data


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-05 Thread Steve Ellcey
On Thu, 2017-10-05 at 20:33 +0300, Maxim Kuvyrkov wrote:

> I'm currently bootstrapping the following patch (on aarch64-linux-
> gnu, arm-linux-gnueabihf will follow tomorrow), which (like Wilco's
> patch) seems to unbreak bootstrap, but is less invasive and preserves
> handling of multi_mem_insns.  Would please interested parties help me
> test it?
> 
> Comments?

This patch is not applying to my ToT.  The code
in autopref_rank_for_schedule doesn't seem to match what I have in my
tree.

Steve Ellcey


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Eric Gallager
On 10/5/17, Bernd Edlinger  wrote:
> On 10/05/17 02:24, Eric Gallager wrote:
>> Sorry if this is a stupid question, but could you explain how this
>> warning is different from -Wbad-function-cast? Something about direct
>> calls to functions vs. passing them as function pointers?
>
> No, it is not :)
>
> -Wbad-function-cast is IMHO a strange legacy warning.
>
> It is C-only, and it triggers only if the result of a function call
> is cast to a type with a different TREE_CODE, so for instance
> int <-> float <-> pointer.
>
> It would trigger for perfectly valid code like this:
>
> i = (int) floor (f);
>
> while we have no warning for
>
> i = floor (f);
>
> What I want to diagnose is assigning a function pointer via an explicit
> type cast to another function pointer, when there is no possible
> implicit conversion between the two function pointer types.
> Thus the cast was used to silence a warning/error in the first place.

OK, thanks for explaining! I kinda worry about warning on code that
was added to silence other warnings in the first place, as it can lead
to frustration when making a fix expecting the number of warnings when
compiling to decrease, but then they don't actually decrease. But as
long as there's a simple way to fix the warned-on code that silences
both the original warning being avoided, and this new one, I see how
it'll be useful.

>
> The idea for this warning came up when someone spotted a place in
> openssl, where a type cast was used to change the return value of a
> callback function from long to int:
>
> https://github.com/openssl/openssl/issues/4413
>
> But due to the type cast there was never ever any warning from this
> invalid type cast.
>
>
> Bernd.
>

Thanks for working on this!

Eric


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-05 Thread Wilco Dijkstra
Maxim Kuvyrkov wrote:

> What the heck, I'll jump in as well.

Why not - how many engineers does it take to sort?!?

> I'm still working on analysis, but it appears to me that Alexander's patch 
> (current state of trunk) fails qsort check due to not being symmetric for
> load/store analysis (write == 0 or write == 1) in comparisons with 
> "irrelevant"
> instructions.  Wilco's patch  does not seem to address that, and, possibly,
> makes the failure latent (I may be wrong here, it's late and I didn't finish 
> analysis yet).
>
> I'm currently bootstrapping the following patch (on aarch64-linux-gnu, 
> arm-linux-gnueabihf will follow tomorrow), which (like Wilco's patch) seems
> to unbreak bootstrap, but is less invasive and preserves handling of
> multi_mem_insns.  Would please interested  parties help me test it?

So what you're saying is that if we compare a load with a store (or a store
with a load), we should always return 0 and use the original instruction order?
This won't fix the issue of overlaps in autopref_rank_data, but I believe this
creates a new sorting order problem:

A. Load off 8
B. Store off 12
C. Load off 4

So A < B (instruction order), B < C (instruction order) but A > C (offset 
order)...

The key here is that sorting functions should only return zero if the two
objects being compared are really indistinguishable, not as a way to say they
are not comparable(!!!) and then defer to a different comparison order. It is
impossible to combine multiple different comparisons to create a total sorting
order that way. 

However it is feasible to divide things into several classes, order the classes
with respect to each other and use a different sort within each class. If you 
want
to treat loads/stores equally, that's fine, but then they end up in the same 
class
and thus you have to use a comparison that orders both loads and stores.

Wilco

Re: [PATCH][GCC][testsuite][mid-end][ARM][AARCH64] Fix failing vec align tests.

2017-10-05 Thread Rainer Orth
Hi Tamar,

> Previously I had corrected the vect_hw_misalign check which prompted these
> three test to start failing because the condition needs to be inverted in the
> testcases.
>
> Regtested on aarch64-none-elf, arm-none-linux-gnueabihf and 
> x86_64-pc-linux-gnu.
>
> Ok for trunk?
>
> Thanks,
> Tamar.
>
> gcc/testsuite/
> 2017-10-02  Tamar Christina  
>
>   * gcc.dg/vect/vect-align-1.c: Fix vect_hw_misalign condition.
>   * gcc.dg/vect/vect-align-2.c: Likewise.
>   * gcc.dg/vect/vect-multitypes-1.c: Likewise.

unfortunately, your patch caused gcc.dg/vect/vect-multitypes-1.c to FAIL
on sparc-sun-solaris2.11 (32 and 64-bit):

FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Vectorizing an unaligned access" 4
FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 4

It had XFAILed before.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [RFA] [PATCH 4/4] Ignore reads of "dead" memory locations in DSE

2017-10-05 Thread Christophe Lyon
Hi Jeff,


On 7 September 2017 at 00:18, Jeff Law  wrote:
> Another old patch getting resurrected...
>
>

This patch (r253305) introduces a new FAIL on arm-none-eabi (as
opposed arm-linux-gnueabi*):
FAIL: gcc.dg/tree-ssa/ssa-dse-26.c scan-tree-dump-times dse1
"Deleted dead store" 2

I'm not familiar with all this, but I quickly looked at the testcase,
and noticed it
uses enums.
One ABI difference between arm-non-eabi and arm-linux-gnueabi is that the
bare-metal target defaults to short-enums, while the linux one uses
no-short-enums.

Could that be the problem?

Thanks,

Christophe

> On 01/04/2017 06:50 AM, Richard Biener wrote:
>> On Thu, Dec 22, 2016 at 7:26 AM, Jeff Law  wrote:
>>> This is the final patch in the kit to improve our DSE implementation.
>>>
>>> It's based on a observation by Richi.  Namely that a read from bytes of
>>> memory that are dead can be ignored.  By ignoring such reads we can
>>> sometimes find additional stores that allow us to either eliminate or trim
>>> an earlier store more aggressively.
>>>
>>> This only hit (by hit I mean the ability to ignore resulted in finding a
>>> full or partially dead store that we didn't otherwise find) once during a
>>> bootstrap, but does hit often in the libstdc++ testsuite.  I've added a test
>>> derived from the conversation between myself and Richi last year.
>>>
>>> There's nothing in the BZ database on this issue and I can't reasonably call
>>> it a bugfix.  I wouldn't lose sleep if this deferred to gcc-8.
>>>
>>> Bootstrapped and regression tested on x86-64-linux-gnu.  OK for the trunk or
>>> defer to gcc-8?
>>>
>>>
>>>
>>> * tree-ssa-dse.c (live_bytes_read): New function.
>>> (dse_classify_store): Ignore reads of dead bytes.
>>>
>>> * testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: New test.
>>> * testsuite/gcc.dg/tree-ssa/ssa-dse-26.c: Likewise.
>>>
>>>
>>>
> [ snip ]
>
>>> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
>>> index a807d6d..f5b53fc 100644
>>> --- a/gcc/tree-ssa-dse.c
>>> +++ b/gcc/tree-ssa-dse.c
>>> @@ -475,6 +475,41 @@ maybe_trim_partially_dead_store (ao_ref *ref, sbitmap
>>> live, gimple *stmt)
>>>  }
>>>  }
>>>
>>> +/* Return TRUE if USE_REF reads bytes from LIVE where live is
>>> +   derived from REF, a write reference.
>>> +
>>> +   While this routine may modify USE_REF, it's passed by value, not
>>> +   location.  So callers do not see those modifications.  */
>>> +
>>> +static bool
>>> +live_bytes_read (ao_ref use_ref, ao_ref *ref, sbitmap live)
>>> +{
>>> +  /* We have already verified that USE_REF and REF hit the same object.
>>> + Now verify that there's actually an overlap between USE_REF and REF.
>>> */
>>> +  if ((use_ref.offset < ref->offset
>>> +   && use_ref.offset + use_ref.size > ref->offset)
>>> +  || (use_ref.offset >= ref->offset
>>> + && use_ref.offset < ref->offset + ref->size))
>>
>> can you use ranges_overlap_p? (tree-ssa-alias.h)
> Yes.  Didn't know about it.  Done.
>
>>
>>> +{
>>> +  normalize_ref (&use_ref, ref);
>>> +
>>> +  /* If USE_REF covers all of REF, then it will hit one or more
>>> +live bytes.   This avoids useless iteration over the bitmap
>>> +below.  */
>>> +  if (use_ref.offset == ref->offset && use_ref.size == ref->size)
>>> +   return true;
>>> +
>>> +  /* Now iterate over what's left in USE_REF and see if any of
>>> +those bits are i LIVE.  */
>>> +  for (int i = (use_ref.offset - ref->offset) / BITS_PER_UNIT;
>>> +  i < (use_ref.offset + use_ref.size) / BITS_PER_UNIT; i++)
>>> +   if (bitmap_bit_p (live, i))
>>
>> a bitmap_bit_in_range_p () would be nice to have.  And it can be more
>> efficient than this loop...
> Yea.  That likely would help here.  I'm testing with a
> bitmap_bit_in_range_p implementation (only for sbitmaps since that's
> what we're using here).
>
> That implementation does the reasonably efficient things and is modeled
> after the sbitmap implementation of bitmap_set_range.
>
>
>>> @@ -554,6 +589,41 @@ dse_classify_store (ao_ref *ref, gimple *stmt, gimple
>>> **use_stmt,
>>>   /* If the statement is a use the store is not dead.  */
>>>   else if (ref_maybe_used_by_stmt_p (use_stmt, ref))
>>> {
>>> + /* Handle common cases where we can easily build a ao_ref
>>> +structure for USE_STMT and in doing so we find that the
>>> +references hit non-live bytes and thus can be ignored.  */
>>> + if (live_bytes)
>>> +   {
>>> + if (is_gimple_assign (use_stmt))
>>> +   {
>>> + /* Other cases were noted as non-aliasing by
>>> +the call to ref_maybe_used_by_stmt_p.  */
>>> + ao_ref use_ref;
>>> + ao_ref_init (&use_ref, gimple_assign_rhs1 (use_stmt));
>>> + if (valid_ao_ref_for_dse (&use_ref)
>>> + && u

Re: [PATCH] Improve alloca alignment

2017-10-05 Thread Wilco Dijkstra
Richard Biener wrote:
> On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law  wrote:
> > On 10/04/2017 08:53 AM, Eric Botcazou wrote:
 This seems like a SPARC target problem to me -- essentially it's
 claiming a higher STACK_BOUNDARY than it really has.
>>>
>>> No, it is not, I can guarantee you that the stack pointer is always aligned 
>>> to
>>> 64-bit boundaries on SPARC, otherwise all hell would break loose...
>> Then something is inconsistent somehwere.  Either the stack is aligned
>> prior to that code or it is not.  If it is aligned, then Wilco's patch
>> ought to keep it aligned.  If is not properly aligned, then well, that's
>> the problem ISTM.
>>
>> Am I missing something here?
>
> What I got from the discussion and the PR is that the stack hardregister
> is properly aligned but what GCC maps to it (virtual or frame or whatever)
> might not be at all points.
>
> allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not
> sure STACK_BOUNDARY applies to it?

Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other
virtual frame registers. It appears it's main purpose is to enable alignment
optimizations since PREFERRED_STACK_BOUNDARY is used to align
local and outgoing argument area etc. So if you don't want the alignment
optimizations it is feasible to set STACK_BOUNDARY to a lower value
without changing the stack layout.

There is also STACK_DYNAMIC_OFFSET which computes the total offset
from the stack. It's not obvious whether the default version should align (since
outgoing arguments are already aligned there is no easy way to record the
extra padding), but we could assert if the offset isn't aligned.

Wilco

Re: [PATCH] C++17 P0067R5 std::to_chars and std::from_chars (partial)

2017-10-05 Thread Christophe Lyon
Hi Jonathan,

On 3 October 2017 at 16:31, Jonathan Wakely  wrote:
> On 02/10/17 15:13 +0100, Jonathan Wakely wrote:
>>
>> +#ifndef _GLIBCXX_CHARCONV
>> +#define _GLIBCXX_CHARCONV 1
>> +
>> +#pragma GCC system_header
>> +
>> +#if __cplusplus >= 201402L
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include  // for std::errc
>
>
> I forgot to mention that I've made this header work for C++14 not just
> C++17. I
> did this for similar reasons as we make some C++17 things available for
> -std=gnu++11 and -std=gnu++14:
>
> #if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11
>
> But in this case  is a completely new header so we don't have to
> limit it to -std=gnu++NN modes only. The new functions won't pollute
> namespace
> std unless the new header gets included, which strictly-conforming C++14
> code
> won't do anyway.
>

Sorry for the delay, I'm still catching up after Linaro Connect.

I've noticed that one of the new tests (20_util/to_chars/1.cc)
fails to compile on ARM and AArch64 bare-metal targets (with Newlib).
That is, arm-none-eabi and aarch64-none-elf targets.

The error message I'm seeing is:
/libstdc++-v3/testsuite/20_util/to_chars/1.cc: In function 'void test01()':
/libstdc++-v3/testsuite/20_util/to_chars/1.cc:88: error:
'std::to_string' has not been declared
In file included from /libstdc++-v3/testsuite/20_util/to_chars/1.cc:38:
/libstdc++-v3/testsuite/20_util/to_chars/1.cc:90: error: 'to_string'
was not declared in this scope
/libstdc++-v3/testsuite/util/testsuite_hooks.h:57: note: in definition
of macro 'VERIFY'

Thanks,

Christophe


Re: Let the target choose a vectorisation alignment

2017-10-05 Thread Christophe Lyon
Hi Richard,


On 18 September 2017 at 15:57, Richard Sandiford
 wrote:
> Richard Biener  writes:
>> On Mon, Sep 18, 2017 at 1:58 PM, Richard Sandiford
>>  wrote:
>>> The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
>>> there was also a hard-coded assumption that this was equal to the type
>>> size.  This was inconvenient for SVE for two reasons:
>>>
>>> - When compiling for a specific power-of-2 SVE vector length, we might
>>>   want to align to a full vector.  However, the TYPE_ALIGN is governed
>>>   by the ABI alignment, which is 128 bits regardless of size.
>>>
>>> - For vector-length-agnostic code it doesn't usually make sense to align,
>>>   since the runtime vector length might not be a power of two.  Even for
>>>   power of two sizes, there's no guarantee that aligning to the previous
>>>   16 bytes will be an improveent.
>>>
>>> This patch therefore adds a target hook to control the preferred
>>> vectoriser (as opposed to ABI) alignment.
>>>
>>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
>>> Also tested by comparing the testsuite assembly output on at least one
>>> target per CPU directory.  OK to install?
>>
>> Did you specifically choose to pass the hook a vector type rather than
>> a mode?
>
> It seemed like the safest thing to do for the default implementation,
> e.g. in case we're vectorising "without SIMD" and thus without an
> underlying vector mode.  I agree it probably doesn't make much
> difference for non-default implementations.
>
>> I suppose in peeling for alignment the target should be able to
>> prevent peeling by returning element alignment from the hook?
>
> Yeah.  This is what the SVE port does in the default vector-length
> agnostic mode, and might also make sense in fixed-length mode.
> Maybe it would be useful for other targets too, if unaligned accesses
> have a negligible penalty for them.
>

Since this was committed (r253101), I've noticed a regression
on arm-none-linux-gnueabihf:
FAIL:gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects
scan-tree-dump-times vect "Vectorizing an unaligned access" 4
FAIL:gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect
"Vectorizing an unaligned access" 4

for instance, with
--target arm-none-linux-gnueabihf
--with-mode arm
--with-cpu cortex-a9
--with-fpu neon-fp16

Thanks,

Christophe

> Thanks,
> Richard
>> Ok.
>>
>> Thanks,
>> Richard.


Re: [PATCH] C++17 P0067R5 std::to_chars and std::from_chars (partial)

2017-10-05 Thread Jonathan Wakely

On 05/10/17 22:00 +0200, Christophe Lyon wrote:

Hi Jonathan,

On 3 October 2017 at 16:31, Jonathan Wakely  wrote:

On 02/10/17 15:13 +0100, Jonathan Wakely wrote:


+#ifndef _GLIBCXX_CHARCONV
+#define _GLIBCXX_CHARCONV 1
+
+#pragma GCC system_header
+
+#if __cplusplus >= 201402L
+
+#include 
+#include 
+#include 
+#include  // for std::errc



I forgot to mention that I've made this header work for C++14 not just
C++17. I
did this for similar reasons as we make some C++17 things available for
-std=gnu++11 and -std=gnu++14:

#if __cplusplus > 201402L || !defined(__STRICT_ANSI__) // c++1z or gnu++11

But in this case  is a completely new header so we don't have to
limit it to -std=gnu++NN modes only. The new functions won't pollute
namespace
std unless the new header gets included, which strictly-conforming C++14
code
won't do anyway.



Sorry for the delay, I'm still catching up after Linaro Connect.

I've noticed that one of the new tests (20_util/to_chars/1.cc)
fails to compile on ARM and AArch64 bare-metal targets (with Newlib).
That is, arm-none-eabi and aarch64-none-elf targets.

The error message I'm seeing is:
/libstdc++-v3/testsuite/20_util/to_chars/1.cc: In function 'void test01()':
/libstdc++-v3/testsuite/20_util/to_chars/1.cc:88: error:
'std::to_string' has not been declared
In file included from /libstdc++-v3/testsuite/20_util/to_chars/1.cc:38:
/libstdc++-v3/testsuite/20_util/to_chars/1.cc:90: error: 'to_string'
was not declared in this scope
/libstdc++-v3/testsuite/util/testsuite_hooks.h:57: note: in definition
of macro 'VERIFY'


Should be fixed with the attached patch.

commit f6ea2112cdbd2c52302032b3991cef0a3af1c1b4
Author: Jonathan Wakely 
Date:   Thu Oct 5 21:14:46 2017 +0100

Fix new testsuite failure on newlib targets

* testsuite/20_util/to_chars/1.cc: Add dg-require-string-conversions.

diff --git a/libstdc++-v3/testsuite/20_util/to_chars/1.cc b/libstdc++-v3/testsuite/20_util/to_chars/1.cc
index bdd961104a2..12dfd3d49f3 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/1.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/1.cc
@@ -17,6 +17,7 @@
 
 // { dg-options "-std=gnu++17" }
 // { dg-do run { target c++17 } }
+// { dg-require-string-conversions "" }
 
 #include 
 #include 


Re: [PATCH] Fix PR82396: qsort comparator non-negative on sorted output

2017-10-05 Thread Alexander Monakov
On Thu, 5 Oct 2017, Maxim Kuvyrkov wrote:
> I'm still working on analysis, but it appears to me that Alexander's patch
> (current state of trunk) fails qsort check due to not being symmetric for
> load/store analysis (write == 0 or write == 1) in comparisons with
> "irrelevant" instructions.  Wilco's patch does not seem to address that, and,
> possibly, makes the failure latent (I may be wrong here, it's late and I
> didn't finish analysis yet).

Yes, your analysis is incomplete, it should be easy to see that for always-false
multi_mem_insn_p, autopref_rank_for_schedule implements lexicographical order.
The problem is that when multi_mem_insn_p may be true, autopref_rank_data is
not guaranteed to be transitive.

I think your patch loses transitivity in autopref_rank_for_schedule, see Wilco's
response.

FWIW, this hunk from my patch posted back on Friday is sufficient to restore
bootstrap as confirmed (again, back on Friday) by Steve.  It avoids the fancy
non-transitive comparison for qsort (but autopref_rank_data is still used in
multipref_dfa_lookahead_guard).

(I'm surprised Kyrill wasn't Cc'ed - adding him now)

Alexander

* haifa-sched.c (autopref_rank_for_schedule): Do not invoke
autopref_rank_data, order by min_offset.

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 549e8961411..cea1242e1f1 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -5725,7 +5669,7 @@ autopref_rank_for_schedule (const rtx_insn *insn1, const 
rtx_insn *insn2)
   int irrel2 = data2->status == AUTOPREF_MULTIPASS_DATA_IRRELEVANT;
 
   if (!irrel1 && !irrel2)
-   r = autopref_rank_data (data1, data2);
+   r = data1->min_offset - data2->min_offset;
   else
r = irrel2 - irrel1;
 }


[PATCH] Improve -fstore-merging for bool/enum constants (PR tree-optimization/82434)

2017-10-05 Thread Jakub Jelinek
Hi!

The following testcase fails, because can_native_encode_type_p doesn't
handle BOOLEAN_TYPE nor ENUMERAL_TYPE (while native_encode_expr handles
those just fine).
But, it isn't just those, can_native_encode_type_p doesn't really make
sense to me, since whether native_encode_expr fails or not doesn't
really depend on the expression type, but rather on what exact tcc_constant
the expression is, what size it has and some other properties of the
expression.
Instead of writing a routine similar to can_native_encode_string_p that
would handle all the cases when native_encode_expr fails, I've changed
native_encode_expr itself, so that it has a faster dry run mode, where
ptr is NULL, which doesn't store anything, but just returns what it would
return given a non-NULL ptr.
The patch then changes vectorizable_store as well as store merging to use
this to check whether native_encode_expr will be successful.

In addition to that, I've found a thinko in store merging stmt counting,
where it would unnecessarily look for 3rd non-debug stmt even when 2
stmts is what it checks after the loop.  And store merging was for some
unknown reason calling native_encode_expr with 4 arguments, while the
standard/preferred way to call it is with 3 arguments, then it verifies
whether the constant encoding can fit into the buffer etc.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-10-05  Jakub Jelinek  

PR tree-optimization/82434
* fold-const.h (can_native_encode_type_p,
can_native_encode_string_p): Remove.
* fold-const.c (native_encode_int): Formatting fixes.  If ptr is NULL,
don't encode anything, just return what would be otherwise returned.
(native_encode_fixed, native_encode_complex, native_encode_vector):
Likewise.
(native_encode_string): Likewise.  Inline by hand
can_native_encode_string_p.
(can_native_encode_type_p): Remove.
(can_native_encode_string_p): Remove.
* tree-vect-stmts.c (vectorizable_store): Instead of testing just
STRING_CSTs using can_native_encode_string_p, test all
CONSTANT_CLASS_P values using native_encode_expr with NULL ptr.
* gimple-ssa-store-merging.c (encode_tree_to_bitpos): Remove last
argument from native_encode_expr.
(rhs_valid_for_store_merging_p): Use native_encode_expr with NULL ptr.
(pass_store_merging::execute): Don't unnecessarily look for 3 stmts,
but just 2.

* gcc.dg/store_merging_9.c: New test.

--- gcc/fold-const.h.jj 2017-09-05 23:28:10.0 +0200
+++ gcc/fold-const.h2017-10-05 13:16:27.355770215 +0200
@@ -27,8 +27,6 @@ extern int folding_initializer;
 /* Convert between trees and native memory representation.  */
 extern int native_encode_expr (const_tree, unsigned char *, int, int off = -1);
 extern tree native_interpret_expr (tree, const unsigned char *, int);
-extern bool can_native_encode_type_p (tree);
-extern bool can_native_encode_string_p (const_tree);
 
 /* Fold constants as much as possible in an expression.
Returns the simplified expression.
--- gcc/fold-const.c.jj 2017-10-04 16:45:28.0 +0200
+++ gcc/fold-const.c2017-10-05 13:17:42.195863063 +0200
@@ -6982,11 +6982,15 @@ native_encode_int (const_tree expr, unsi
   int byte, offset, word, words;
   unsigned char value;
 
-  if ((off == -1 && total_bytes > len)
-  || off >= total_bytes)
+  if ((off == -1 && total_bytes > len) || off >= total_bytes)
 return 0;
   if (off == -1)
 off = 0;
+
+  if (ptr == NULL)
+/* Dry run.  */
+return MIN (len, total_bytes - off);
+
   words = total_bytes / UNITS_PER_WORD;
 
   for (byte = 0; byte < total_bytes; byte++)
@@ -7009,8 +7013,7 @@ native_encode_int (const_tree expr, unsi
}
   else
offset = BYTES_BIG_ENDIAN ? (total_bytes - 1) - byte : byte;
-  if (offset >= off
- && offset - off < len)
+  if (offset >= off && offset - off < len)
ptr[offset - off] = value;
 }
   return MIN (len, total_bytes - off);
@@ -7036,8 +7039,7 @@ native_encode_fixed (const_tree expr, un
 
   i_type = lang_hooks.types.type_for_size (GET_MODE_BITSIZE (mode), 1);
 
-  if (NULL_TREE == i_type
-  || TYPE_PRECISION (i_type) != total_bytes)
+  if (NULL_TREE == i_type || TYPE_PRECISION (i_type) != total_bytes)
 return 0;
   
   value = TREE_FIXED_CST (expr);
@@ -7065,11 +7067,15 @@ native_encode_real (const_tree expr, uns
  up to 192 bits.  */
   long tmp[6];
 
-  if ((off == -1 && total_bytes > len)
-  || off >= total_bytes)
+  if ((off == -1 && total_bytes > len) || off >= total_bytes)
 return 0;
   if (off == -1)
 off = 0;
+
+  if (ptr == NULL)
+/* Dry run.  */
+return MIN (len, total_bytes - off);
+
   words = (32 / BITS_PER_UNIT) / UNITS_PER_WORD;
 
   real_to_target (tmp, TREE_REAL_CST_PTR (expr), TYPE_MODE (type));
@@ -7123,15 +7129,14 @@ native_encode_complex (const_tree expr,
 
   part = TREE_REALPART (ex

[C PATCH] Fix -Wtautological-compare (PR c/82437)

2017-10-05 Thread Jakub Jelinek
Hi!

In warn_tautological_bitwise_comparison, there is even a comment
mentioning the fact that the types of the two constants might not be the
same (it is called with the original arguments before they are promoted
to common type for the comparison).

On the following testcase, one of the constants was unsigned (because
it has been converted to that when anded with unsigned variable), while
the other was signed and wi::to_widest (bitopcst) & wi::to_widest (cst)
wasn't sign-extended from 32-bit (because bitopcst was unsigned),
while wi::to_widest (cst) was (because cst was int), so we warned even
when we shouldn't.

The following patch instead uses the precision of the larger type and
zero extends from that (because we really don't need to care about bits
beyond that precision).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-10-05  Jakub Jelinek  

PR c/82437
* c-warn.c (warn_tautological_bitwise_comparison): Instead of
using to_widest use wide_int with the larger of the two precisions.

* c-c++-common/Wtautological-compare-6.c: New test.

--- gcc/c-family/c-warn.c.jj2017-10-02 13:46:57.0 +0200
+++ gcc/c-family/c-warn.c   2017-10-05 16:18:20.050207720 +0200
@@ -356,16 +356,24 @@ warn_tautological_bitwise_comparison (lo
 return;
 
   /* Note that the two operands are from before the usual integer
- conversions, so their types might not be the same.  */
-  widest_int res;
+ conversions, so their types might not be the same.
+ Use the larger of the two precisions and ignore bits outside
+ of that.  */
+  int prec = MAX (TYPE_PRECISION (TREE_TYPE (cst)),
+ TYPE_PRECISION (TREE_TYPE (bitopcst)));
+
+  wide_int bitopcstw = wide_int::from (bitopcst, prec, UNSIGNED);
+  wide_int cstw = wide_int::from (cst, prec, UNSIGNED);
+
+  wide_int res;
   if (TREE_CODE (bitop) == BIT_AND_EXPR)
-res = wi::to_widest (bitopcst) & wi::to_widest (cst);
+res = bitopcstw & cstw;
   else
-res = wi::to_widest (bitopcst) | wi::to_widest (cst);
+res = bitopcstw | cstw;
 
   /* For BIT_AND only warn if (CST2 & CST1) != CST1, and
  for BIT_OR only if (CST2 | CST1) != CST1.  */
-  if (res == wi::to_widest (cst))
+  if (res == cstw)
 return;
 
   if (code == EQ_EXPR)
--- gcc/testsuite/c-c++-common/Wtautological-compare-6.c.jj 2017-10-05 
16:21:41.799800708 +0200
+++ gcc/testsuite/c-c++-common/Wtautological-compare-6.c2017-10-05 
16:23:36.727429542 +0200
@@ -0,0 +1,11 @@
+/* PR c/82437 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-Wtautological-compare" } */
+
+int
+foo (unsigned int x)
+{
+  if ((x & -1879048192) != -1879048192)/* { dg-bogus "bitwise 
comparison always evaluates to" } */
+return 0;
+  return 1;
+}

Jakub


Re: [PATCH] Fix PR80295[aarch64] [7/8 Regression] ICE in __builtin_update_setjmp_buf expander

2017-10-05 Thread Qing Zhao
Richard,

thanks for your review and comments.

> On Oct 5, 2017, at 11:50 AM, Richard Earnshaw (lists) 
>  wrote:
> 
> On 25/09/17 17:35, Qing Zhao wrote:
>> 
>> --- a/gcc/config/aarch64/aarch64.h
>> +++ b/gcc/config/aarch64/aarch64.h
>> @@ -782,7 +782,7 @@ typedef struct
>> /* Specify the machine mode that the hardware addresses have.
>>After generation of rtl, the compiler makes no further distinction
>>between pointers and any other objects of this machine mode.  */
>> -#define Pmode   DImode
>> +#define Pmode   (TARGET_ILP32 ? SImode : DImode)
> 
> This is wrong.  AArch64 has both ptr_mode and Pmode.  Pmode must always
> be DImode as (internally) all addresses must be 64-bit.  ptr_mode
> reflects the ABI choice of 32/64-bit language level addresses.  The
> register SP must always be a 64-bit value, even when all the top 32 bits
> are zero.


So,  in ilp32 ABI, compared to LP64,  the only difference is:

   long and pointer size are different

all the others are the same, including:

registers size are 64 bits for both ilp32 and LP64.
stack poping and pushing by 64 bits for both ilp32 and LP64.

is the above right?

what’s the other difference between ilp32 and LP64?

thanks a lot for your help.

Qing

Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Bernd Edlinger
On 10/05/17 18:16, Martin Sebor wrote:
> On 10/03/2017 01:33 PM, Bernd Edlinger wrote:
>>
>> I'm not sure if this warning may be a bit too strict, but I think
>> so far it just triggered on rather questionable code.
>>
>> Thoughts?
> 
> My initial thought is that although casts between incompatible
> function types undoubtedly mask bugs, the purpose of such casts
> is to make such conversions possible.  Indiscriminately diagnosing
> them would essentially eliminate this feature of the type system.
> Having to add another cast to a different type(*) to suppress
> the warning when such a conversion is intended doesn't seem like
> a good solution (the next logical step to find bugs in those
> conversions would then be to add another warning to see through
> those additional casts, and so on).
> 
> With that, my question is: under what circumstances does the
> warning not trigger on a cast to a function of an incompatible
> type?
> 
> In my (very quick) tests the warning appears to trigger on all
> strictly incompatible conversions, even if they are otherwise
> benign, such as:
> 
>    int f (const int*);
>    typedef int F (int*);
> 
>    F* pf1 = f;    // -Wincompatible-pointer-types
>    F* pf2 = (F*)f;    // -Wcast-function-type
> 
> Likewise by:
> 
>    int f (signed char);
>    typedef int F (unsigned char);
> 
>    F* pf = (F*)f;    // -Wcast-function-type
> 
> I'd expect these conversions to be useful and would view warning
> for them with -Wextra as excessive.  In fact, I'm not sure I see
> the benefit of warning on these casts under any circumstances.
> 
> Similarly, for casts between pointers to the same integer type
> with a different sign, or those involving ILP32/LP64 portability
> issues I'd expect the warning not to trigger unless requested
> (e.g., by some other option targeting those issues).
> 
> So based on these initial observations and despite the bugs it
> may have uncovered, I share your concern that the warning in
> its present form is too strict to be suitable for -Wextra, or
> possibly even too noisy to be of practical use on its own and
> outside of -Wextra.
> 
> Out of curiosity, have you done any tests on other code bases
> besides GCC to see how many issues (true and false positives)
> it finds?
> 
> Martin
> 
> [*] Strictly speaking, the semantics of casting a function
> pointer to intptr_t aren't necessarily well-defined.  Only void*
> can be portably converted to intptr_t and back, and only object
> pointers are required to be convertible to void* and back.  And
> although GCC defines the semantics of these conversions, forcing
> programs to abandon a well defined language feature in favor of
> one that's not as cleanly specified would undermine the goal
> the warning is meant to achieve.

Thanks for your advice.

I did look into openssl and linux, both have plenty of
-Wcast-function-type warnings.

In the case of openssl it is lots of similar stuff
crypto/aes/aes_cfb.c:25:27: warning: cast between incompatible function 
types from 'void (*)(const unsigned char *, unsigned char *, const
AES_KEY *) {aka void (*)(const unsigned char *, unsigned char *, const 
struct aes_key_st *)}' to 'void (*)(const unsigned char *, unsigned
char *, const void *)' [-Wcast-function-type]
(block128_f) AES_encrypt);

The problem is the function is of course called with a different
signature than what is declared.  They take it somehow for granted,
that "void*" or "const void*" parameter are an alias for
any pointer or any const pointer.  Either as parameter or as return
code.

I would believe this is not well-defined by the c-standard.

But it makes the warning less useful because it would be impossible
to spot the few places where the call will actually abort at
runtime.

Then I tried to compile linux, I noticed that there is a new
warning for the alias to incompatible function.

I saw it very often, and it is always when a system call
is defined:

./include/linux/compat.h:50:18: Warnung: »compat_sys_ptrace« alias 
between functions of incompatible types »long int(compat_long_t, 
compat_long_t,  compat_long_t,  compat_long_t) {alias long int(int, 
int,  int,  int)}« and »long int(long int,  long int,  long int,  long 
int)« [-Wattributes]
   asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\

./include/linux/syscalls.h:211:18: Warnung: »sys_rt_sigprocmask« alias 
between functions of incompatible types »long int(int,  sigset_t *, 
sigset_t *, size_t) {alias long int(int,  struct  *, struct 
 *, long unsigned int)}« and »long int(long int,  long int, 
long int,  long int)« [-Wattributes]
   asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \


Needless to say that even more Wcast-function-type warning
happen.

./include/linux/timer.h:178:23: Warnung: cast between incompatible 
function types from »void (*)(struct timer_list *)« to »void (*)(long 
unsigned int)« [-Wcast-function-type]
   __setup_timer(timer, (TIMER_FUNC_TYPE)callback,

So they 

Re: [C PATCH] Fix -Wtautological-compare (PR c/82437)

2017-10-05 Thread Marek Polacek
On Thu, Oct 05, 2017 at 10:34:26PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> In warn_tautological_bitwise_comparison, there is even a comment
> mentioning the fact that the types of the two constants might not be the
> same (it is called with the original arguments before they are promoted
> to common type for the comparison).
> 
> On the following testcase, one of the constants was unsigned (because
> it has been converted to that when anded with unsigned variable), while
> the other was signed and wi::to_widest (bitopcst) & wi::to_widest (cst)
> wasn't sign-extended from 32-bit (because bitopcst was unsigned),
> while wi::to_widest (cst) was (because cst was int), so we warned even
> when we shouldn't.
> 
> The following patch instead uses the precision of the larger type and
> zero extends from that (because we really don't need to care about bits
> beyond that precision).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Looks ok, thanks.

Marek


Re: [PATCH] PR target/68256 Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64

2017-10-05 Thread vladimir . mezentsev


On 10/05/2017 07:09 AM, Richard Earnshaw (lists) wrote:
> On 04/10/17 00:50, vladimir.mezent...@oracle.com wrote:
>> From: Vladimir Mezentsev 
>>
>> Tested on aarch64-linux-gnu.
>> No regression.
>> No bootstrap failure.
>>
>> gcc/ChangeLog:
>> 2017-09-26  Vladimir Mezentsev  
>>
>> * gcc/config/aarch64/aarch64.c: restore fix in 
>> aarch64_use_blocks_for_constant_p
> Vladimir,
>
> Thanks for the patch, but I can't accept this as it stands.  The patch
> lacks a description of what is motivating the change, so I can't really
> review it sensibly.  What's the problem you are trying to address?  Why
> do you believe this is the right fix?  Do you have a test case?

 Hello Richard,
 I wanted to fix *Bug 68256*
 - Defining
TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.

Ramana Radhakrishnan  ( )  made a
workaround for this bug:
commit bc443a71dafa2e707bae4b2fa74f83b05dea37ab
Author: ramana 
Date:   Tue Nov 10 08:35:21 2015 +
Workaround PR68256 on AArch64

I removed Raman's fixes andI cannot reproduce the problem.
I used the gcc116.fsffrance.org machine:
% uname -a Linux gcc116 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28
20:45:34 UTC 2016 aarch64 aarch64 aarch64 GNU/Linux

I run
% ../gcc/configure --enable-languages=c,c++,go --enable-bootstrap
--enable-multilib
% make -j8 bootstrap

  Is it correct steps to reproduce the go bootstrap failure ?
If yes:
   There were no bootstrap comparison failure and no regression in testing.
   My suggestion is to remove Raman's workaround and close PR68256.

If no:
   What is wrong in my steps?

Thank you,
-Vladimir

>
> The ChangeLog entry doesn't help me either: restore what fix?
>
> Is the patch associated with a bugzilla report?  If so, then please
> include that in the ChangeLog in the form "PR /.
>
> Incidentally, ChangeLog entries should be complete sentences (begin with
> a capital letter and end with a full stop).
>
> R.
>
>
>> ---
>>  gcc/config/aarch64/aarch64.c | 8 +++-
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 1c14008..b377bc7 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -6193,11 +6193,9 @@ aarch64_can_use_per_function_literal_pools_p (void)
>>  static bool
>>  aarch64_use_blocks_for_constant_p (machine_mode, const_rtx)
>>  {
>> -  /* Fixme:: In an ideal world this would work similar
>> - to the logic in aarch64_select_rtx_section but this
>> - breaks bootstrap in gcc go.  For now we workaround
>> - this by returning false here.  */
>> -  return false;
>> +   /* We can't use blocks for constants when we're using a per-function
>> +  constant pool.  */
>> +  return !aarch64_can_use_per_function_literal_pools_p ();
>>  }
>>  
>>  /* Select appropriate section for constants depending
>>



Re: [C++ PATCH] Fix comment

2017-10-05 Thread Jason Merrill
On Wed, Oct 4, 2017 at 11:44 AM, Nathan Sidwell  wrote:
> On 10/04/2017 11:29 AM, Jason Merrill wrote:
>> On Wed, Oct 4, 2017 at 10:12 AM, Nathan Sidwell  wrote:
>>> In answering a question about passing non-trivial types through ..., I
>>> discovered a misleading comment.  It is NOT just like a value parm,
>>> because
>>> we don't locally copy it.
>>
>> Hmm, I think the error is in the behavior, not the comment.  :)
>
> I wouldn't disagree.

Fixing thus.
commit e7df49dcd6facccacb13888d44896dae7d3811fa
Author: Jason Merrill 
Date:   Thu Oct 5 15:12:11 2017 -0400

* call.c (convert_arg_to_ellipsis): Use the result of force_rvalue.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index bfd92882393..9d747be9d79 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7165,29 +7165,24 @@ convert_arg_to_ellipsis (tree arg, tsubst_flags_t 
complain)
   /* In a template (or ill-formed code), we can have an incomplete type
 even after require_complete_type_sfinae, in which case we don't know
 whether it has trivial copy or not.  */
-  && COMPLETE_TYPE_P (arg_type))
+  && COMPLETE_TYPE_P (arg_type)
+  && !cp_unevaluated_operand)
 {
-  /* Build up a real lvalue-to-rvalue conversion in case the
-copy constructor is trivial but not callable.  */
-  if (!cp_unevaluated_operand && CLASS_TYPE_P (arg_type))
-   force_rvalue (arg, complain);
-
   /* [expr.call] 5.2.2/7:
 Passing a potentially-evaluated argument of class type (Clause 9)
 with a non-trivial copy constructor or a non-trivial destructor
 with no corresponding parameter is conditionally-supported, with
 implementation-defined semantics.
 
-We support it as pass-by-invisible-reference to the caller's
-object.  That's different to named by-value parameters, which
-construct a copy and pass a reference to that.
+We support it as pass-by-invisible-reference, just like a normal
+value parameter.
 
 If the call appears in the context of a sizeof expression,
 it is not potentially-evaluated.  */
-  if (cp_unevaluated_operand == 0
- && (type_has_nontrivial_copy_init (arg_type)
- || TYPE_HAS_NONTRIVIAL_DESTRUCTOR (arg_type)))
+  if (type_has_nontrivial_copy_init (arg_type)
+ || TYPE_HAS_NONTRIVIAL_DESTRUCTOR (arg_type))
{
+ arg = force_rvalue (arg, complain);
  if (complain & tf_warning)
warning (OPT_Wconditionally_supported,
 "passing objects of non-trivially-copyable "
@@ -7195,6 +7190,11 @@ convert_arg_to_ellipsis (tree arg, tsubst_flags_t 
complain)
 arg_type);
  return cp_build_addr_expr (arg, complain);
}
+  /* Build up a real lvalue-to-rvalue conversion in case the
+copy constructor is trivial but not callable.  */
+  else if (CLASS_TYPE_P (arg_type))
+   force_rvalue (arg, complain);
+
 }
 
   return arg;
diff --git a/gcc/testsuite/g++.dg/ext/varargs2.C 
b/gcc/testsuite/g++.dg/ext/varargs2.C
new file mode 100644
index 000..13ddf5bb527
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/varargs2.C
@@ -0,0 +1,17 @@
+// { dg-do run }
+
+int c;
+struct X { X() {}; X(const X&) { ++c; } };
+void Foo (X, ...) {}
+void bin (X &p)
+{
+  Foo (p, p);
+}
+
+int main()
+{
+  X x;
+  bin(x);
+  if (c != 2)
+__builtin_abort();
+}


Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Joseph Myers
On Thu, 5 Oct 2017, Bernd Edlinger wrote:

> Maybe it would be good to not warn in type-casts, when they can be
> assumed to be safe, for instance
> void* <-> any pointer (parameter or result),
> uintptr_t <-> any int, any pointer (parameter or result),
> void (*) (void) and void (*) (...) <-> any function pointer.

Well, void * and uintptr_t aren't necessarily interchangable at the ABI 
level.  At least, the m68k ABI returns integers in %d0 and pointers in 
%a0; I don't know if any other ABIs have that peculiarity.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] rs6000: Fix an error message in sysv4.h

2017-10-05 Thread Segher Boessenkool
While looking at PR82411, I got the error message

error: '-msdata=-mcall' and 'eabi-linux' are incompatible

which caused some head-scratching.  Fixed thusly.  Will also backport
(to 7 and 6) later.


Segher


2017-10-05  Segher Boessenkool  

* config/rs6000/sysv4.h (SUBTARGET_OVERRIDE_OPTIONS): Correct error
message for incompatible -msdata=* and -mcall-* options.

---
 gcc/config/rs6000/sysv4.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/sysv4.h b/gcc/config/rs6000/sysv4.h
index e381ce9..37bd63f 100644
--- a/gcc/config/rs6000/sysv4.h
+++ b/gcc/config/rs6000/sysv4.h
@@ -160,7 +160,7 @@ do {
\
 {  \
   rs6000_sdata = SDATA_NONE;   \
   error ("%<%s=%s%> and %<%s-%s%> are incompatible",   \
-"-msdata", "-mcall", rs6000_sdata_name, rs6000_abi_name);  \
+"-msdata", rs6000_sdata_name, "-mcall", rs6000_abi_name);  \
 }  \
\
   targetm.have_srodata_section = rs6000_sdata == SDATA_EABI;   \
-- 
1.8.3.1



[PATCH], Add PowerPC ISA 3.0 Atomic Memory Operation functions

2017-10-05 Thread Michael Meissner
This patch adds support for most of the PowerPC ISA 3.0 Atomic Memory Operation
instructions, listed in section 4.5 of the manual.  Currently these functions
are done via extended asm.  At some point in the future, I will probably move
the inner part of the patch to use new built-in functions to replace the
extended asm.

Some of the atomic memory operations are not currently provided, due to the
complexity of the operation (some require two adjacent memory locations to be
accessed, some require 3 adjacent GPR registers).

I have checked this patch on a little endian power8 and there was no
regressions in the bootstrap or make check.  I verified that the compile only
test (amo1.c) was done during the test run.  I have verified that the runtime
test (amo2.c) does work on a power9 prototype system.  I am doing a full build
and make check on the power9 system.  Can I check these files into the trunk
assuming the power9 bootstrap/make check passes without regressions?

[gcc]
2017-10-05  Michael Meissner  

* config/rs6000/amo.h: New include file to provide ISA 3.0 atomic
memory operation instruction support.
* config.gcc (powerpc*-*-*): Include amo.h as an extra header.
(rs6000-ibm-aix[789]*): Likewise.
* doc/extend.texi (PowerPC Atomic Memory Operation Functions):
Document new functions.

[gcc/testsuite]
2017-10-05  Michael Meissner  

* gcc.target/powerpc/amo1.c: New test.
* gcc.target/powerpc/amo2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/amo.h
===
--- gcc/config/rs6000/amo.h (revision 0)
+++ gcc/config/rs6000/amo.h (revision 0)
@@ -0,0 +1,158 @@
+/* Power ISA 3.0 atomic memory operation include file.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   Contributed by Michael Meissner .
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#ifndef _AMO_H
+#define _AMO_H
+
+#if !defined(_ARCH_PWR9) || !defined(_ARCH_PPC64)
+#error "The atomic memory operations require Power 64-bit ISA 3.0"
+
+#else
+#include 
+
+/* Enumeration of the LWAT/LDAT sub-opcodes.  */
+enum _AMO_LD {
+  _AMO_LD_ADD  = 0x00, /* Fetch and Add.  */
+  _AMO_LD_XOR  = 0x01, /* Fetch and Xor.  */
+  _AMO_LD_IOR  = 0x02, /* Fetch and Ior.  */
+  _AMO_LD_AND  = 0x03, /* Fetch and And.  */
+  _AMO_LD_UMAX = 0x04, /* Fetch and Unsigned Maximum.  */
+  _AMO_LD_SMAX = 0x05, /* Fetch and Signed Maximum.  */
+  _AMO_LD_UMIN = 0x06, /* Fetch and Unsigned Minimum.  */
+  _AMO_LD_SMIN = 0x07, /* Fetch and Signed Minimum.  */
+  _AMO_LD_SWAP = 0x08, /* Swap.  */
+  _AMO_LD_CS_NE= 0x10, /* Compare and Swap Not Equal.  
*/
+  _AMO_LD_INC_BOUNDED  = 0x18, /* Fetch and Increment Bounded.  */
+  _AMO_LD_INC_EQUAL= 0x19, /* Fetch and Increment Equal.  */
+  _AMO_LD_DEC_BOUNDED  = 0x1A  /* Fetch and Decrement Bounded.  */
+};
+
+/* Implementation of the simple LWAT/LDAT operations that take one register and
+   modify one word or double-word of memory and return the value that was
+   previously in the memory location.
+
+   The LWAT/LDAT opcode requires the address to be a single register, and that
+   points to a suitably aligned memory location.
+
+   In order to indicate to the compiler that the memory location may be
+   modified, the memory location is passed both as a memory reference and the
+   address to put in a register.  */
+
+#define _AMO_LD_SIMPLE(NAME, TYPE, OPCODE, FC) \
+static __inline__ TYPE \
+NAME (TYPE *_PTR, TYPE _VALUE) \
+{  \
+  unsigned __int128 _TMP;   

Re: [RFA] [PATCH] Add a warning for invalid function casts

2017-10-05 Thread Martin Sebor

On 10/05/2017 03:04 PM, Bernd Edlinger wrote:

On 10/05/17 18:16, Martin Sebor wrote:

On 10/03/2017 01:33 PM, Bernd Edlinger wrote:


I'm not sure if this warning may be a bit too strict, but I think
so far it just triggered on rather questionable code.

Thoughts?


My initial thought is that although casts between incompatible
function types undoubtedly mask bugs, the purpose of such casts
is to make such conversions possible.  Indiscriminately diagnosing
them would essentially eliminate this feature of the type system.
Having to add another cast to a different type(*) to suppress
the warning when such a conversion is intended doesn't seem like
a good solution (the next logical step to find bugs in those
conversions would then be to add another warning to see through
those additional casts, and so on).

With that, my question is: under what circumstances does the
warning not trigger on a cast to a function of an incompatible
type?

In my (very quick) tests the warning appears to trigger on all
strictly incompatible conversions, even if they are otherwise
benign, such as:

   int f (const int*);
   typedef int F (int*);

   F* pf1 = f;// -Wincompatible-pointer-types
   F* pf2 = (F*)f;// -Wcast-function-type

Likewise by:

   int f (signed char);
   typedef int F (unsigned char);

   F* pf = (F*)f;// -Wcast-function-type

I'd expect these conversions to be useful and would view warning
for them with -Wextra as excessive.  In fact, I'm not sure I see
the benefit of warning on these casts under any circumstances.

Similarly, for casts between pointers to the same integer type
with a different sign, or those involving ILP32/LP64 portability
issues I'd expect the warning not to trigger unless requested
(e.g., by some other option targeting those issues).

So based on these initial observations and despite the bugs it
may have uncovered, I share your concern that the warning in
its present form is too strict to be suitable for -Wextra, or
possibly even too noisy to be of practical use on its own and
outside of -Wextra.

Out of curiosity, have you done any tests on other code bases
besides GCC to see how many issues (true and false positives)
it finds?

Martin

[*] Strictly speaking, the semantics of casting a function
pointer to intptr_t aren't necessarily well-defined.  Only void*
can be portably converted to intptr_t and back, and only object
pointers are required to be convertible to void* and back.  And
although GCC defines the semantics of these conversions, forcing
programs to abandon a well defined language feature in favor of
one that's not as cleanly specified would undermine the goal
the warning is meant to achieve.


Thanks for your advice.

I did look into openssl and linux, both have plenty of
-Wcast-function-type warnings.

In the case of openssl it is lots of similar stuff
crypto/aes/aes_cfb.c:25:27: warning: cast between incompatible function
types from 'void (*)(const unsigned char *, unsigned char *, const
AES_KEY *) {aka void (*)(const unsigned char *, unsigned char *, const
struct aes_key_st *)}' to 'void (*)(const unsigned char *, unsigned
char *, const void *)' [-Wcast-function-type]
(block128_f) AES_encrypt);

The problem is the function is of course called with a different
signature than what is declared.  They take it somehow for granted,
that "void*" or "const void*" parameter are an alias for
any pointer or any const pointer.  Either as parameter or as return
code.

I would believe this is not well-defined by the c-standard.

But it makes the warning less useful because it would be impossible
to spot the few places where the call will actually abort at
runtime.

Then I tried to compile linux, I noticed that there is a new
warning for the alias to incompatible function.

I saw it very often, and it is always when a system call
is defined:

./include/linux/compat.h:50:18: Warnung: »compat_sys_ptrace« alias
between functions of incompatible types »long int(compat_long_t,
compat_long_t,  compat_long_t,  compat_long_t) {alias long int(int,
int,  int,  int)}« and »long int(long int,  long int,  long int,  long
int)« [-Wattributes]
   asmlinkage long compat_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\

./include/linux/syscalls.h:211:18: Warnung: »sys_rt_sigprocmask« alias
between functions of incompatible types »long int(int,  sigset_t *,
sigset_t *, size_t) {alias long int(int,  struct  *, struct
 *, long unsigned int)}« and »long int(long int,  long int,
long int,  long int)« [-Wattributes]
   asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \


Needless to say that even more Wcast-function-type warning
happen.

./include/linux/timer.h:178:23: Warnung: cast between incompatible
function types from »void (*)(struct timer_list *)« to »void (*)(long
unsigned int)« [-Wcast-function-type]
   __setup_timer(timer, (TIMER_FUNC_TYPE)callback,

So they assume obviously that any int / long / pointer value
are compatible with uintptr_t and intptr_t.

[PATCH] PR82396 workaround for AArch64 bootstrap failure

2017-10-05 Thread Wilco Dijkstra
r253236 broke AArch64 bootstrap.  This is a temporary workaround that
disables qsort checking in the scheduler to enable continued development
and testing on AArch64.  This will be removed once the autopref scheduling
code has been fixed.

AArch64 bootstrap completes, OK for commit?

ChangeLog:
2017-10-05  Wilco Dijkstra  

PR rtl-optimization/82396
* haifa-sched.c (ready_sort_real): Disable qsort checking.

--
diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 
549e8961411ecd0a04ac3b24ba78b5d53e63258a..e7014cbb8b378c998148189b4d871e93c3e81918
 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -3084,7 +3084,8 @@ ready_sort_real (struct ready_list *ready)
   if (n_ready_real == 2)
 swap_sort (first, n_ready_real);
   else if (n_ready_real > 2)
-qsort (first, n_ready_real, sizeof (rtx), rank_for_schedule);
+/* HACK: Disable qsort checking for now (PR82396).  */
+(qsort) (first, n_ready_real, sizeof (rtx), rank_for_schedule);
 
   if (sched_verbose >= 4)
 {



Go patch committed: Drop special handling of unexported func/var names

2017-10-05 Thread Ian Lance Taylor
This patch to the Go frontend changes the handling of the assembler
names used for unexported functions and variables, making them the
same as for exported names.  For example, for the package math/big, we
used to generate unexported names as `big.trim` and exported names as
`math_big.NewInt`.  After this change we will use `math_big`
consistently.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 253458)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-048914caa26b34eebabd0423ed48ee3ac34c919c
+adc6eb826f156d0980f0ad9f9efc5c919ec4905e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 253458)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -5343,8 +5343,9 @@ Function::get_or_make_decl(Gogo* gogo, N
 {
   if (this->fndecl_ == NULL)
 {
-  std::string asm_name;
   bool is_visible = false;
+  bool is_init_fn = false;
+  Type* rtype = NULL;
   if (no->package() != NULL)
 ;
   else if (this->enclosing_ != NULL || Gogo::is_thunk(no))
@@ -5355,7 +5356,7 @@ Function::get_or_make_decl(Gogo* gogo, N
   else if (no->name() == gogo->get_init_fn_name())
{
  is_visible = true;
- asm_name = no->name();
+ is_init_fn = true;
}
   else if (Gogo::unpack_hidden_name(no->name()) == "main"
&& gogo->is_main_package())
@@ -5368,17 +5369,29 @@ Function::get_or_make_decl(Gogo* gogo, N
 {
  if (!this->is_unnamed_type_stub_method_)
is_visible = true;
- Type* rtype = NULL;
  if (this->type_->is_method())
rtype = this->type_->receiver()->type();
- asm_name = gogo->function_asm_name(no->name(), NULL, rtype);
 }
 
+  std::string asm_name;
   if (!this->asm_name_.empty())
{
  asm_name = this->asm_name_;
+
+ // If an assembler name is explicitly specified, there must
+ // be some reason to refer to the symbol from a different
+ // object file.
  is_visible = true;
}
+  else if (is_init_fn)
+   {
+ // These names appear in the export data and are used
+ // directly in the assembler code.  If we change this here
+ // we need to change Gogo::init_imports.
+ asm_name = no->name();
+   }
+  else
+   asm_name = gogo->function_asm_name(no->name(), NULL, rtype);
 
   // If a function calls the predeclared recover function, we
   // can't inline it, because recover behaves differently in a
@@ -5409,10 +5422,6 @@ Function::get_or_make_decl(Gogo* gogo, N
   if ((this->pragmas_ & GOPRAGMA_NOSPLIT) != 0)
disable_split_stack = true;
 
-  // Encode name if asm_name not already set at this point
-  if (asm_name.empty())
-   asm_name = gogo->unexported_function_asm_name(no->name());
-
   // This should go into a unique section if that has been
   // requested elsewhere, or if this is a nointerface function.
   // We want to put a nointerface function into a unique section
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 253458)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -767,10 +767,6 @@ class Gogo
   function_asm_name(const std::string& go_name, const Package*,
const Type* receiver);
 
-  // Return the assembler name to use for an unexported function.
-  std::string
-  unexported_function_asm_name(const std::string& go_name);
-
   // Return the name to use for a function descriptor.
   std::string
   function_descriptor_name(Named_object*);
Index: gcc/go/gofrontend/names.cc
===
--- gcc/go/gofrontend/names.cc  (revision 253458)
+++ gcc/go/gofrontend/names.cc  (working copy)
@@ -54,19 +54,6 @@ Gogo::function_asm_name(const std::strin
   return go_encode_id(ret);
 }
 
-// Return the assembler name to use for an unexported function.
-// FIXME: This should probably be removed and the callers changed to
-// simply call function_name.
-
-std::string
-Gogo::unexported_function_asm_name(const std::string& go_name)
-{
-  std::string ret = this->package_name();
-  ret.append(1, '.');
-  ret.append(Gogo::unpack_hidden_name(go_name));
-  return go_encode_id(ret);
-}
-
 // Return the name to use for a function descriptor.  These symbols
 // are globally visible.
 
@@ -171,18 +158,9 @@ Gogo::specific_type_function_names(const
 std::string
 Gogo::global_var_asm_name(const std::string& go_name, const Package* package)
 {
-  // FIXME: Using package_name for hidden names 

Re: [PATCH v3 1/14] D: The front-end (DMD) language implementation and license.

2017-10-05 Thread Walter Bright



On 10/5/2017 3:59 AM, Iain Buclaw wrote:

On 3 October 2017 at 23:36, Joseph Myers  wrote:

On Tue, 3 Oct 2017, Jeff Law wrote:


/* Copyright (c) 2010-2014 by Digital Mars
  * All Rights Reserved, written by Walter Bright
  * http://www.digitalmars.com
  * Distributed under the Boost Software License, Version 1.0.
  * (See accompanying file LICENSE or copy at
http://www.boost.org/LICENSE_1_0.txt)

If the code was assigned to the FSF in 2011, then the FSF would have
ownership of the code.  And the FSF would be the only entity that could
change the license (which according to your message changed to Boost in
2014).  So something seems wrong here.


The standard FSF assignment would allow the contributor to distribute
their own code under such terms as they see fit.



Walter, would you mind clarifying details of your assignment? Was it a
standard assignment? Did you request for any amendments?


I'm good with FSF owning their copy and it being under the GPL and Digital Mars 
owning our copy and it being Boost licensed.




Jeff, I'm no legal, so I can't comment on it.  Maybe there's someone
from the FSF who be able to confirm?

I'll cc in Andrei as well, so the D language foundation is in on this.

Regards,
Iain.




bootstrap-debug-lean + flags in producer vs compare

2017-10-05 Thread Alexandre Oliva
Unlike bootstrap-debug, bootstrap-debug-lean used to pass compare using
the traditional compare command, because it compiled both stage2 and
stage3 with options that used to generate identical output
(-fcompare-debug= in stage2 vs -fcompare-debug in stage3).

Since we started adding relevant command-line flags to DW_AT_producer,
this is no longer the case, and stages 2 and 3 object files that differ
in nothing but the DW_AT_producer strings.


-fcompare-debug is short for -fcompare-debug=-gtoggle, so stage3
compiles twice, once with the normal options, once with toggled -g, to
then compare the temporary final dumps.  When enabled, both compilations
get from the driver an additional -frandom-seed flag (if none is given
explicitly).

-fcompare-debug= is short for -fno-compare-debug, disabling the second
compilation.


The difference between the DW_AT_producer lines are the different
-fcompare-debug flags, and the presence of the -frandom-seed flag in the
stage3 compilation.

It should be easy and sensible enough to filter the -fcompare-debug
flags out of the DW_AT_producer string.  This option should never affect
the compilation output, it just determines whether or not to perform an
additional compilation that should produce the same executable output.

However, doing that but that won't get us rid of the -frandom-seed
option.  We could drop -frandom-seed from the DW_AT_producer output too,
but I don't think we should do that when the option is given by the
user, rather than implicitly introduced by -fcompare-debug.  We could
introduce an option that causes the subsequent option to be omitted from
the DW_AT_producer string, and arrange for -fcompare-debug to issue that
option before the -frandom-seed option it issues.  The problem with this
approach is that I can't decide whether it's an option prone to abuse,
or one that can have other legitimate uses.


Another approach to fix (or rather hide) this failure mode is to allow
the debug information to differ under bootstrap-debug-lean, by using the
same compare-debug script we use for bootstrap-debug.  The first
patchlet below does just that.  The second drops -fcompare-debug from
DW_AT_producer.  A third (not written yet :-) might somehow deal with
the -frandom-seed added by -fcompare-debug, and either the first or the
others would be posted as a single, fully tested patch, once we decide
how we want to deal with -frandom-seed.


Thoughts?  Thanks in advance,


diff --git a/config/bootstrap-debug-lean.mk b/config/bootstrap-debug-lean.mk
index e215280..5f2db80 100644
--- a/config/bootstrap-debug-lean.mk
+++ b/config/bootstrap-debug-lean.mk
@@ -9,3 +9,4 @@
 
 STAGE2_CFLAGS += -fcompare-debug=
 STAGE3_CFLAGS += -fcompare-debug
+do-compare = $(SHELL) $(srcdir)/contrib/compare-debug $$f1 $$f2


diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index ed7a85a..f077b35 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -23846,6 +23846,7 @@ gen_producer_string (void)
   case OPT_fltrans_output_list_:
   case OPT_fresolution_:
   case OPT_fdebug_prefix_map_:
+  case OPT_fcompare_debug:
/* Ignore these.  */
continue;
   default:


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [RFC] propagate malloc attribute in ipa-pure-const pass

2017-10-05 Thread Prathamesh Kulkarni
On 29 September 2017 at 12:28, Jan Hubicka  wrote:
>> > I wonder what happens here when, say, ipa-icf redirect the call to 
>> > eqivaelnt
>> > function and removes the callee?  Perhaps we realy want to have set of call
>> > sites rahter than nodes stored from analysis to execution. Call sites have
>> > unique stmts and uids, so it will be possible to map them back and forth.
>> IIUC, call site is represented using cgraph_edge ?
>
> Yes, there is call_stmt pointer when gimple is read and uid in WPA mode which
> are unique.  There is call_summary class which lets you to associate info with
> a call site. While it is not most memory effecient to store one bit of 
> information
> this way, i guess it may be easiest to use it in anticipation of it becoming
> more useful in foreseeable future.  We have some bits in call edge itself that
> may be shuffled to the summary as well.
>
>> So change return_callees_map to be the mapping from node to subset of
>> it's call-sites where
>> node returns the value of one of it's callees:
>> static hash_map< cgraph_node *, vec *> *return_callees_map; ?
>
> This should work too, though storing direct pointers to edges is something we
> probably want to avoid to keep memory representation of edges in bounds.
>
> I would go with call_summary - everything else seems like bit of premature
> optimization.  If we will decide to optimize it later, we may invent a variant
> of summary datatype for that.
Hi Honza,
Thanks for the detailed suggestions, I have updated the patch accordingly.
I have following questions on call_summary:
1] I added field bool is_return_callee in ipa_call_summary to track
whether the caller possibly returns value returned by callee, which
gets rid of return_callees_map. I assume ipa_call_summary_t::remove()
and ipa_call_summary_t::duplicate() will already take care of handling
late insertion/removal of cgraph nodes ? I just initialized
is_return_callee to false in ipa_call_summary::reset and that seems to
work. I am not sure though if I have handled it correctly. Could you
please check that ?

2] ipa_inline() called ipa_free_fn_summary, which made
ipa_call_summaries unavailable during ipa-pure-const pass. I removed
call to ipa_free_fn_summary from ipa_inline, and moved it to
ipa_pure_const::execute(). Is that OK ?

Patch passes bootstrap+test and lto bootstrap+test on x86_64-unknown-linux-gnu.
Verfiied SPEC2k6 compiles and runs without miscompares with LTO
enabled on aarch64-linux-gnu.
Cross-tested on arm*-*-* and aarch64*-*-*. I will additionally test
the patch by building chromium or firefox.
Would it be OK to commit if it passes above validations ?

Thanks,
Prathamesh
>
> Thanks,
> Honza
2017-10-05  Prathamesh Kulkarni  

* cgraph.h (set_malloc_flag): Declare.
* cgraph.c (set_malloc_flag_1): New function.
(set_malloc_flag): Likewise.
* ipa-fnsummary.h (ipa_call_summary): Add new field is_return_callee.
* ipa-fnsummary.c (ipa_call_summary::reset): Set is_return_callee to
false.
(read_ipa_call_summary): Add support for reading is_return_callee.
(write_ipa_call_summary): Stream is_return_callee.
* ipa-inline.c (ipa_inline): Remove call to ipa_free_fn_summary.
* ipa-pure-const.c: Add headers ssa.h, alloc-pool.h, symbol-summary.h,
ipa-prop.h, ipa-fnsummary.h.
(malloc_state_e): Define.
(malloc_state_names): Define.
(funct_state_d): Add field malloc_state.
(varying_state): Set malloc_state to STATE_MALLOC_BOTTOM.
(check_retval_uses): New function.
(malloc_candidate_p): Likewise.
(analyze_function): Add support for malloc attribute.
(pure_const_write_summary): Stream malloc_state.
(pure_const_read_summary): Add support for reading malloc_state.
(dump_malloc_lattice): New function.
(propagate_malloc): New function.
(ipa_pure_const::execute): Call propagate_malloc and
ipa_free_fn_summary.
(pass_local_pure_const::execute): Add support for malloc attribute.
* ssa-iterators.h (RETURN_FROM_IMM_USE_STMT): New macro.

testsuite/
* gcc.dg/ipa/propmalloc-1.c: New test-case.
* gcc.dg/ipa/propmalloc-2.c: Likewise.
* gcc.dg/ipa/propmalloc-3.c: Likewise.

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 3d0cefbd46b..0aad90d59ea 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2530,6 +2530,53 @@ cgraph_node::set_nothrow_flag (bool nothrow)
   return changed;
 }
 
+/* Worker to set malloc flag.  */
+static void
+set_malloc_flag_1 (cgraph_node *node, bool malloc_p, bool *changed)
+{
+  if (malloc_p && !DECL_IS_MALLOC (node->decl))
+{
+  DECL_IS_MALLOC (node->decl) = true;
+  *changed = true;
+}
+
+  ipa_ref *ref;
+  FOR_EACH_ALIAS (node, ref)
+{
+  cgraph_node *alias = dyn_cast (ref->referring);
+  if (!malloc_p || alias->get_availability () > AVAIL_INTERPOSABLE)
+   set_malloc_flag_1 (alias, malloc_p, changed);
+}
+
+ 

RE: Zen tuning part 2: Increase branch_cost to 3

2017-10-05 Thread Kumar, Venkataramanan
Hi Honza,

-Original Message-
From: Jan Hubicka [mailto:hubi...@ucw.cz] 
Sent: Thursday, October 5, 2017 8:41 PM
To: gcc-patches@gcc.gnu.org; Kumar, Venkataramanan 

Subject: Zen tuning part 2: Increase branch_cost to 3

Hi,
this patch increases branch_cost to 3.  Constant 2 is apparently copied from
bdver4 costs while core and generic use 3.  3 seems to give best results for
spec2000 and also significantly improve monte carlo benchmark from scimark.

Bootstrapped/regtested x86_64-linux, comitted.

Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 253448)
+++ config/i386/i386.c  (working copy)
@@ -1421,7 +1421,7 @@ struct processor_costs znver1_cost = {
  to limit number of prefetches at all, as their execution also takes some
  time).  */
   100, /* number of parallel prefetches.  */
-  2,   /* Branch cost.  */
+  3,   /* Branch cost.  */
   COSTS_N_INSNS (6),   /* cost of FADD and FSUB insns.  */
   COSTS_N_INSNS (6),   /* cost of FMUL instruction.  */
   COSTS_N_INSNS (42),  /* cost of FDIV instruction.  */

Sure looks good to me. 

Regards,
Venkat.