date:20161104

Re: [match.pd] Fix for PR35691

2016-11-04 Thread Richard Biener

On Thu, 3 Nov 2016, Marc Glisse wrote:

> On Thu, 3 Nov 2016, Richard Biener wrote:
> 
> > > > > The transform would also work for vectors (element_precision for
> > > > > the test but also a value-matching zero which should ensure the
> > > > > same number of elements).
> > > > Um sorry, I didn't get how to check vectors to be of equal length by a
> > > > matching zero.
> > > > Could you please elaborate on that ?
> > > 
> > > He may have meant something like:
> > > 
> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
> > 
> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the
> > point of the pattern.
> 
> Oups, that's what I had written first, and then I somehow managed to confuse
> myself enough to remove it so as to remove the call to types_match :-(
> 
> > > So the last operand is checked with operand_equal_p instead of
> > > integer_zerop. But the fact that we could compute bit_ior on the
> > > comparison results should already imply that the number of elements is the
> > > same.
> > 
> > Though for equality compares we also allow scalar results IIRC.
> 
> Oh, right, I keep forgetting that :-( And I have no idea how to generate one
> for a testcase, at least until the GIMPLE FE lands...
> 
> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some
> > > vector mode on s390?), it would be cool to do the same for floats (most
> > > likely at the RTL level).
> > 
> > On GIMPLE view-converts could come to the rescue here as well.  Or we cab
> > just allow bit-and/or on floats as much as we allow them on pointers.
> 
> Would that generate sensible code on targets that do not have logic insns for
> floats? Actually, even on x86_64 that generates inefficient code, so there
> would be some work (for instance grep finds no gen_iordf3, only gen_iorv2df3).
> 
> I am also a bit wary of doing those obfuscating optimizations too early...
> a==0 is something that other optimizations might use. long
> c=(long&)a|(long&)b; (double&)c==0; less so...
> 
> (and I am assuming that signaling NaNs don't make the whole transformation
> impossible, which might be wrong)

Yeah.  I also think it's not so much important - I just wanted to mention
vectors...

Btw, I still think we need a more sensible infrastructure for passes
to gather, analyze and modify complex conditions.  (I'm always pointing
to tree-affine.c as an, albeit not very good, example for handling
a similar problem)

Richard.

Re: [PATCH GCC]Clean pedantic calls and useless lvalue code in fold_cond_expr_with_comparison

2016-11-04 Thread Richard Biener

On Thu, Nov 3, 2016 at 3:11 PM, Bin Cheng  wrote:
> Hi,
> According to analysis given by 
> https://gcc.gnu.org/ml/gcc/2016-10/msg00228.html, calls to 
> pedantic_non_lvalue_loc and code handling lvalue in 
> fold_cond_expr_with_comparison are useless now.  Given this is complicated 
> legacy code, it may be better to change code step by step, rather than doing 
> this cleanup together with moving simplification from 
> fold_cond_expr_with_comparison to match.pd.
> BTW, after last cleanup of pedantic_lvalues, function pedantic_non_lvalue_loc 
> now has nothing to do with lvalue.  It could be further cleaned up, or at 
> least renamed into something else.  This patch doesn't do that because that 
> depends on the answer to the question of the aforementioned message.
>
> Bootstrap and test on x86_64 and AArch64.  Any comments?

Ok.

Note removal of [pedantic_]non_lvalue can at most result in accepting
invalid code
where we might not have any testsuite coverage.  For the 2nd case with
/* Avoid adding NOP_EXPRs in case this is an lvalue.  */ and C++ lvalue ?:
I'm not sure we have testsuite coverage given Jason failed to add a testcase
when adding the code in r34416.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-10-27  Bin Cheng  
>
> * fold-const.c (fold_cond_expr_with_comparison): Remove call
> to pedantic_non_lvalue_loc.  Remove useless code for lvalue
> where cond_expr can't be a lvalue.

Re: [PATCH] bb-reorder: Improve compgotos pass (PR71785)

2016-11-04 Thread Richard Biener

On Thu, Nov 3, 2016 at 4:46 PM, Segher Boessenkool
 wrote:
> On Thu, Nov 03, 2016 at 09:29:07AM +0100, Richard Biener wrote:
>> On Wed, Nov 2, 2016 at 4:27 PM, Segher Boessenkool
>>  wrote:
>> > On Wed, Nov 02, 2016 at 02:51:41PM +0100, Richard Biener wrote:
>> >> >> I don't believe it needs a cleanup on every iteration. One cleanup at
>> >> >> the end should work fine.
>> >> >
>> >> > But as the comment there says:
>> >> >
>> >> >   /* Merge the duplicated blocks into predecessors, when possible.  
>> >> > */
>> >> >   cleanup_cfg (0);
>> >> >
>> >> > (this is not a new comment), and without merging blocks this whole
>> >> > patch does zilch.
>> >> >
>> >> > There is no need to create new basic blocks at all (can replace the
>> >> > final branch in a block with a copy of the whole block it jumps to,
>> >> > instead); and then it is painfully obvious that switching to layout
>> >> > mode here is pointless, too.
>> >> >
>> >> > Do you want me to do that?
>> >> >
>> >> > Btw, this isn't quadratic anyway; it is a constant number (the maximal
>> >> > length allowed of the block to copy) linear.  Unless there are 
>> >> > unboundedly
>> >> > many forwarder blocks, which there shouldn't be, but I cannot prove 
>> >> > that.
>> >>
>> >> Well, you iterate calling functions (find candidates, merge, cleanup-cfg) 
>> >> that
>> >> walk the whole function.  So unless the number of iterations is bound
>> >> with a constant I call this quadratic (in function size).
>> >
>> > It is bounded (with that caveat above): new candidates will be bigger than
>> > the block merged into it, so there won't be more than
>> > PARAM_MAX_GOTO_DUPLICATION_INSNS passes.
>> >
>> > But I can make it all work simpler, in non-layout mode, if you prefer.
>>
>> Iteration is fine but we should restrict where we look for new
>> candidates.  Likewise
>> block merging opportunities should be easier to exploit than by
>> running cfg-cleanup...
>
> Yes, but that was what the original code does already, so I didn't look
> deeper.  "It was just a simple bugfix".
>
>> I'm just thinking of those gigantic machine-generated state machines where we
>> ATM do quite well (at least at -O1 ...) with respect to compile-time.
>
> Like I said, I tested it on a 2000 node VM, very artificial so almost
> no code except the threading itself, and the compgotos pass takes less
> than 1% time both before and after the patch.  I ony tested at -O2 though.

I see.

> I'll get back to it, but first I have some things that need handling during
> stage 1.

Fair enough.  Don't consider my comments blocking the patch.

Richard.

>
> Segher

Re: [RFC] Check number of uses in simplify_cond_using_ranges().

2016-11-04 Thread Richard Biener

On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt  wrote:
> I've been trying to fix some bad tree-ssa related optimisation for
> s390x and come up with the attached experimental patch.  The patch
> is not really good - it breaks some situations in which the
> optimisation was useful.  With this code:
>
>   void bar(long);
>   void foo (char a)
>   {
> long l;
> char b;
>
> b = a & 63;
> l = b;
> if (l > 9)
>   bar(l);
>   }
>
> We get this representation before value range propagation:
>
>   ...
>   a_4 = *p_3(D);
>   b_5 = a_4 & 63;
>   l_6 = (long int) b_5;
>   if (l_6 > 9)
>   ...
>
> Now, there's some code in tree-vrp.c:simplify_cond_using_ranges()
> that folds b_5 into the if condition, because l_6 is just a sign
> extension of b_5, and the value range of l_6 can also be
> represented by the type of b (char).
>
>   if (b_5 > 9)
>
> (On s390x we end up with "a & 63" stored in two separate
> registers, extended to 32 bits in one and to 64 bits in the other,
> adding up to two unnecessary instructions.)
>
> A naive idea to prevent folding in this situation was to suppress
> it if it would introduce a second use of b_5 (i.e. b_5 was only
> used in the cast before) while not eliminating all uses of l_6.
> However, calling has_single_use() for both purposes proves to be
> not good enough, and VRP does not do this kind of optimisation
> yet.  It does not catch cases like
>
>   if (l_6 > 9)
> ...
>   else if (l_6 > 7)
> ...
>
> where all occurences of l_6 could be replaced, and simply looking
> at the use counts is too coarse.
>
> --
>
> Is VRP the right pass to do this optimisation or should a later
> pass rather attempt to eliminate the new use of b_5 instead?  Uli
> has brought up the idea a mini "sign extend elimination" pass that
> checks if the result of a sign extend could be replaced by the
> original quantity in all places, and if so, eliminate the ssa
> name.  (I guess that won't help with the above code because l is
> used also as a function argument.)  How could a sensible approach
> to deal with the situation look like?

We run into this kind of situation regularly and for general foldings
in match.pd we settled with single_use () even though it is not perfect.
Note the usual complaint is not extra extension instructions but
the increase of register pressure.

This is because it is hard to do better when you are doing local
optimization.

As for the question on whether VRP is the right pass to do this the
answer is two-fold -- VRP has the most precise range information.
But the folding itself should be moved to generic code and use
get_range_info ().

Richard.

> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>

Re: [patch] Clean up WORD_REGISTER_OPERATIONS & LOAD_EXTEND_OP tests

2016-11-04 Thread Richard Biener

On Thu, Nov 3, 2016 at 9:56 PM, Eric Botcazou  wrote:
> Hi,
>
> WORD_REGISTER_OPERATIONS and LOAD_EXTEND_OP are partially used directly as
> preprocessor macros and partially tested in the code.  This patch brings a bit
> of consistency into this by converting the remaining macro cases.
>
> Tested on SPARC/Solaris and x86-64/Linux, OK for the mainline?

Ok.

Richard.

>
> 2016-11-03  Eric Botcazou  
>
> * defaults.h (LOAD_EXTEND_OP): Define if not already defined.
> * combine.c (LOAD_EXTEND_OP): Delete.
> (simplify_comparison): Fix comment on LOAD_EXTEND_OP.
> * cse.c (LOAD_EXTEND_OP): Delete.
> * fold-const.c (LOAD_EXTEND_OP): Likewise.
> * fwprop.c (free_load_extend): Remove #ifdef LOAD_EXTEND_OP/#endif.
> * postreload.c (LOAD_EXTEND_OP): Delete.
> * reload.c (push_reload): Remove #ifdef LOAD_EXTEND_OP/#endif.
> Convert conditional compilation based on WORD_REGISTER_OPERATIONS.
> (find_reloads): Likewise.
> * reload1.c (eliminate_regs_1): Likewise.
> * rtlanal.c (nonzero_bits1): Remove #ifdef LOAD_EXTEND_OP/#endif.
> (num_sign_bit_copies1): Likewise.
>
> --
> Eric Botcazou

Re: [match.pd] Fix for PR35691

2016-11-04 Thread Prathamesh Kulkarni

On 4 November 2016 at 13:41, Richard Biener  wrote:
> On Thu, 3 Nov 2016, Marc Glisse wrote:
>
>> On Thu, 3 Nov 2016, Richard Biener wrote:
>>
>> > > > > The transform would also work for vectors (element_precision for
>> > > > > the test but also a value-matching zero which should ensure the
>> > > > > same number of elements).
>> > > > Um sorry, I didn't get how to check vectors to be of equal length by a
>> > > > matching zero.
>> > > > Could you please elaborate on that ?
>> > >
>> > > He may have meant something like:
>> > >
>> > >   (op (cmp @0 integer_zerop@2) (cmp @1 @2))
>> >
>> > I meant with one being @@2 to allow signed vs. Unsigned @0/@1 which was the
>> > point of the pattern.
>>
>> Oups, that's what I had written first, and then I somehow managed to confuse
>> myself enough to remove it so as to remove the call to types_match :-(
>>
>> > > So the last operand is checked with operand_equal_p instead of
>> > > integer_zerop. But the fact that we could compute bit_ior on the
>> > > comparison results should already imply that the number of elements is 
>> > > the
>> > > same.
>> >
>> > Though for equality compares we also allow scalar results IIRC.
>>
>> Oh, right, I keep forgetting that :-( And I have no idea how to generate one
>> for a testcase, at least until the GIMPLE FE lands...
>>
>> > > On platforms that have IOR on floats (at least x86 with SSE, maybe some
>> > > vector mode on s390?), it would be cool to do the same for floats (most
>> > > likely at the RTL level).
>> >
>> > On GIMPLE view-converts could come to the rescue here as well.  Or we cab
>> > just allow bit-and/or on floats as much as we allow them on pointers.
>>
>> Would that generate sensible code on targets that do not have logic insns for
>> floats? Actually, even on x86_64 that generates inefficient code, so there
>> would be some work (for instance grep finds no gen_iordf3, only 
>> gen_iorv2df3).
>>
>> I am also a bit wary of doing those obfuscating optimizations too early...
>> a==0 is something that other optimizations might use. long
>> c=(long&)a|(long&)b; (double&)c==0; less so...
>>
>> (and I am assuming that signaling NaNs don't make the whole transformation
>> impossible, which might be wrong)
>
> Yeah.  I also think it's not so much important - I just wanted to mention
> vectors...
>
> Btw, I still think we need a more sensible infrastructure for passes
> to gather, analyze and modify complex conditions.  (I'm always pointing
> to tree-affine.c as an, albeit not very good, example for handling
> a similar problem)
Thanks for mentioning the value-matching capture @@, I wasn't aware of
this match.pd feature.
The current patch keeps it restricted to only bitwise operators on integers.
Bootstrap+test running on x86_64-unknown-linux-gnu.
OK to commit if passes ?

Thanks,
Prathamesh
>
> Richard.
2016-11-04  Prathamesh Kulkarni  

PR middle-end/35691
* match.pd: Add following two patterns:
(x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
(x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.

testsuite/
* gcc.dg/pr35691-1.c: New test-case.
* gcc.dg/pr35691-2.c: Likewise.
* gcc.dg/pr35691-3.c: Likewise.
* gcc.dg/pr35691-4.c: Likewise.

diff --git a/gcc/match.pd b/gcc/match.pd
index 48f7351..4f74942 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -519,6 +519,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (TYPE_UNSIGNED (type))
 (bit_and @0 (bit_not (lshift { build_all_ones_cst (type); } @1)
 
+/* PR35691: Transform
+   (x == 0 & y == 0) -> (x | typeof(x)(y)) == 0.
+   (x != 0 | y != 0) -> (x | typeof(x)(y)) != 0.  */
+
+(for bitop (bit_and bit_ior)
+ cmp (eq ne)
+ (simplify
+  (bitop (cmp @0 integer_zerop) (cmp @1 integer_zerop))
+   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && INTEGRAL_TYPE_P (TREE_TYPE (@1))
+   && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (TREE_TYPE (@1)))
+(cmp (bit_ior @0 (convert @1)) { build_zero_cst (TREE_TYPE (@0)); }
+
 /* Fold (A & ~B) - (A & B) into (A ^ B) - B.  */
 (simplify
  (minus (bit_and:cs @0 (bit_not @1)) (bit_and:cs @0 @1))
diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c
new file mode 100644
index 000..5211f815
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop-details" } */
+
+int foo(int z0, unsigned z1)
+{
+  int t0 = (z0 == 0);
+  int t1 = (z1 == 0);
+  int t2 = (t0 && t1);
+  return t2;
+}
+
+/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) 
z1_\[0-9\]*\\(D\\);" "forwprop1" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c
new file mode 100644
index 000..90cbf6d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr35691-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-forwprop-details" } */
+
+int foo(int z0, unsigned z1)
+{
+  int t0 = (z0 != 0);
+  int t1 = (z1 != 0);
+  i

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-04 Thread Martin Liška

On 11/02/2016 03:35 PM, Jakub Jelinek wrote:
> On Wed, Nov 02, 2016 at 03:27:42PM +0100, Martin Liška wrote:
>>> So is there anything I should do wrt -Wswitch-unreachable?
>>>
>>> Marek
>>>
>>
>> Probably not. I'm having a patch puts GIMPLE_SWITCH statement to a proper 
>> place
>> in GIMPLE_BIND. Let's see whether such patch can bootstrap and survive 
>> regression
>> tests.
> 
> Please do that only for -fsanitize-use-after-scope, it will likely affect at
> least for -O0 the debugging experience.  For -O0 -fsanitize=address 
> -fsanitize-use-after-scope
> perhaps we could arrange for some extra stmt to have the locus of the
> switch (where we still don't want the vars to appear in scope) and then
> have no locus on the ASAN_MARK and actual GIMPLE_SWITCH or something
> similar.
> 
>   Jakub
> 

I'm sending patch where I put gimple switch statement to a place where all 
BIND_EXPR vars
are unpoisoned. I'm sending diff to a previous version and new version of the 
patch.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. 
Apart from that,
asan bootstrap successfully finished on x86_64-linux-gnu.

Martin
>From 742ba5e3f5eddb069a13bc51347d21a4ec6a45c6 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 3 May 2016 15:35:22 +0200
Subject: [PATCH 1/2] Introduce -fsanitize-address-use-after-scope

gcc/c-family/ChangeLog:

2016-10-27  Martin Liska  

	* c-warn.c (warn_for_unused_label): Save all labels used
	in goto or in &label.

gcc/ChangeLog:

2016-10-27  Martin Liska  

	* asan.c (enum asan_check_flags): Move the enum to header file.
	(asan_init_shadow_ptr_types): Make type creation more generic.
	(shadow_mem_size): New function.
	(asan_emit_stack_protection): Use newly added ASAN_SHADOW_GRANULARITY.
	Rewritten stack unpoisoning code.
	(build_shadow_mem_access): Add new argument return_address.
	(instrument_derefs): Instrument local variables if use after scope
	sanitization is enabled.
	(asan_store_shadow_bytes): New function.
	(asan_expand_mark_ifn): Likewise.
	(asan_sanitize_stack_p): Moved from asan_sanitize_stack_p.
	* asan.h (enum asan_mark_flags): Moved here from asan.c
	(asan_protect_stack_decl): Protect all declaration that need
	to live in memory.
	(asan_sanitize_use_after_scope): New function.
	(asan_no_sanitize_address_p): Likewise.
	* cfgexpand.c (partition_stack_vars): Consider
	asan_sanitize_use_after_scope in condition.
	(expand_stack_vars): Likewise.
	* common.opt (-fsanitize-address-use-after-scope): New option.
	* doc/invoke.texi (use-after-scope-direct-emission-threshold):
	Explain the parameter.
	* flag-types.h (enum sanitize_code): Define SANITIZE_USE_AFTER_SCOPE.
	* gimplify.c (build_asan_poison_call_expr): New function.
	(asan_poison_variable): Likewise.
	(gimplify_bind_expr): Generate poisoning/unpoisoning for local
	variables that have address taken.
	(gimplify_decl_expr): Likewise.
	(gimplify_target_expr): Likewise for C++ temporaries.
	(sort_by_decl_uid): New function.
	(gimplify_expr): Unpoison all variables for a label we can jump
	from outside of a scope.
	(gimplify_function_tree): Clear asan_poisoned_variables.
	* internal-fn.c (expand_ASAN_MARK): New function.
	* internal-fn.def (ASAN_MARK): Declare.
	* opts.c (finish_options): Handle -fstack-reuse if
	-fsanitize-address-use-after-scope is enabled.
	(common_handle_option): Enable address sanitization if
	-fsanitize-address-use-after-scope is enabled.
	(warn_switch_unreachable_r): Return iterator
	instead of gimple *.
	(maybe_warn_switch_unreachable): Release the iterator.
	(gimplify_switch_expr): In case of enabled sanitization, put switch
	gimple statement to a place where all variables in BIND_EXPR
	are sanitized.
	* params.def (PARAM_USE_AFTER_SCOPE_DIRECT_EMISSION_THRESHOLD):
	New parameter.
	* params.h: Likewise.
	* sancov.c (pass_sanopt::execute): Handle IFN_ASAN_MARK.
	sanitizer.def: Define __asan_poison_stack_memory and
	__asan_unpoison_stack_memory functions.
	* tree-eh.c (verify_norecord_switch_expr): Verify switch expr
	only when not doing sanitization.
---
 gcc/asan.c| 287 +-
 gcc/asan.h|  66 ++--
 gcc/c-family/c-warn.c |   9 +-
 gcc/cfgexpand.c   |  18 +---
 gcc/common.opt|   3 +
 gcc/doc/invoke.texi   |  15 ++-
 gcc/gimplify.c| 239 +
 gcc/internal-fn.c |   9 ++
 gcc/internal-fn.def   |   1 +
 gcc/opts.c|  27 -
 gcc/params.def|   6 ++
 gcc/params.h  |   2 +
 gcc/sanitizer.def |   4 +
 gcc/sanopt.c  |   3 +
 gcc/tree-eh.c |   4 +
 15 files changed, 592 insertions(+), 101 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index c6d9240..95495d2 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -245,6 +245,13 @@ static unsigned HOST_WIDE_INT asan_shadow_offset_value;
 static bool asan_shadow_offset_computed;
 static vec sanitized_sections;
 
+/* Set of variable declarations that are going to be guarded

Re: [PATCH] Make direct emission of time profiler counter

2016-11-04 Thread Christophe Lyon

On 3 November 2016 at 16:11, Jan Hubicka  wrote:
>>
>> 2016-10-31  Martin Liska  
>>
>>   * libgcov-profiler.c (__gcov_time_profiler): Remove.
>>   (__gcov_time_profiler_atomic): Likewise.
>>
>> gcc/ChangeLog:
>>
>> 2016-10-31  Martin Liska  
>>
>>   * profile.c (instrument_values): Fix coding style.
>>   (branch_prob): Use renamed function.
>>   * tree-profile.c (init_ic_make_global_vars): Likewise.
>>   (gimple_init_edge_profiler): Rename to
>>   gimple_init_gcov_profiler.
>>   tree_time_profiler_counter variable declaration.
>>   (gimple_gen_time_profiler): Rewrite to do a direct gimple code
>>   emission.
>>   * value-prof.h: Remove an argument.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-11-03  Martin Liska  
>>
>>   * gcc.dg/no_profile_instrument_function-attr-1.c: Update scanned
>>   output.
>>   * gcc.dg/tree-prof/time-profiler-3.c: New test.
>
> OK,
> Thanks!
> Honza

Hi,

It seems this patch causes an ICE when compiling
gcc.dg/gomp/pr27573.c
for instance on arm-linux-gnueabi --wtih-cpu=cortex-a9

The backtrace is:
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:
In function 'main._omp_fn.0':
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:12:9:
internal compiler error: in convert_memory_address_addr_sp
ace_1, at explow.c:284
0x79b19e convert_memory_address_addr_space_1(machine_mode, rtx_def*,
unsigned char, bool, bool)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/explow.c:284
0x670903 get_builtin_sync_mem
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:4933
0x671fef expand_builtin_atomic_fetch_op
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:5452
0x674aa4 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:6859
0x7b729a expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10773
0x7c1460 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, bool,
tree_node*)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5551
0x7c7451 expand_assignment(tree_node*, tree_node*, bool)
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5317
0x69a0de expand_call_stmt
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:2666
0x69b264 expand_gimple_stmt_1
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3581
0x69b264 expand_gimple_stmt
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3747
0x69cc8a expand_gimple_basic_block
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5754
0x69fefe execute
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6368
Please submit a full bug report,

Christophe

Re: [PATCHv2 5/7, GCC, ARM, V8M] Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute

2016-11-04 Thread Kyrill Tkachov


Hi Andre,

On 25/10/16 17:29, Andre Vieira (lists) wrote:

On 24/08/16 12:01, Andre Vieira (lists) wrote:

On 25/07/16 14:25, Andre Vieira (lists) wrote:

This patch adds support for the ARMv8-M Security Extensions
'cmse_nonsecure_call' attribute. This attribute may only be used for
function types and when used in combination with the '-mcmse'
compilation flag. See Section 5.5 of ARM®v8-M Security Extensions
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

We currently do not support cmse_nonsecure_call functions that pass
arguments or return variables on the stack and we diagnose this.

*** gcc/ChangeLog ***
2016-07-25  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (gimplify.h): New include.
 (arm_handle_cmse_nonsecure_call): New.
 (arm_attribute_table): Added cmse_nonsecure_call.

*** gcc/testsuite/ChangeLog ***
2016-07-25  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-3.c: Add tests.
 * gcc.target/arm/cmse/cmse-4.c: Add tests.


Added more documentation as requested.

---

This patch adds support for the ARMv8-M Security Extensions
'cmse_nonsecure_call' attribute. This attribute may only be used for
function types and when used in combination with the '-mcmse'
compilation flag. See Section 5.5 of ARM®v8-M Security Extensions
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

We currently do not support cmse_nonsecure_call functions that pass
arguments or return variables on the stack and we diagnose this.

*** gcc/ChangeLog ***
2016-07-xx  Andre Vieira
 Thomas Preud'homme  

 * config/arm/arm.c (gimplify.h): New include.
 (arm_handle_cmse_nonsecure_call): New.
 (arm_attribute_table): Added cmse_nonsecure_call.
 * doc/extend.texi (ARM ARMv8-M Security Extensions): New attribute.

*** gcc/testsuite/ChangeLog ***
2016-07-xx  Andre Vieira
 Thomas Preud'homme  

 * gcc.target/arm/cmse/cmse-3.c: Add tests.
 * gcc.target/arm/cmse/cmse-4.c: Add tests.


Hi,

Rebased previous patch on top of trunk as requested. No changes to
ChangeLog.


This looks ok to me.
Thanks,
Kyrill


Cheers,
Andre

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-04 Thread Jakub Jelinek

Hi!

On Fri, Nov 04, 2016 at 10:17:31AM +0100, Martin Liška wrote:
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 813777d..86ce793 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -1678,7 +1678,9 @@ warn_switch_unreachable_r (gimple_stmt_iterator *gsi_p, 
> bool *handled_ops_p,
>worse location info.  */
>if (gimple_try_eval (stmt) == NULL)
>   {
> -   wi->info = stmt;
> +   gimple_stmt_iterator *it = XNEW (gimple_stmt_iterator);
> +   memcpy (it, gsi_p, sizeof (gimple_stmt_iterator));

That would be cleaner as *it = *gsi_p;
That set, I fail to see
1) the need to use a gsi pointer in wi->info compared to stmt itself,
   you can gsi_for_stmt cheaply at any time
2) why is anything done about this in warn_switch_unreachable_r
   - the problem isn't related to this warning IMHO.  Even
switch (x)
  {
  case 1:
int x;
x = 6;
ptr = &x;
break;
  case 2:
ptr = &x;
*ptr = 7;
break;
  }
   has the same issue and there is no switch unreachable code there, but you
   still want for -fsanitize-use-after-scope pretend it is actually:
x_tmp = x;
{
  int x;
  switch (x_tmp)
{
case 1:
  x = 6;
  ptr = &x;
  break;
case 2:
  ptr = &x;
  *ptr = 7;
  break;
}
}
and put ASAN_MARK unpoisoning before GIMPLE_SWITCH.

Jakub

Re: [patch,testsuite] Support dg-require-effective-target label_offsets.

2016-11-04 Thread Georg-Johann Lay


On 04.11.2016 03:48, Mike Stump wrote:

On Nov 3, 2016, at 8:25 AM, Georg-Johann Lay  wrote:


On 28.10.2016 17:58, Mike Stump wrote:

On Oct 27, 2016, at 3:16 AM, Georg-Johann Lay  wrote:


Now imagine some arithmetic like &&LAB2 - &&LAB1.  This might result in
one or two stub addresses, and difference between such addresses is a
complete different thing than the difference between the original labels:
The result might differ in absolute value and in sign, i.e. you can't even
determine whether LAB1 or LAB2 comes first in the generated code as the
order of stubs might differ from the order of respective labels.


So, I think this all doesn't matter any.  Every address gs(LAB) fits in
16-bits by definition, and every gs(LAB1) - gs(LAB2) fits into 16 bits and


Yes, you are right.  Label differences could be implemented.  AFAIK there is 
currently no activity by the Binutils folks to add respective support, so it is 
somewhat pointless to add that support to avr-gcc...



[ pause, built an avr compiler ] So, I checked code-gen for 
gcc.c-torture/compile/labels-2.c and it looked fine:

ldi r24,lo8(gs(.L2))
ldi r25,hi8(gs(.L2))
std Y+2,r25
std Y+1,r24
ldi r18,lo8(gs(.L3))
ldi r19,hi8(gs(.L3))
ldi r24,lo8(gs(.L4))
ldi r25,hi8(gs(.L4))
mov r20,r18
mov r21,r19
sub r20,r24

So, apparently quite a lot of the code-gen is already wired up.  Since this 
generated code, I wasn't sure what error you're trying to hide?  I tried all 
the test cases your mentioned, none failed to produce code or failed to 
assemble that code with -mmcu=atmega2560 nor any other cpu I could find, so, 
trivially, I must not quite understand what doesn't work yet.

I did notice that gcc.c-torture/execute/pr70460.c produced:

.word   .L3-(.L2)

which seems wrong, given all the other code-gen.  I think this has to be 
gs(.L3)-(gs(.L2)), no?  I tried to get binutils to do that for me directly with 
a .word and it seemed resistant; which is unfortunate.


Well, the GCC specification on void* arithmetic requires that void is handled 
as if it has a size of one, and .L3-.L2 meets that requirement.  As code 
addresses on avr are word addresses, not byte addresses, the above code will 
crash.  A working expression would be (.L3-.L2)/2 but 1) this conflicts with 
the gcc specification 2) it does not work with stubs 3) the linker does not 
handle this correctly if relaxation is on.


All the label-arithmetic test cases in GCC testsuite don't actually need stub 
generation because the binaries are not big enough.  Using gs, the expression 
would be gs(.L3)-(gs(.L2)) as gs implements word addresses, not byte addresses.


This also means the problem is not only present with 3-byte PC devises but also 
for smaller ones.




If you consider the above to be a code-gen bug, then you can do something like:

Index: config/avr/avr.c
===
--- config/avr/avr.c(revision 241837)
+++ config/avr/avr.c(working copy)
@@ -9422,6 +9422,21 @@
   x = plus_constant (Pmode, x, AVR_TINY_PM_OFFSET);
 }

+  /* We generate stubs using gs(), but binutils doesn't support that
+ here.  */
+  if (AVR_3_BYTE_PC
+  && GET_CODE (x) == MINUS
+  && GET_CODE (XEXP (x, 0)) == LABEL_REF
+  && GET_CODE (XEXP (x, 1)) == LABEL_REF)
+return false;
+  if (AVR_3_BYTE_PC
+  && GET_CODE (x) == MINUS
+  && GET_CODE (XEXP (x, 0)) == SUBREG
+  && GET_CODE (XEXP (XEXP (x, 0), 0)) == LABEL_REF
+  && GET_CODE (XEXP (x, 1)) == SUBREG
+  && GET_CODE (XEXP (XEXP (x, 1), 0)) == LABEL_REF)
+return false;
+
   return default_assemble_integer (x, size, aligned_p);
 }

to get the compiler to error out to prevent bad code-gen that binutils can't 
handle.  If you want, you can also declare by fiat that subtraction might be 
performed on the stub or the actual function, and so any code that does this 
isn't supported (runtime unspecified behavior), and then you just mark the 
execute testcase as bad (not supportable on avr, since avr blew the binutils 
support for gs).  The compile only test cases work fine now, so nothing needs 
to be done for them.


ICE is not a nice approach, IMO.  I'd prefer some more graceful error using 
code locations of the labels (which are not available).




If you go for the above and make it a hard error, then a patch in the style of 
the original patch would be fine, but please just mark the ones that fail and 
please key of avr*, as this is an avr-only misfeature.  If avr did it right, 
they would complete gs() support so that gs(LAB)-gs(LAB) works.  gcc has a 
fetish for + and - working with labels and constants.

The completely different solution would be to never use gs(), and generate code 
that is 3 byte aware, but, I suspect you're unlikely to want to do that.


That's the worst solution of all because it changes the ABI, leads to code 
bloat, and many more issues.  And

Re: [patch,testsuite] Support dg-require-effective-target label_offsets.

2016-11-04 Thread Georg-Johann Lay


On 04.11.2016 06:18, Senthil Kumar Selvaraj wrote:


Georg-Johann Lay writes:

State of matters is that Binutils support is missing, and if I understand you
correctly, dg-require is not appropriate to factor out availability of such
features?


I'll take a stab at adding the missing binutils support for avr label diffs.


Thanks for taking care of this.  I had a look into it but got stuck with a test 
case derived from gcc.c-torture/execute/pr70460.c


int c;

__attribute__((noinline, noclone)) void
call (void)
{
  __asm ("nop");
}

__attribute__((noinline, noclone)) void
foo (int x)
{
  static int b[] = { &&lab1 - &&lab0, &&lab2 - &&lab0 };
  void *a = &&lab0 + b[x];
  goto *a;
lab1:
  c += 2;
  call();
lab2:
  c++;
lab0:
  ;
}

int
main ()
{
  foo (0);
  if (c != 3)
__builtin_abort ();
  foo (1);
  if (c != 4)
__builtin_abort ();
  return 0;
}


The problem is when relaxation is turned on and the CALL is shortened to  RCALL 
but respective literals are not fixed up.


Johann


Regards
Senthil

Re: [PATCH] Make direct emission of time profiler counter

2016-11-04 Thread Martin Liška

On 11/04/2016 10:18 AM, Christophe Lyon wrote:
> On 3 November 2016 at 16:11, Jan Hubicka  wrote:
>>>
>>> 2016-10-31  Martin Liska  
>>>
>>>   * libgcov-profiler.c (__gcov_time_profiler): Remove.
>>>   (__gcov_time_profiler_atomic): Likewise.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-10-31  Martin Liska  
>>>
>>>   * profile.c (instrument_values): Fix coding style.
>>>   (branch_prob): Use renamed function.
>>>   * tree-profile.c (init_ic_make_global_vars): Likewise.
>>>   (gimple_init_edge_profiler): Rename to
>>>   gimple_init_gcov_profiler.
>>>   tree_time_profiler_counter variable declaration.
>>>   (gimple_gen_time_profiler): Rewrite to do a direct gimple code
>>>   emission.
>>>   * value-prof.h: Remove an argument.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2016-11-03  Martin Liska  
>>>
>>>   * gcc.dg/no_profile_instrument_function-attr-1.c: Update scanned
>>>   output.
>>>   * gcc.dg/tree-prof/time-profiler-3.c: New test.
>>
>> OK,
>> Thanks!
>> Honza
> 
> Hi,
> 
> It seems this patch causes an ICE when compiling
> gcc.dg/gomp/pr27573.c
> for instance on arm-linux-gnueabi --wtih-cpu=cortex-a9

Hello.

Sorry for the breakage, I'm attaching untested patch which fixes that.
I'm going to trigger regression tests.

Martin

> 
> The backtrace is:
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:
> In function 'main._omp_fn.0':
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/testsuite/gcc.dg/gomp/pr27573.c:12:9:
> internal compiler error: in convert_memory_address_addr_sp
> ace_1, at explow.c:284
> 0x79b19e convert_memory_address_addr_space_1(machine_mode, rtx_def*,
> unsigned char, bool, bool)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/explow.c:284
> 0x670903 get_builtin_sync_mem
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:4933
> 0x671fef expand_builtin_atomic_fetch_op
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:5452
> 0x674aa4 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/builtins.c:6859
> 0x7b729a expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:10773
> 0x7c1460 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, bool,
> tree_node*)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5551
> 0x7c7451 expand_assignment(tree_node*, tree_node*, bool)
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:5317
> 0x69a0de expand_call_stmt
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:2666
> 0x69b264 expand_gimple_stmt_1
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3581
> 0x69b264 expand_gimple_stmt
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:3747
> 0x69cc8a expand_gimple_basic_block
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5754
> 0x69fefe execute
> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:6368
> Please submit a full bug report,
> 
> Christophe
> 

>From 531392d44eb195bd39cb49a169047f5bd898242f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 4 Nov 2016 11:12:06 +0100
Subject: [PATCH] time_profiler: Set proper type to time_profiler_counter_ptr.

gcc/ChangeLog:

2016-11-04  Martin Liska  

	* tree-profile.c (gimple_gen_time_profiler): Set proper type
	to time_profiler_counter_ptr.
---
 gcc/tree-profile.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index 09a702f..d18b954 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -489,8 +489,9 @@ gimple_gen_time_profiler (unsigned tag, unsigned base)
   /* Emit: counters[0] = ++__gcov_time_profiler_counter.  */
   if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
 {
-  tree ptr = make_temp_ssa_name (type, NULL, "time_profiler_counter_ptr");
-  tree addr = build1 (ADDR_EXPR, build_pointer_type (type),
+  tree ptr = make_temp_ssa_name (build_pointer_type (type), NULL,
+ "time_profiler_counter_ptr");
+  tree addr = build1 (ADDR_EXPR, TREE_TYPE (ptr),
 			  tree_time_profiler_counter);
   gassign *assign = gimple_build_assign (ptr, NOP_EXPR, addr);
   gsi_insert_before (&gsi, assign, GSI_NEW_STMT);
-- 
2.10.1

Re: [PATCH] Convert SPARC to LRA

2016-11-04 Thread Sebastian Huber

On 08/12/15 17:55, David Miller wrote:

From: Sebastian Huber 
Date: Tue, 8 Dec 2015 11:17:53 +0100

since the LRA patch is still reverted on the trunk, I guess the
switch to LRA will not happen in GCC 6?

Indeed, it is unlikely I will have time to work on this for at
least a month.

Are there any plans to switch the SPARC GCC to LRA for GCC 7 or later?

--
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

Re: [PATCH, RFC] Introduce -fsanitize=use-after-scope (v2)

2016-11-04 Thread Martin Liška

On 11/04/2016 10:32 AM, Jakub Jelinek wrote:
> Hi!
> 
> On Fri, Nov 04, 2016 at 10:17:31AM +0100, Martin Liška wrote:
>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>> index 813777d..86ce793 100644
>> --- a/gcc/gimplify.c
>> +++ b/gcc/gimplify.c
>> @@ -1678,7 +1678,9 @@ warn_switch_unreachable_r (gimple_stmt_iterator 
>> *gsi_p, bool *handled_ops_p,
>>   worse location info.  */
>>if (gimple_try_eval (stmt) == NULL)
>>  {
>> -  wi->info = stmt;
>> +  gimple_stmt_iterator *it = XNEW (gimple_stmt_iterator);
>> +  memcpy (it, gsi_p, sizeof (gimple_stmt_iterator));
> 
> That would be cleaner as *it = *gsi_p;

I know that it's kind of ugly, but as the gimple stmt is not yet added to a BB, 
using
gsi_for_stmt ICEs:

/tmp/use-after-scope-switch.c:12:5: internal compiler error: Segmentation fault
 switch (argc)
 ^~
0xe16a14 crash_signal
../../gcc/toplev.c:338
0xadf890 bb_seq_addr
../../gcc/gimple.h:1658
0xae01dd gsi_start_bb
../../gcc/gimple-iterator.h:129
0xae111f gsi_for_stmt(gimple*)
../../gcc/gimple-iterator.c:617

> That set, I fail to see
> 1) the need to use a gsi pointer in wi->info compared to stmt itself,
>you can gsi_for_stmt cheaply at any time
> 2) why is anything done about this in warn_switch_unreachable_r
>- the problem isn't related to this warning IMHO.  Even
> switch (x)
>   {
>   case 1:
> int x;
> x = 6;
> ptr = &x;
> break;
>   case 2:
> ptr = &x;
> *ptr = 7;
> break;
>   }
>has the same issue and there is no switch unreachable code there, but you
>still want for -fsanitize-use-after-scope pretend it is actually:

You're right, that's not handled. I'm wondering whether it's always profitable
because when you do not reach the case 1, you're not doing a poisoning 
operation?

Martin

> x_tmp = x;
> {
>   int x;
>   switch (x_tmp)
> {
> case 1:
>   x = 6;
>   ptr = &x;
>   break;
> case 2:
>   ptr = &x;
>   *ptr = 7;
>   break;
> }
> }
> and put ASAN_MARK unpoisoning before GIMPLE_SWITCH.
> 
>   Jakub
>

Re: [PATCH] Generate reproducible output independently of the build-path

2016-11-04 Thread Ximin Luo

Mike Stump:
> On Nov 3, 2016, at 1:01 PM, Ximin Luo  wrote:
>> Log snippets attached.
> 
> Ick.  You missed the utility of contrib/compare_tests.  :-)
> 
> If you say:
> 
>   contrib/compare_tests oldbuilddir newbuilddir
> 
> You can then sit back and see everything as you'd expect and want.  The 
> output is priority sorted and usually around 0 lines line or so.
> 

Oh, thanks! Here is the output:

~/reproducible/gcc-7-20161030$ contrib/compare_tests ../gcc-build-0/ 
../gcc-build-1/
# Comparing directories
## Dir1=../gcc-build-0/: 8 sum files
## Dir2=../gcc-build-1/: 8 sum files

# Comparing 8 common sum files
## /bin/sh contrib/compare_tests  /tmp/gxx-sum1.9339 /tmp/gxx-sum2.9339
New tests that PASS:

gcc.dg/cpp/source_date_epoch-4.c (test for excess errors)
gcc.dg/cpp/source_date_epoch-4.c execution test
gcc.dg/cpp/source_date_epoch-5.c (test for excess errors)
gcc.dg/cpp/source_date_epoch-5.c execution test
gcc.dg/cpp/source_prefix_map-1.c (test for excess errors)
gcc.dg/cpp/source_prefix_map-1.c execution test
gcc.dg/cpp/source_prefix_map-2.c (test for excess errors)
gcc.dg/cpp/source_prefix_map-2.c execution test
gcc.dg/debug/dwarf2/source_prefix_map-1.c (test for excess errors)
gcc.dg/debug/dwarf2/source_prefix_map-1.c scan-assembler-dem DW_AT_comp_dir: 
"DWARF2TEST/gcc
gcc.dg/debug/dwarf2/source_prefix_map-2.c (test for excess errors)
gcc.dg/debug/dwarf2/source_prefix_map-2.c scan-assembler DW_AT_comp_dir: "/

# No differences found in 8 common sum files


-- 
GPG: ed25519/56034877E1F87C35
GPG: rsa4096/1318EFAC5FBBDBCE
https://github.com/infinity0/pubkeys.git

Re: [RFC] Check number of uses in simplify_cond_using_ranges().

2016-11-04 Thread Dominik Vogt

On Fri, Nov 04, 2016 at 09:47:26AM +0100, Richard Biener wrote:
> On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt  wrote:
> > Is VRP the right pass to do this optimisation or should a later
> > pass rather attempt to eliminate the new use of b_5 instead?  Uli
> > has brought up the idea a mini "sign extend elimination" pass that
> > checks if the result of a sign extend could be replaced by the
> > original quantity in all places, and if so, eliminate the ssa
> > name.  (I guess that won't help with the above code because l is
> > used also as a function argument.)  How could a sensible approach
> > to deal with the situation look like?
> 
> We run into this kind of situation regularly and for general foldings
> in match.pd we settled with single_use () even though it is not perfect.
> Note the usual complaint is not extra extension instructions but
> the increase of register pressure.
> 
> This is because it is hard to do better when you are doing local
> optimization.
> 
> As for the question on whether VRP is the right pass to do this the
> answer is two-fold -- VRP has the most precise range information.
> But the folding itself should be moved to generic code and use
> get_range_info ().

All right, is this a sensible approach then?

  1. Using has_single_use as in the experimental patch is good
 enough (provided further testing does not show serious
 regressions).
  2. Rip the whole top level if-block from simplify_cond_using_ranges().
  3. Translate that code to match.pd syntax.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany

RE: [PATCH] [ARC] New option handling, refurbish multilib support.

2016-11-04 Thread Claudiu Zissulescu

Andrew,

> you create the TargetVariable arc_mpy_option, however, I think it
> would be neater to fold this variable into the mmpy-option as it was
> before, but, changing the type to Enum.  This would allow the big
> option checking switch to be removed from arc-common.c, which I think
> is a win overall.
> 
> I wasn't 100% certain that the above would actually be possible, so I
> put together a prototype, I've included the patch below, that applies
> on top of your patch.  Hopefully you'll agree that it's a nice clean
> up.

Thanks for the suggestion.

> I wonder where the vale 0x1000 comes from, what's the significance
> of it.  I could ask the same question about the magic -1 constant, but
> it's rather more obvious that -1 is just a no-value-selected magic
> number.  I guess my questions for 0x1000 are why this specific
> value?  What does it mean?

I need a value to mark if the variable in question is changed by a command line 
option or not. This is required when we set the cpu's specific configuration. 
I'll include this comment in the description of the define for clarity.
 
> and also the driver-arc.c file.  You seem to be missing support for
> nps400 in here.  Specifically, if I pass -mcpu=nps400 to GCC I'd
> expect the generated assembler file to include ".cpu NPS400", and the
> assembler to be driven with "-mcpu=nps400".

This patch was crafted way before the new ARC binutils to be upstreamed. Hence, 
it needed to work with the "old" binutils which had no support for nps400 :)
 
> Admittedly we're missing a GCC test for this (there's a patch below
> for just such a new test).
>
> I think the solution could be fairly easy, if we tracked the specific
> processor type in arc_cpu_t structure we could specialise in
> arc_override_options and in driver-arc.c, though I don't know if you'd
> agree that this is the right approach... I'm not entirely sure
> myself... but it might be the easiest approach to move us forward.
> Anyway, I include a patch for that below too, feel free to use or not
> as you see fit.

I am preparing a patch to binutils which should be able to recognize all GCC's 
cpu variations as valid .cpu pseudo-op argument. Hence, we can just pass to the 
assembler the appropriate .cpu variation seamlessly. If it is ok with you, I 
would like to come with a new patch on this topic after I extend .cpu support 
in binutils.  Alternatively, I can include your suggestion now, but then I will 
refurbish it later on. Please let me know what do u fancy most.

Thanks,
Claudiu

[Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions

2016-11-04 Thread Paul Richard Thomas

Dear All,

The associate construct does not readily permit the identification of
the associate name as an array during parsing. However, this can be
done whilst matching rvalues within the associate block. This patch
extends this identification in gfc_match_varspec, whilst excluding
character types from the present test. This latter change prevents
scalar substrings from being pegged as array references, which was the
origin of this bug.

Bootstraps and regtests on FC21/x86_64.

OK for trunk?

Paul

2016-04-19  Paul Thomas  

PR fortran/64933
* primary.c (gfc_match_varspec): If selector expression is
unambiguously an array, make sure that the associate name
is an array and has an array spec. Modify the original
condition for doing this to exclude character types.

2016-04-19  Paul Thomas  

PR fortran/64933
* gfortran.dg/associate_23.f90: New test.


-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein
Index: gcc/fortran/primary.c
===
*** gcc/fortran/primary.c   (revision 241831)
--- gcc/fortran/primary.c   (working copy)
*** gfc_match_varspec (gfc_expr *primary, in
*** 1931,1945 
  }
  
/* For associate names, we may not yet know whether they are arrays or not.
!  Thus if we have one and parentheses follow, we have to assume that it
!  actually is one for now.  The final decision will be made at
!  resolution time, of course.  */
!   if (sym->assoc && gfc_peek_ascii_char () == '('
!   && !(sym->assoc->dangling && sym->assoc->st
   && sym->assoc->st->n.sym
!  && sym->assoc->st->n.sym->attr.dimension == 0)
!   && sym->ts.type != BT_CLASS)
  sym->attr.dimension = 1;
  
if ((equiv_flag && gfc_peek_ascii_char () == '(')
|| gfc_peek_ascii_char () == '[' || sym->attr.codimension
--- 1931,1966 
  }
  
/* For associate names, we may not yet know whether they are arrays or not.
!  If the selector expression is unambiguously an array; eg. a full array
!  or an array section, then the associate name must be an array and we can
!  fix it now. Otherwise, if parentheses follow and it is not a character
!  type, we have to assume that it actually is one for now.  The final
!  decision will be made at resolution, of course.  */
!   if (sym->assoc
!   && gfc_peek_ascii_char () == '('
!   && sym->ts.type != BT_CLASS
!   && !sym->attr.dimension)
! {
!   if ((!sym->assoc->dangling
!  && sym->assoc->target
!  && sym->assoc->target->ref
!  && sym->assoc->target->ref->type == REF_ARRAY
!  && (sym->assoc->target->ref->u.ar.type == AR_FULL
!  || sym->assoc->target->ref->u.ar.type == AR_SECTION))
! ||
!  (!(sym->assoc->dangling || sym->ts.type == BT_CHARACTER)
!   && sym->assoc->st
   && sym->assoc->st->n.sym
!   && sym->assoc->st->n.sym->attr.dimension == 0))
!   {
  sym->attr.dimension = 1;
+ if (sym->as == NULL && sym->assoc
+ && sym->assoc->st
+ && sym->assoc->st->n.sym
+ && sym->assoc->st->n.sym->as)
+   sym->as = gfc_copy_array_spec (sym->assoc->st->n.sym->as);
+   }
+ }
  
if ((equiv_flag && gfc_peek_ascii_char () == '(')
|| gfc_peek_ascii_char () == '[' || sym->attr.codimension
Index: gcc/testsuite/gfortran.dg/associate_23.f90
===
*** gcc/testsuite/gfortran.dg/associate_23.f90  (revision 0)
--- gcc/testsuite/gfortran.dg/associate_23.f90  (working copy)
***
*** 0 
--- 1,36 
+ ! { dg-do run }
+ !
+ ! Tests the fix for PR64933
+ !
+ ! Contributed by Olivier Marsden  
+ !
+ program test_this
+   implicit none
+   character(len = 15) :: char_var, char_var_dim (3)
+   character(len = 80) :: buffer
+ 
+ ! Original failing case reported in PR
+   ASSOCIATE(should_work=>char_var)
+ should_work = "test succesful"
+ write (buffer, *) should_work(5:14)
+   END ASSOCIATE
+ 
+   if (trim (buffer) .ne. "  succesful") call abort
+ 
+ ! Found to be failing during debugging
+   ASSOCIATE(should_work=>char_var_dim)
+ should_work = ["test SUCCESFUL", "test_SUCCESFUL", "test.SUCCESFUL"]
+ write (buffer, *) should_work(:)(5:14)
+   END ASSOCIATE
+ 
+   if (trim (buffer) .ne. "  SUCCESFUL_SUCCESFUL.SUCCESFUL") call abort
+ 
+ ! Found to be failing during debugging
+   ASSOCIATE(should_work=>char_var_dim(1:2))
+ should_work = ["test SUCCESFUL", "test_SUCCESFUL", "test.SUCCESFUL"]
+ write (buffer, *) should_work(:)(5:14)
+   END ASSOCIATE
+ 
+   if (trim (buffer) .ne. "  SUCCESFUL_SUCCESFUL") call abort
+ 
+ end program

Re: [PATCH] Make direct emission of time profiler counter

2016-11-04 Thread Martin Liška

On 11/04/2016 11:14 AM, Martin Liška wrote:
> Hello.
> 
> Sorry for the breakage, I'm attaching untested patch which fixes that.
> I'm going to trigger regression tests.
> 
> Martin

Regression tests have finished on ppc64le-redhat-linux.

Ready to be installed?
Martin

Re: [PATCH] Convert SPARC to LRA

2016-11-04 Thread Eric Botcazou

> Are there any plans to switch the SPARC GCC to LRA for GCC 7 or later?

There are plans to switch SPARC to LRA at some point, but no ETA.

-- 
Eric Botcazou

RE: [PATCH] [ARC] Various small miscellaneous fixes.

2016-11-04 Thread Claudiu Zissulescu

> All the rest looks good.
> 

Committed with the suggested changes.

Thank you for your review,
Claudiu

[PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

Hi!

One of the common reasons why nonoverlapping_memrefs_p is called with
MEM_EXPRs that are FUNCTION_DECLs and don't have DECL_RTL set is that
DSE considers all calls to be memory reads from what is being called.

The following patch avoids that.  Of course the question is if we want
to support something like:
char var[64];
...
var[0] = '0xe8';
var[1] = ...;
(void (*fn) (void) = (void (*) (void)) &var[0];
fn ();
without any kind of barrier, or not.
If we do, then the patch below might break it by DSEing the stores.
So, do we want the patch as is, or just limit it to direct function calls
(call (mem (symbol_ref))), or only those where the symbol is actually
a FUNCTION_DECL?  Or do we want to consider arbitrary pointers possibly
writing into function function text?

In any case, the second hunk fixes an important DSE bug that just got
revealed by the former change.

2016-11-04  Jakub Jelinek  

PR target/77834
* dse.c (check_mem_read_use): For CALL with MEM as first operand,
ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF.
(dse_step5): Call scan_reads even if just insn_info->frame_read.
Improve and fix dump file messages.

--- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100
+++ gcc/dse.c   2016-11-04 09:42:27.640098125 +0100
@@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data
   rtx *loc = *iter;
   if (MEM_P (*loc))
check_mem_read_rtx (loc, (bb_info_t) data);
+  else if (GET_CODE (*loc) == CALL && MEM_P (XEXP (*loc, 0)))
+   {
+ /* Ignore the MEM in first operand of CALL, that isn't
+any kind of read of the function text memory.  */
+ iter.skip_subrtxes ();
+ /* But for indirect function calls consider the MEM
+containing the function pointer if any.  Avoid recursing
+in the most common case.  */
+ if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF)
+   check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data);
+ if (!CONST_INT_P (XEXP (*loc, 1)))
+   check_mem_read_use (&XEXP (*loc, 1), data);
+   }
 }
 }
 
@@ -3298,12 +3311,19 @@ dse_step5 (void)
  bitmap_clear (v);
}
  else if (insn_info->read_rec
-   || insn_info->non_frame_wild_read)
+  || insn_info->non_frame_wild_read
+  || insn_info->frame_read)
{
- if (dump_file && !insn_info->non_frame_wild_read)
-   fprintf (dump_file, "regular read\n");
-  else if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "non-frame wild read\n");
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ if (!insn_info->non_frame_wild_read
+ && !insn_info->frame_read)
+   fprintf (dump_file, "regular read\n");
+ if (insn_info->non_frame_wild_read)
+   fprintf (dump_file, "non-frame wild read\n");
+ if (insn_info->frame_read)
+   fprintf (dump_file, "frame read\n");
+   }
  scan_reads (insn_info, v, NULL);
}
}

Jakub

[PATCH] Fix sched-deps not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

Hi!

Similarly to the previous patch, this one is the second spot
where nonoverlapping_memrefs_p is called with FUNCTION_DECLs
without DECL_RTL.  With both patches in the only case
this happens in bootstrap/regtest is the pr77834.c testcase.

Same questions as for the DSE patch apply.

2016-11-04  Jakub Jelinek  

PR target/77834
* sched-deps.c (sched_analyze_2): Don't recurse in CALL
first argument if it is a MEM, just on its address.

--- gcc/sched-deps.c.jj 2016-10-31 13:28:12.0 +0100
+++ gcc/sched-deps.c2016-11-04 09:59:28.192249487 +0100
@@ -2808,6 +2808,19 @@ sched_analyze_2 (struct deps_desc *deps,
 
   return;
 
+case CALL:
+  /* Don't consider MEM in CALL's first argument
+ as a memory read, that is the first byte of
+ the function to be called.  Only look at its
+ address.  */
+  if (MEM_P (XEXP (x, 0)))
+   {
+ sched_analyze_2 (deps, XEXP (XEXP (x, 0), 0), insn);
+ sched_analyze_2 (deps, XEXP (x, 1), insn);
+ return;
+   }
+  break;
+
 default:
   break;
 }

Jakub

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Bernd Schmidt


On 11/04/2016 01:26 PM, Jakub Jelinek wrote:


The following patch avoids that.  Of course the question is if we want
to support something like:
char var[64];
...
var[0] = '0xe8';
var[1] = ...;
(void (*fn) (void) = (void (*) (void)) &var[0];
fn ();
without any kind of barrier, or not.


I was going to say no, but trampolines come to mind as a situation where 
we might come close to the above.



Bernd

Simplify X /[ex] 8 == 0

2016-11-04 Thread Marc Glisse


Hello,

this kind of simplification is already handled by fold_comparison, but 
the code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In 
particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 
can be turned into X==0.


When we have an explicit division by 0, there is a better optimisation 
possible (insert __builtin_unreachable() or __builtin_trap() after that 
statement, as in find_explicit_erroneous_behavior), so I don't touch it.


For the overflow inequality case, it would have been a bit clearer to 
generate

  (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)
and let that be constant folded instead of having that ugly and 
error-prone test in constant_boolean_node, but I saved one tree ;-)


Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions 
are in the libitm part of the testsuite, they should be fixed by 
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the 
testsuite when that patch is in.


2016-11-07  Marc Glisse  

gcc/
* fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.
* match.pd (A /[ex] B CMP C): New simplifications.

gcc/testsuite/
* gcc.dg/tree-ssa/cmpexactdiv.c: New file.

--
Marc GlisseIndex: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 241840)
+++ gcc/fold-const.c	(working copy)
@@ -8740,22 +8740,21 @@ fold_comparison (location_t loc, enum tr
 		  SET_EXPR_LOCATION (tem, loc);
 		  return tem;
 		}
 	  return fold_build2_loc (loc, code, type, cval1, cval2);
 	}
 	}
 }
 
   /* We can fold X/C1 op C2 where C1 and C2 are integer constants
  into a single range test.  */
-  if ((TREE_CODE (arg0) == TRUNC_DIV_EXPR
-   || TREE_CODE (arg0) == EXACT_DIV_EXPR)
+  if (TREE_CODE (arg0) == TRUNC_DIV_EXPR
   && TREE_CODE (arg1) == INTEGER_CST
   && TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
   && !integer_zerop (TREE_OPERAND (arg0, 1))
   && !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
   && !TREE_OVERFLOW (arg1))
 {
   tem = fold_div_compare (loc, code, type, arg0, arg1);
   if (tem != NULL_TREE)
 	return tem;
 }
Index: gcc/match.pd
===
--- gcc/match.pd	(revision 241840)
+++ gcc/match.pd	(working copy)
@@ -2314,20 +2314,50 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	   (ne @0 { build_real (TREE_TYPE (@0), c2); }
 	/* sqrt(x) < c is the same as x < c*c, if we ignore NaNs.  */
 	(if (! HONOR_NANS (@0))
 	 (cmp @0 { build_real (TREE_TYPE (@0), c2); })
 	 /* sqrt(x) < c is the same as x >= 0 && x < c*c.  */
 	 (if (GENERIC)
 	  (truth_andif
 	   (ge @0 { build_real (TREE_TYPE (@0), dconst0); })
 	   (cmp @0 { build_real (TREE_TYPE (@0), c2); }
 
+/* Fold A /[ex] B CMP C to A CMP B * C.  */
+(for cmp (eq ne)
+ (simplify
+  (cmp (exact_div @0 @1) INTEGER_CST@2)
+  (if (!integer_zerop (@1))
+   (if (wi::eq_p (@2, 0))
+(cmp @0 @2)
+(if (TREE_CODE (@1) == INTEGER_CST)
+ (with
+  {
+	bool ovf;
+	wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf);
+  }
+  (if (ovf)
+   { constant_boolean_node (cmp == NE_EXPR, type); }
+   (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); }
+(for cmp (lt le gt ge)
+ (simplify
+  (cmp (exact_div @0 INTEGER_CST@1) INTEGER_CST@2)
+  (if (wi::gt_p (@1, 0, TYPE_SIGN (TREE_TYPE (@1
+   (with
+{
+  bool ovf;
+  wide_int prod = wi::mul (@2, @1, TYPE_SIGN (TREE_TYPE (@1)), &ovf);
+}
+(if (ovf)
+ { constant_boolean_node (wi::lt_p (@2, 0, TYPE_SIGN (TREE_TYPE (@2)))
+			  != (cmp == LT_EXPR || cmp == LE_EXPR), type); }
+ (cmp @0 { wide_int_to_tree (TREE_TYPE (@0), prod); }))
+
 /* Unordered tests if either argument is a NaN.  */
 (simplify
  (bit_ior (unordered @0 @0) (unordered @1 @1))
  (if (types_match (@0, @1))
   (unordered @0 @1)))
 (simplify
  (bit_and (ordered @0 @0) (ordered @1 @1))
  (if (types_match (@0, @1))
   (ordered @0 @1)))
 (simplify
Index: gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv.c	(working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(int *p, int *q){
+  __SIZE_TYPE__ n = q - p;
+  return n == 0;
+}
+
+int g(int *p, int *q){
+  __PTRDIFF_TYPE__ n = q - p;
+  return n <= 2;
+}
+
+int h(long *p, long *q){
+  __SIZE_TYPE__ n = q - p;
+  return n == (__SIZE_TYPE__)(-1)/2;
+}
+
+/* { dg-final { scan-tree-dump-not "== 0" "optimized" } } */
+/* { dg-final { scan-tree-dump "<= 8" "optimized" { target { int32 } } } } */
+/* { dg-final { scan-tree-dump "return 0" "optimized" } } */

Re: [PATCH] Fix wrong declarations of builtin functions

2016-11-04 Thread Bernd Schmidt


On 11/03/2016 10:52 PM, Bernd Edlinger wrote:

I thought in preparation of the new C++ warning about wrong
declarations of builtin functions it would be good to fix some of the
more obvious bugs in the testsuite first.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.


Ok.


Bernd

Re: [PATCH GCC]Clean pedantic calls and useless lvalue code in fold_cond_expr_with_comparison

2016-11-04 Thread Jason Merrill

On Fri, Nov 4, 2016 at 4:40 AM, Richard Biener
 wrote:
> On Thu, Nov 3, 2016 at 3:11 PM, Bin Cheng  wrote:
>> Hi,
>> According to analysis given by 
>> https://gcc.gnu.org/ml/gcc/2016-10/msg00228.html, calls to 
>> pedantic_non_lvalue_loc and code handling lvalue in 
>> fold_cond_expr_with_comparison are useless now.  Given this is complicated 
>> legacy code, it may be better to change code step by step, rather than doing 
>> this cleanup together with moving simplification from 
>> fold_cond_expr_with_comparison to match.pd.
>> BTW, after last cleanup of pedantic_lvalues, function 
>> pedantic_non_lvalue_loc now has nothing to do with lvalue.  It could be 
>> further cleaned up, or at least renamed into something else.  This patch 
>> doesn't do that because that depends on the answer to the question of the 
>> aforementioned message.
>>
>> Bootstrap and test on x86_64 and AArch64.  Any comments?
>
> Ok.
>
> Note removal of [pedantic_]non_lvalue can at most result in accepting
> invalid code where we might not have any testsuite coverage.  For the 2nd 
> case with
> /* Avoid adding NOP_EXPRs in case this is an lvalue.  */ and C++ lvalue ?:
> I'm not sure we have testsuite coverage given Jason failed to add a testcase
> when adding the code in r34416.

Now that we're delaying folding in C++, it shouldn't matter whether
fold is lvalue-safe.

Jason

C++ PATCH for c++/78198 (inherited template ctor with default arg)

2016-11-04 Thread Jason Merrill

Default arguments of an inherited ctor remain in the context of the
inherited function, not the artificial inheriting constructor.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit d6ceff13167956b660317ef37a40ba7110d325c4
Author: Jason Merrill 
Date:   Thu Nov 3 16:14:51 2016 -0400

PR c++/78198 - inherited template ctor with default arg

* call.c (convert_default_arg): Look through inheriting ctors.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 27aa7fd..d2e99bc 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7193,6 +7193,9 @@ convert_default_arg (tree type, tree arg, tree fn, int 
parmnum,
 
   /* See through clones.  */
   fn = DECL_ORIGIN (fn);
+  /* And inheriting ctors.  */
+  if (flag_new_inheriting_ctors)
+fn = strip_inheriting_ctors (fn);
 
   /* Detect recursion.  */
   FOR_EACH_VEC_SAFE_ELT (default_arg_context, i, t)
diff --git a/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C 
b/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C
new file mode 100644
index 000..1b0e242
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/inh-ctor22.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target c++11 } }
+
+class A { };
+template using UniquePtr = int;
+template struct BufferList {
+  BufferList(unsigned, unsigned, unsigned, AllocPolicy = AllocPolicy());
+};
+class D : BufferList {
+  using BufferList::BufferList;
+};
+template UniquePtr MakeUnique(Args... aArgs)
+{
+  D d(aArgs...);
+  return 0;
+}
+UniquePtr setCloneBuffer_impl_buf = MakeUnique(0, 0, 0);

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Richard Biener

On Fri, 4 Nov 2016, Jakub Jelinek wrote:

> Hi!
> 
> One of the common reasons why nonoverlapping_memrefs_p is called with
> MEM_EXPRs that are FUNCTION_DECLs and don't have DECL_RTL set is that
> DSE considers all calls to be memory reads from what is being called.
> 
> The following patch avoids that.  Of course the question is if we want
> to support something like:
> char var[64];
> ...
> var[0] = '0xe8';
> var[1] = ...;
> (void (*fn) (void) = (void (*) (void)) &var[0];
> fn ();
> without any kind of barrier, or not.
> If we do, then the patch below might break it by DSEing the stores.
> So, do we want the patch as is, or just limit it to direct function calls
> (call (mem (symbol_ref))), or only those where the symbol is actually
> a FUNCTION_DECL?  Or do we want to consider arbitrary pointers possibly
> writing into function function text?

I'm not sure the current code handled this correctly?  At least I
see nothing on the GIMPLE level that would disallow the DSE:

int foo ()
{
  char a[256];
  a[0] = 'a';
  a[1] = 'b';
  int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0];
  return fn ();
}

even with pure or regular fn this gets DSEd, the call is appearantly
not an escape point for &a.

That is, for DSE it would only matter for locals and/or const
function calls (if stores after the call make the earlier appear dead).

We did have bugs like PR70128 fixed though (but that was "global memory").
It might be still broken if the patched function is const and we
unpatch right after calling...

So to answer, yes, I think we want to treat this conservatively
(but we may have existing bugs here).

> In any case, the second hunk fixes an important DSE bug that just got
> revealed by the former change.

Do you have a testcase for the (wrong-code?) bug?

> 2016-11-04  Jakub Jelinek  
> 
>   PR target/77834
>   * dse.c (check_mem_read_use): For CALL with MEM as first operand,
>   ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF.
>   (dse_step5): Call scan_reads even if just insn_info->frame_read.
>   Improve and fix dump file messages.
> 
> --- gcc/dse.c.jj  2016-11-03 15:52:12.521965058 +0100
> +++ gcc/dse.c 2016-11-04 09:42:27.640098125 +0100
> @@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data
>rtx *loc = *iter;
>if (MEM_P (*loc))
>   check_mem_read_rtx (loc, (bb_info_t) data);
> +  else if (GET_CODE (*loc) == CALL && MEM_P (XEXP (*loc, 0)))
> + {
> +   /* Ignore the MEM in first operand of CALL, that isn't
> +  any kind of read of the function text memory.  */
> +   iter.skip_subrtxes ();
> +   /* But for indirect function calls consider the MEM
> +  containing the function pointer if any.  Avoid recursing
> +  in the most common case.  */
> +   if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF)
> + check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data);
> +   if (!CONST_INT_P (XEXP (*loc, 1)))
> + check_mem_read_use (&XEXP (*loc, 1), data);
> + }
>  }
>  }
>  
> @@ -3298,12 +3311,19 @@ dse_step5 (void)
> bitmap_clear (v);
>   }
> else if (insn_info->read_rec
> -   || insn_info->non_frame_wild_read)
> +|| insn_info->non_frame_wild_read
> +|| insn_info->frame_read)
>   {
> -   if (dump_file && !insn_info->non_frame_wild_read)
> - fprintf (dump_file, "regular read\n");
> -  else if (dump_file && (dump_flags & TDF_DETAILS))
> - fprintf (dump_file, "non-frame wild read\n");
> +   if (dump_file && (dump_flags & TDF_DETAILS))
> + {
> +   if (!insn_info->non_frame_wild_read
> +   && !insn_info->frame_read)
> + fprintf (dump_file, "regular read\n");
> +   if (insn_info->non_frame_wild_read)
> + fprintf (dump_file, "non-frame wild read\n");
> +   if (insn_info->frame_read)
> + fprintf (dump_file, "frame read\n");
> + }
> scan_reads (insn_info, v, NULL);
>   }
>   }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 01:33:54PM +0100, Bernd Schmidt wrote:
> On 11/04/2016 01:26 PM, Jakub Jelinek wrote:
> >
> >The following patch avoids that.  Of course the question is if we want
> >to support something like:
> >char var[64];
> >...
> >var[0] = '0xe8';
> >var[1] = ...;
> >(void (*fn) (void) = (void (*) (void)) &var[0];
> >fn ();
> >without any kind of barrier, or not.
> 
> I was going to say no, but trampolines come to mind as a situation where we
> might come close to the above.

So then (untested so far) the following instead?
Indirect function calls will then be as non-efficient as in the past for DSE
and scheduling, but for direct calls, whether the target has a direct call
or only indirect call instruction, should be handled more efficiently.

2016-11-04  Jakub Jelinek  

PR target/77834
* dse.c (check_mem_read_use): For CALL with MEM as first operand,
ignore that MEM, but recurse on the MEM's operand if not SYMBOL_REF.
(dse_step5): Call scan_reads even if just insn_info->frame_read.
Improve and fix dump file messages.

--- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100
+++ gcc/dse.c   2016-11-04 09:42:27.640098125 +0100
@@ -2159,6 +2159,19 @@ check_mem_read_use (rtx *loc, void *data
   rtx *loc = *iter;
   if (MEM_P (*loc))
check_mem_read_rtx (loc, (bb_info_t) data);
+  else if (GET_CODE (*loc) == CALL
+  && MEM_P (XEXP (*loc, 0))
+  && MEM_EXPR (XEXP (*loc, 0))
+  && TREE_CODE (MEM_EXPR (XEXP (*loc, 0))) == FUNCTION_DECL)
+   {
+ /* Ignore the MEM in first operand of CALL if it is
+a direct call.  */
+ iter.skip_subrtxes ();
+ if (GET_CODE (XEXP (XEXP (*loc, 0), 0)) != SYMBOL_REF)
+   check_mem_read_use (&XEXP (XEXP (*loc, 0), 0), data);
+ if (!CONST_INT_P (XEXP (*loc, 1)))
+   check_mem_read_use (&XEXP (*loc, 1), data);
+   }
 }
 }
 
@@ -3298,12 +3311,19 @@ dse_step5 (void)
  bitmap_clear (v);
}
  else if (insn_info->read_rec
-   || insn_info->non_frame_wild_read)
+  || insn_info->non_frame_wild_read
+  || insn_info->frame_read)
{
- if (dump_file && !insn_info->non_frame_wild_read)
-   fprintf (dump_file, "regular read\n");
-  else if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "non-frame wild read\n");
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ if (!insn_info->non_frame_wild_read
+ && !insn_info->frame_read)
+   fprintf (dump_file, "regular read\n");
+ if (insn_info->non_frame_wild_read)
+   fprintf (dump_file, "non-frame wild read\n");
+ if (insn_info->frame_read)
+   fprintf (dump_file, "frame read\n");
+   }
  scan_reads (insn_info, v, NULL);
}
}


Jakub

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 01:47:36PM +0100, Richard Biener wrote:
> I'm not sure the current code handled this correctly?  At least I
> see nothing on the GIMPLE level that would disallow the DSE:
> 
> int foo ()
> {
>   char a[256];
>   a[0] = 'a';
>   a[1] = 'b';
>   int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0];
>   return fn ();
> }
> 
> even with pure or regular fn this gets DSEd, the call is appearantly
> not an escape point for &a.
> 
> That is, for DSE it would only matter for locals and/or const
> function calls (if stores after the call make the earlier appear dead).
> 
> We did have bugs like PR70128 fixed though (but that was "global memory").
> It might be still broken if the patched function is const and we
> unpatch right after calling...
> 
> So to answer, yes, I think we want to treat this conservatively
> (but we may have existing bugs here).

So like the variant patch I've just posted?

> > In any case, the second hunk fixes an important DSE bug that just got
> > revealed by the former change.
> 
> Do you have a testcase for the (wrong-code?) bug?

FAIL: gcc.dg/torture/pr67690.c   -Os  execution test

that appeared on i686-linux with the first hunk without the second hunk.

Jakub

Re: [RFC] Check number of uses in simplify_cond_using_ranges().

2016-11-04 Thread Richard Biener

On Fri, Nov 4, 2016 at 12:08 PM, Dominik Vogt  wrote:
> On Fri, Nov 04, 2016 at 09:47:26AM +0100, Richard Biener wrote:
>> On Thu, Nov 3, 2016 at 4:03 PM, Dominik Vogt  wrote:
>> > Is VRP the right pass to do this optimisation or should a later
>> > pass rather attempt to eliminate the new use of b_5 instead?  Uli
>> > has brought up the idea a mini "sign extend elimination" pass that
>> > checks if the result of a sign extend could be replaced by the
>> > original quantity in all places, and if so, eliminate the ssa
>> > name.  (I guess that won't help with the above code because l is
>> > used also as a function argument.)  How could a sensible approach
>> > to deal with the situation look like?
>>
>> We run into this kind of situation regularly and for general foldings
>> in match.pd we settled with single_use () even though it is not perfect.
>> Note the usual complaint is not extra extension instructions but
>> the increase of register pressure.
>>
>> This is because it is hard to do better when you are doing local
>> optimization.
>>
>> As for the question on whether VRP is the right pass to do this the
>> answer is two-fold -- VRP has the most precise range information.
>> But the folding itself should be moved to generic code and use
>> get_range_info ().
>
> All right, is this a sensible approach then?

Yes.

>   1. Using has_single_use as in the experimental patch is good
>  enough (provided further testing does not show serious
>  regressions).

I'd approve that, yes.

>   2. Rip the whole top level if-block from simplify_cond_using_ranges().
>   3. Translate that code to match.pd syntax.

Might be some work but yes, that's also desired (you'd lose the ability
to emit the warnings though).

Richard.

>
> Ciao
>
> Dominik ^_^  ^_^
>
> --
>
> Dominik Vogt
> IBM Germany
>

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Richard Biener

On Fri, 4 Nov 2016, Jakub Jelinek wrote:

> On Fri, Nov 04, 2016 at 01:47:36PM +0100, Richard Biener wrote:
> > I'm not sure the current code handled this correctly?  At least I
> > see nothing on the GIMPLE level that would disallow the DSE:
> > 
> > int foo ()
> > {
> >   char a[256];
> >   a[0] = 'a';
> >   a[1] = 'b';
> >   int __attribute__((const)) (*fn) (void) = (int (*) (void)) &a[0];
> >   return fn ();
> > }
> > 
> > even with pure or regular fn this gets DSEd, the call is appearantly
> > not an escape point for &a.
> > 
> > That is, for DSE it would only matter for locals and/or const
> > function calls (if stores after the call make the earlier appear dead).
> > 
> > We did have bugs like PR70128 fixed though (but that was "global memory").
> > It might be still broken if the patched function is const and we
> > unpatch right after calling...
> > 
> > So to answer, yes, I think we want to treat this conservatively
> > (but we may have existing bugs here).
> 
> So like the variant patch I've just posted?

That doesn't handle

int __attribute__((const,noinline))
foo ()
{
  return 1;
}

int bar()
{
  *((int *)foo) + 4 = 2;
  int ret = foo ();
  *((int *)foo) + 4 = 1;
  return ret;
}

right?  patching foo to return 2, calling foo and then unpatching it?

> > > In any case, the second hunk fixes an important DSE bug that just got
> > > revealed by the former change.
> > 
> > Do you have a testcase for the (wrong-code?) bug?
> 
> FAIL: gcc.dg/torture/pr67690.c   -Os  execution test
> 
> that appeared on i686-linux with the first hunk without the second hunk.

Ah ok.

Richard.

>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 01:59:05PM +0100, Richard Biener wrote:
> > So like the variant patch I've just posted?
> 
> That doesn't handle
> 
> int __attribute__((const,noinline))
> foo ()
> {
>   return 1;
> }
> 
> int bar()
> {
>   *((int *)foo) + 4 = 2;
>   int ret = foo ();
>   *((int *)foo) + 4 = 1;
>   return ret;
> }
> 
> right?  patching foo to return 2, calling foo and then unpatching it?

If you mean *((int *)foo + 4) = rather than *((int *)foo) + 4,
then indeed it doesn't.  But GIMPLE optimizers remove that far before.
And I'm not sure if we want to pessimize code just for such hypothetical
cases; users can always add explicit optimization barriers;
or maybe just have a flag whether function text is writable
and disable that flag by default on non-bare metal?
Unless the function is in a writable section, if it isn't the kernel
or embedded target, most likely the function body won't be writable anyway.

Note I'm not sure that even without the patch we'd handle it at the
DSE/sched time - the thing is that the call's argument is a mem:QI,
so effectively represents just the first byte of the function.  So
if you stored to the first byte of the function, it might handle it
"right", but if you store 4 bytes at offset 4, nonoverlapping_memrefs_p
might still figure out that it is comparing a offset 4 4 byte memory
with offset 0 1 byte memory and tell they don't overlap.

Jakub

Re: Simplify X /[ex] 8 == 0

2016-11-04 Thread Richard Biener

On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse  wrote:
> Hello,
>
> this kind of simplification is already handled by fold_comparison, but the
> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In
> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can
> be turned into X==0.
>
> When we have an explicit division by 0, there is a better optimisation
> possible (insert __builtin_unreachable() or __builtin_trap() after that
> statement, as in find_explicit_erroneous_behavior), so I don't touch it.
>
> For the overflow inequality case, it would have been a bit clearer to
> generate
>   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)
> and let that be constant folded instead of having that ugly and error-prone
> test in constant_boolean_node, but I saved one tree ;-)
>
> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are
> in the libitm part of the testsuite, they should be fixed by
> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the
> testsuite when that patch is in.

Ok.

You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you
dont' have to handle inverting the comparison?

Thanks,
Richard.

> 2016-11-07  Marc Glisse  
>
> gcc/
> * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.
> * match.pd (A /[ex] B CMP C): New simplifications.
>
> gcc/testsuite/
> * gcc.dg/tree-ssa/cmpexactdiv.c: New file.
>
> --
> Marc Glisse

Re: Simplify X /[ex] 8 == 0

2016-11-04 Thread Richard Biener

On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener
 wrote:
> On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse  wrote:
>> Hello,
>>
>> this kind of simplification is already handled by fold_comparison, but the
>> code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In
>> particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can
>> be turned into X==0.
>>
>> When we have an explicit division by 0, there is a better optimisation
>> possible (insert __builtin_unreachable() or __builtin_trap() after that
>> statement, as in find_explicit_erroneous_behavior), so I don't touch it.
>>
>> For the overflow inequality case, it would have been a bit clearer to
>> generate
>>   (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)
>> and let that be constant folded instead of having that ugly and error-prone
>> test in constant_boolean_node, but I saved one tree ;-)
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are
>> in the libitm part of the testsuite, they should be fixed by
>> https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the
>> testsuite when that patch is in.
>
> Ok.
>
> You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you
> dont' have to handle inverting the comparison?

Oh, I suppose nothing generates exact divisions by negative numbers.

Richard.

> Thanks,
> Richard.
>
>> 2016-11-07  Marc Glisse  
>>
>> gcc/
>> * fold-const.c (fold_comparison): Ignore EXACT_DIV_EXPR.
>> * match.pd (A /[ex] B CMP C): New simplifications.
>>
>> gcc/testsuite/
>> * gcc.dg/tree-ssa/cmpexactdiv.c: New file.
>>
>> --
>> Marc Glisse

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Richard Biener

On Fri, 4 Nov 2016, Jakub Jelinek wrote:

> On Fri, Nov 04, 2016 at 01:59:05PM +0100, Richard Biener wrote:
> > > So like the variant patch I've just posted?
> > 
> > That doesn't handle
> > 
> > int __attribute__((const,noinline))
> > foo ()
> > {
> >   return 1;
> > }
> > 
> > int bar()
> > {
> >   *((int *)foo) + 4 = 2;
> >   int ret = foo ();
> >   *((int *)foo) + 4 = 1;
> >   return ret;
> > }
> > 
> > right?  patching foo to return 2, calling foo and then unpatching it?
> 
> If you mean *((int *)foo + 4) = rather than *((int *)foo) + 4,
> then indeed it doesn't.  But GIMPLE optimizers remove that far before.
> And I'm not sure if we want to pessimize code just for such hypothetical
> cases; users can always add explicit optimization barriers;
> or maybe just have a flag whether function text is writable
> and disable that flag by default on non-bare metal?
> Unless the function is in a writable section, if it isn't the kernel
> or embedded target, most likely the function body won't be writable anyway.

Not sure if it will pessimize much, but yes, if we want to support it
there's the complication that the const call has no VOPs on GIMPLE
and I'm not sure we want to force VUSEs on all const calls...

> Note I'm not sure that even without the patch we'd handle it at the
> DSE/sched time - the thing is that the call's argument is a mem:QI,
> so effectively represents just the first byte of the function.  So
> if you stored to the first byte of the function, it might handle it
> "right", but if you store 4 bytes at offset 4, nonoverlapping_memrefs_p
> might still figure out that it is comparing a offset 4 4 byte memory
> with offset 0 1 byte memory and tell they don't overlap.

Yeah, but I hope MEM_SIZE_KNOWN_P is false for the MEM, that should
do the trick (at least that's what I remember - I'm not that fluent
with RTL).

Richard.

[PATCH][GCC] Fix native Windows x86 bootstrap failure with self test

2016-11-04 Thread Tamar Christina

Hi all,

The GCC self-test added in r237144 breaks the
native Windows x86 builds (e.g. mingw-w64).
This fixes (PR78196) by explicitly adding /dev/null as
the output file to the GCC self test.

The test essentially does `-xc -S -c /dev/null -fself-test`

`/dev/null` is then converted into the Windows null device `nul`
by the MSYS shell, which is correct. But then the driver adds a
filename to the name, trying to write the output to `nul.s`.
`nul` is a reserved filename on Windows. As such it's invalid
to create this file and the call always fails using CreateFile.

Checked with x86_64-w64-mingw32 and build gets passed self tests
but dies at unrelated libstdc++ and libvtv errors.

Ok for trunk?

Thanks,
Tamar

gcc/

2016-11-03  Tamar Christina  

PR driver/78196
* Makefile.in (SELFTEST_FLAGS): Added -o /dev/null.
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 622d038f6fb50c35ff48bc84a69967071aa17e90..7ecd1e4e726e4cb31b1c9fe22dcb777e71468fb2 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1879,7 +1879,10 @@ rest.cross: specs
 # Specify a dummy input file to placate the driver.
 # Specify -nostdinc to work around missing WIND_BASE environment variable
 # required for *-wrs-vxworks-* targets.
-SELFTEST_FLAGS = -nostdinc -x c /dev/null -S -fself-test
+# Specify -o /dev/null so the output of -S is discarded. More importantly
+# It does not try to create a file with the name "null.s" on POSIX and
+# "nul.s" on Windows. Because on Windows "nul" is a reserved file name.
+SELFTEST_FLAGS = -nostdinc -x c /dev/null -S -fself-test -o /dev/null
 
 # Run the selftests during the build once we have a driver and a cc1,
 # so that self-test failures are caught as early as possible.

[PATCH] Improve vect cost model for PR37150

2016-11-04 Thread Richard Biener


The following implements some easy improvements for the SLP cost model
for PR37150 which shows excess cost for dead loads and permutes
accounted when vectorizing a basic-block.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Still doesn't vectorize the testcase in that PR without 
-fno-vect-cost-model though.  Real improvements are only possible
with re-doing the vectorizer data structures.

Richard.

2016-11-04  Richard Biener  

PR tree-optimization/37150
* tree-vectorizer.h (vect_transform_slp_perm_load): Add n_perms
parameter.
* tree-vect-slp.c (vect_supported_load_permutation_p): Adjust.
(vect_analyze_slp_cost_1): Account for the real number of
permutations emitted and for dead loads.
(vect_transform_slp_perm_load): Add n_perms parameter counting
the number of emitted permutations.
* tree-vect-stmts.c (vectorizable_load): Adjust.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   (revision 241791)
+++ gcc/tree-vectorizer.h   (working copy)
@@ -1166,7 +1168,7 @@ extern int vect_get_known_peeling_cost (
 extern void vect_free_slp_instance (slp_instance);
 extern bool vect_transform_slp_perm_load (slp_tree, vec ,
   gimple_stmt_iterator *, int,
-  slp_instance, bool);
+  slp_instance, bool, unsigned *);
 extern bool vect_slp_analyze_operations (vec slp_instances,
 void *);
 extern bool vect_schedule_slp (vec_info *);
Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 241791)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1461,8 +1461,9 @@ vect_supported_load_permutation_p (slp_i
{
  /* Verify the permutation can be generated.  */
  vec tem;
+ unsigned n_perms;
  if (!vect_transform_slp_perm_load (node, tem, NULL,
-1, slp_instn, true))
+1, slp_instn, true, &n_perms))
{
  dump_printf_loc (MSG_MISSED_OPTIMIZATION,
   vect_location,
@@ -1475,11 +1476,13 @@ vect_supported_load_permutation_p (slp_i
 }
 
   /* For loop vectorization verify we can generate the permutation.  */
+  unsigned n_perms;
   FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (slp_instn), i, node)
 if (node->load_permutation.exists ()
&& !vect_transform_slp_perm_load
  (node, vNULL, NULL,
-  SLP_INSTANCE_UNROLLING_FACTOR (slp_instn), slp_instn, true))
+  SLP_INSTANCE_UNROLLING_FACTOR (slp_instn), slp_instn, true,
+  &n_perms))
   return false;
 
   return true;
@@ -1548,14 +1551,38 @@ vect_analyze_slp_cost_1 (slp_instance in
  stmt = GROUP_FIRST_ELEMENT (stmt_info);
  stmt_info = vinfo_for_stmt (stmt);
  /* Record the cost for the permutation.  */
- record_stmt_cost (body_cost_vec, ncopies_for_cost, vec_perm,
+ unsigned n_perms;
+ vect_transform_slp_perm_load (node, vNULL, NULL,
+   ncopies_for_cost, instance, true,
+   &n_perms);
+ record_stmt_cost (body_cost_vec, n_perms, vec_perm,
stmt_info, 0, vect_body);
- /* And adjust the number of loads performed.  */
  unsigned nunits
= TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info));
- ncopies_for_cost
-   = (GROUP_SIZE (stmt_info) - GROUP_GAP (stmt_info)
-  + nunits - 1) / nunits;
+ /* And adjust the number of loads performed.  This handles
+redundancies as well as loads that are later dead.  */
+ auto_sbitmap perm (GROUP_SIZE (stmt_info));
+ bitmap_clear (perm);
+ for (i = 0; i < SLP_TREE_LOAD_PERMUTATION (node).length (); ++i)
+   bitmap_set_bit (perm, SLP_TREE_LOAD_PERMUTATION (node)[i]);
+ ncopies_for_cost = 0;
+ bool load_seen = false;
+ for (i = 0; i < GROUP_SIZE (stmt_info); ++i)
+   {
+ if (i % nunits == 0)
+   {
+ if (load_seen)
+   ncopies_for_cost++;
+ load_seen = false;
+   }
+ if (bitmap_bit_p (perm, i))
+   load_seen = true;
+   }
+ if (load_seen)
+   ncopies_for_cost++;
+ gcc_assert (ncopies_for_cost
+ <= (GROUP_SIZE (stmt_info) - GROUP_GAP (stmt_info)
+ + nunits - 1) / nunits)

Re: [PATCH] Convert character arrays to string csts

2016-11-04 Thread Martin Liška

On 11/03/2016 01:46 PM, Richard Biener wrote:
> On Thu, Nov 3, 2016 at 1:12 PM, Martin Liška  wrote:
>> Hello.
>>
>> This is small follow-up of the patches I sent to string folding.
>> The patch transforms character array defined in an initializer to string
>> constant:
>>
>> +const char global[] = {'a', 'b', 'c', 'd', '\0'};
>>
>> Apart from that, it also enables string folding of local symbols like:
>> +  const char local[] = "abcd";
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Hmm, does this handle
> 
> const char global[] = {'a', [4] = 'd', 'b', [3] = '\0'};
> 
> correctly?

It hasn't, fixed in the v2. As ctor indices are sorted in increasing order,
I only check that there are no holes.

> 
> I think you need to check that 'key' is increasing and there are no gaps
> (ISTR constructor elements are sorted but gaps can still appear).
> 
> + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node))
> 
> please instead check the element type of the array constructor.  I'd also
> use a stricter check like TYPE_MAIN_VARIANT (type) == char_type_node
> to avoid changing non-strings like unsigned / signed char.

Ok, v2 does: TYPE_MAIN_VARIANT (type) == char_type_node

> 
> Finally I'm a bit worried about doing this for all 'char'
> constructors.  Maybe we
> should restrict this to those with ISPRINT () entries?

Also done for all characters except the trailing \0 character.
I'll send patch in the next email.

Martin

> 
> Richard.
> 
> 
>> Martin

Re: Simplify X /[ex] 8 == 0

2016-11-04 Thread Marc Glisse


On Fri, 4 Nov 2016, Richard Biener wrote:


On Fri, Nov 4, 2016 at 2:15 PM, Richard Biener
 wrote:

On Fri, Nov 4, 2016 at 1:34 PM, Marc Glisse  wrote:

Hello,

this kind of simplification is already handled by fold_comparison, but the
code is common with TRUNC_DIV_EXPR, which yields suboptimal code. In
particular, for an unsigned number, X/8==0 means x<=7, while X/[ex]8==0 can
be turned into X==0.

When we have an explicit division by 0, there is a better optimisation
possible (insert __builtin_unreachable() or __builtin_trap() after that
statement, as in find_explicit_erroneous_behavior), so I don't touch it.

For the overflow inequality case, it would have been a bit clearer to
generate
  (cmp { build_zero_cst (TREE_TYPE (@0)); } @2)
and let that be constant folded instead of having that ugly and error-prone
test in constant_boolean_node, but I saved one tree ;-)

Bootstrap+regtest on powerpc64le-unknown-linux-gnu, all the regressions are
in the libitm part of the testsuite, they should be fixed by
https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html , I'll rerun the
testsuite when that patch is in.


Ok.

You fail to handle A /[ex] -2 < 2?  Is that on purpose?  Or just lazy so you
dont' have to handle inverting the comparison?


Oh, I suppose nothing generates exact divisions by negative numbers.


Yes, that :-) I think I'd rather wait until I can test it before 
simplifying those too much.


--
Marc Glisse

Re: [PATCH] combine lhs zero_extract fix (PR78186)

2016-11-04 Thread Christophe Lyon

On 3 November 2016 at 17:01, Segher Boessenkool
 wrote:
> I managed to forget to mask the thing inserted.  Tested on powerpc64-linux
> {-m32,-m64}, and Bin tested on arm.  Applying to trunk.
>
>
> Segher
>
>
> 2016-11-03  Segher Boessenkool  
>
> PR rtl-optimization/78186
> * combine.c (change_zero_ext): Mask the RHS of a zero_extract as
> well, when converting to IOR.
>

Hi,

Since this commit I have noticed execution failures on "old" arm targets:

  gcc.dg/torture/pr48124-4.c   -O1  execution test
  gcc.dg/torture/pr48124-4.c   -O2  execution test
  gcc.dg/torture/pr48124-4.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
  gcc.dg/torture/pr48124-4.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test
  gcc.dg/torture/pr48124-4.c   -O3 -g  execution test
  gcc.dg/torture/pr48124-4.c   -Os  execution test

For instance on target arm-none-linux-gnueabi --with-cpu=cortex-a9
--with-mode=arm
and running the tests with -march=armv5t

or on arm-none-eabi with default mode/cpu and no special runtest flags
if that's easier.

Christophe




> ---
>  gcc/combine.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 7c21fe4..7ed0a62 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11224,6 +11224,9 @@ change_zero_ext (rtx pat)
>rtx x = gen_rtx_AND (mode, reg, immed_wide_int_const (mask, mode));
>rtx y = simplify_gen_binary (ASHIFT, mode, SET_SRC (pat),
>GEN_INT (offset));
> +  wide_int mask2 = wi::shifted_mask (offset, width, false, reg_width);
> +  y = simplify_gen_binary (AND, mode, y,
> +  immed_wide_int_const (mask2, mode));
>rtx z = simplify_gen_binary (IOR, mode, x, y);
>SUBST (SET_DEST (pat), reg);
>SUBST (SET_SRC (pat), z);
> --
> 1.9.3
>

Re: [PATCH] Convert character arrays to string csts

2016-11-04 Thread Martin Liška

On 11/03/2016 02:00 PM, Jan Hubicka wrote:
>> On 11/03/2016 01:12 PM, Martin Liška wrote:
>>> +  tree init = DECL_INITIAL (decl);
>>> +  if (init
>>> +  && TREE_READONLY (decl)
>>> +  && can_convert_ctor_to_string_cst (init))
>>> +DECL_INITIAL (decl) = build_string_cst_from_ctor (init);
>>
>> I'd merge these two new functions since they're only ever called
>> together. We'd then have something like
>>
>> if (init && TREE_READONLY (decl))
>>   init = convert_ctor_to_string_cst (init);
>> if (init)
>>   DECL_INITIAL (decl) = init;

Done.

>>
>> I'll defer to Jan on whether finalize_decl seems like a good place
>> to do this.
> 
> I think finalize_decl may be bit too early because frontends may expects the
> ctors to be in a way they produced them.  We only want to convert those arrays
> seen by middle-end so I would move the logic to varpool_node::analyze
> 
> Otherwise the patch seems fine to me.
> 
> Honza
>>
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c 
>>> b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>> index 283bd1c..b2d1fd5 100644
>>> --- a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>>> @@ -4,12 +4,15 @@
>>> char *buffer1;
>>> char *buffer2;
>>>
>>> +const char global[] = {'a', 'b', 'c', 'd', '\0'};
>>> +
>>> #define SIZE 1000
>>>
>>> int
>>> main (void)
>>> {
>>>   const char* const foo1 = "hello world";
>>> +  const char local[] = "abcd";
>>>
>>>   buffer1 = __builtin_malloc (SIZE);
>>>   __builtin_strcpy (buffer1, foo1);
>>> @@ -45,6 +48,10 @@ main (void)
>>> __builtin_abort ();
>>>   if (__builtin_memchr (foo1, null, 12) != foo1 + 11)
>>> __builtin_abort ();
>>> +  if (__builtin_memchr (global, null, 5) == 0)
>>> +__builtin_abort ();
>>> +  if (__builtin_memchr (local, null, 5) == 0)
>>> +__builtin_abort ();
>>
>> How is that a meaningful test? This seems to work even with an
>> unpatched gcc. I'd have expected something that shows a benefit for
>> doing this conversion, and maybe also a test that shows it isn't
>> done in cases where it's not allowed.

It's meaningful as it scans that there's no __builtin_memchr in optimized dump.
I'm adding new tests that does the opposite test.

>>
>>> tree
>>> -build_string_literal (int len, const char *str)
>>> +build_string_literal (int len, const char *str, bool build_addr_expr)
>>
>> New arguments should be documented in the function comment.

Yep, improved.

>>
>>> +/* Return TRUE when CTOR can be converted to a string constant.  */
>>
>> "if", not "when".

Done.

>>
>>> +  unsigned HOST_WIDE_INT elements = CONSTRUCTOR_NELTS (ctor);
>>> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value)
>>> +{
>>> +  if (key == NULL_TREE
>>> + || TREE_CODE (key) != INTEGER_CST
>>> + || !tree_fits_uhwi_p (value)
>>> + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node))
>>> +   return false;
>>
>> Shouldn't all elements have the same type, or do you really have to
>> call useless_type_conversion in a loop?
>>
>>> +  /* Allow zero character just at the end of a string.  */
>>> +  if (integer_zerop (value))
>>> +   return idx == elements - 1;
>>
>> Don't you also have to explicitly check it's there?
>>
>>
>> Bernd


Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin

>From 4a2bd43ad10cfb0467dde15ca0be0deba8194ea7 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 17 Oct 2016 15:24:46 +0200
Subject: [PATCH] Convert character arrays to string csts

gcc/testsuite/ChangeLog:

2016-11-04  Martin Liska  

	* gcc.dg/tree-ssa/builtins-folding-gimple.c (main): Add new
	tests.
	* gcc.dg/tree-ssa/builtins-folding-gimple-ub.c (main): Likewise.

gcc/ChangeLog:

2016-11-04  Martin Liska  

	* gimplify.c (gimplify_decl_expr): Emit INIT_EXPR just if it
	cannot be converted to a string constant.
	(gimplify_init_constructor): Create string constant for local
	variables (if possible).
	* tree.c (convert_ctor_to_string_cst): New function.
	(build_string_literal): Add new argument.
	(can_convert_ctor_to_string_cst): New function.
	* tree.h: Declare new functions.
	* varpool.c (ctor_for_folding): Return ctor for local variables.
	(varpool_node::analyze): Convert character array ctors to a
	string constant (if possible).
---
 gcc/gimplify.c | 16 -
 .../gcc.dg/tree-ssa/builtins-folding-gimple-ub.c   | 20 +-
 .../gcc.dg/tree-ssa/builtins-folding-gimple.c  |  7 ++
 gcc/tree.c | 83 --
 gcc/tree.h |  5 +-
 gcc/varpool.c  | 14 +++-
 6 files changed, 134 insertions(+), 11 deletions(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 1531582..32f0909 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1495,7 +1495,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
 	{
 	  if (!

Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly

2016-11-04 Thread Christophe Lyon

On 30 October 2016 at 21:40, Thomas Koenig  wrote:
> Hi Jerry,
>
>> See patch below which is very simple. We were handling the case where we
>> tabbed left of current position in the record by clearing the seen_eor
>> flag. We were not handling the case where we tab toward the right and so
>> are still at eor.
>>
>> New test case attached. Regression tested on x86-64-linux.
>>
>> OK to commit?
>>
>> This is a regression relative to g77. Its simple enough I think it
>> should go to 5 and 6 as well.
>
>
> I think so too.  OK for all affected branches.
>
> Thanks for the patch!
>

Hi,

The new test fails at execution on arm and aarch64:

  gfortran.dg/fmt_t_9.f   -O0  execution test
  gfortran.dg/fmt_t_9.f   -O1  execution test
  gfortran.dg/fmt_t_9.f   -O2  execution test
  gfortran.dg/fmt_t_9.f   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
  gfortran.dg/fmt_t_9.f   -O3 -g  execution test
  gfortran.dg/fmt_t_9.f   -Os  execution test

seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to
larger granularity in validations.

Christophe



> Thomas
>

[PING][PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-11-04 Thread Richard Biener

On Fri, 28 Oct 2016, Richard Biener wrote:

> 
> These are the C (and ObjC) Frontend changes required by the GIMPLE
> Frontend which is now itself contained in c/gimple-parser.[ch].
> 
> Most changes are due to a new c-parser.h header where we export
> stuff from the C parser that the GIMPLE frontend requires.  Other
> changes include new __GIMPLE and __PHI keywords, handling
> __GIMPLE as new declspec and dispatching to the GIMPLE parser
> for __GIMPLE marked function definitions.
> 
> We'd like to include the GIMPLE parser for GCC 7, as the parser
> is pretty self-contained (and now works to a good extent) it can
> be improved during stage3 or when testcases show that it needs
> improvement.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu (together with [2/2])
> for C, ObjC.
> 
> Ok for trunk?

Ping.

Thanks,
Richard.

> Thanks,
> Richard.
> 
> 2016-10-28  Prasad Ghangal  
>   Richard Biener  
> 
>   c/
>   * Make-lang.in (C_AND_OBJC_OBJS): Add gimple-parser.o.
>   * config-lang.in (gtfiles): Add c/c-parser.h.
>   * c-tree.h (enum c_declspec_word): Add cdw_gimple.
>   (struct c_declspecs): Add gimple_pass member and gimple_p flag.
>   * c-parser.c (enum c_id_kind, struct c_token,
>   c_parser_next_token_is, c_parser_next_token_is_not,
>   c_parser_next_token_is_keyword,
>   enum c_lookahead_kind, enum c_dtr_syn, enum c_parser_prec):
>   Split out to ...
>   * c-parser.h: ... new header.
>   * c-parser.c: Include c-parser.h and gimple-parser.h.
>   (c_parser_peek_token, c_parser_peek_2nd_token,
>   c_token_starts_typename, c_parser_next_token_starts_declspecs,
>   c_parser_next_tokens_start_declaration, c_parser_consume_token,
>   c_parser_error, c_parser_require, c_parser_skip_until_found,
>   c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token,
>   c_parser_cast_expression): Export.
>   (c_parser_tokens_buf): New function.
>   (c_parser_error): Likewise.
>   (c_parser_set_error): Likewise.
>   (c_parser_declspecs): Handle RID_GIMPLE.
>   (c_parser_declaration_or_fndef): Parse __GIMPLE marked body
>   via c_parser_parse_gimple_body.
>   * c-parser.h (c_parser_peek_token, c_parser_peek_2nd_token,
>   c_token_starts_typename, c_parser_next_token_starts_declspecs,
>   c_parser_next_tokens_start_declaration, c_parser_consume_token,
>   c_parser_error, c_parser_require, c_parser_skip_until_found,
>   c_parser_declspecs, c_parser_declarator, c_parser_peek_nth_token,
>   c_parser_cast_expression): Declare.
>   (struct c_parser): Declare forward.
>   (c_parser_tokens_buf): Declare.
> (c_parser_error): Likewise.
> (c_parser_set_error): Likewise.
>   * gimple-parser.c: New file.
>   * gimple-parser.h: Likewise.
> 
>   obj-c/
>   * config-lang.in (gtfiles): Add c/c-parser.h.
> 
>   c-family/
>   * c-common.h (c_common_resword): Add RID_GIMPLE, RID_PHI types.
>   * c-common.h (enum rid): Add RID_GIMPLE, RID_PHI.
>   * c.opt (fgimple): New option.
> 
>   * doc/invoke.texi (fgimple): Document.
>  
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 307862b..2997c83 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -435,6 +435,8 @@ const struct c_common_resword c_common_reswords[] =
>{ "__underlying_type", RID_UNDERLYING_TYPE, D_CXXONLY },
>{ "__volatile",RID_VOLATILE,   0 },
>{ "__volatile__",  RID_VOLATILE,   0 },
> +  { "__GIMPLE",  RID_GIMPLE, D_CONLY },
> +  { "__PHI", RID_PHI,D_CONLY },
>{ "alignas",   RID_ALIGNAS,D_CXXONLY | D_CXX11 | D_CXXWARN 
> },
>{ "alignof",   RID_ALIGNOF,D_CXXONLY | D_CXX11 | D_CXXWARN 
> },
>{ "asm",   RID_ASM,D_ASM },
> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index 547bab2..1fbe060 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -118,6 +118,12 @@ enum rid
>  
>RID_FRACT, RID_ACCUM, RID_AUTO_TYPE, RID_BUILTIN_CALL_WITH_STATIC_CHAIN,
>  
> +  /* "__GIMPLE", for the GIMPLE-parsing extension to the C frontend. */
> +  RID_GIMPLE,
> +
> +  /* "__PHI", for parsing PHI function in GIMPLE FE.  */
> +  RID_PHI,
> +
>/* C11 */
>RID_ALIGNAS, RID_GENERIC,
>  
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 458d453..24d3b8e 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -200,6 +200,10 @@ F
>  Driver C ObjC C++ ObjC++ Joined Separate MissingArgError(missing path after 
> %qs)
>  -F  Add  to the end of the main framework include path.
>  
> +fgimple
> +C Var(flag_gimple) Init(0)
> +Enable parsing GIMPLE.
> +
>  H
>  C ObjC C++ ObjC++
>  Print the name of header files as they are used.
> diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
> index 72c9ae7..cd7108b 100644
> --- a/gcc/c/Make-lang.in
> +++ b/gcc/c/Make-lang.in
> @@ -51,7 +51,

Re: [PATCH][1/2] GIMPLE Frontend, C FE parts (and GIMPLE parser)

2016-11-04 Thread Jakub Jelinek

Hi!

Just 2 nits:

On Fri, Oct 28, 2016 at 01:46:57PM +0200, Richard Biener wrote:
> +/* Return a pointer to the Nth token in PARERs tokens_buf.  */

PARSERs ?

> @@ -454,7 +423,7 @@ c_lex_one_token (c_parser *parser, c_token *token)
>  /* Return a pointer to the next token from PARSER, reading it in if
> necessary.  */
>  
> -static inline c_token *
> +c_token *
>  c_parser_peek_token (c_parser *parser)
>  {
>if (parser->tokens_avail == 0)

I wonder if turning all of these into non-inlines is a good idea.
Can't you move them to the common header instead?

The rest I defer to Joseph or Marek.

Jakub

Re: [libgcc] Protect __TMC_END - TMC_LIST__ == 0

2016-11-04 Thread Marc Glisse


Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html

On Thu, 27 Oct 2016, Marc Glisse wrote:


Hello,

some optimization patch I was working on simplified __TMC_END__ - 
__TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it 
wasn't written __TMC_END__ == __TMC_LIST__ in the first place).


Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-10-27  Marc Glisse  

PR libgcc/77813
* crtstuff.c (deregister_tm_clones, register_tm_clones): Hide
__TMC_END__ behind a passthrough asm.


--
Marc Glisse

Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly

2016-11-04 Thread Christophe Lyon

On 4 November 2016 at 14:48, Christophe Lyon  wrote:
> On 30 October 2016 at 21:40, Thomas Koenig  wrote:
>> Hi Jerry,
>>
>>> See patch below which is very simple. We were handling the case where we
>>> tabbed left of current position in the record by clearing the seen_eor
>>> flag. We were not handling the case where we tab toward the right and so
>>> are still at eor.
>>>
>>> New test case attached. Regression tested on x86-64-linux.
>>>
>>> OK to commit?
>>>
>>> This is a regression relative to g77. Its simple enough I think it
>>> should go to 5 and 6 as well.
>>
>>
>> I think so too.  OK for all affected branches.
>>
>> Thanks for the patch!
>>
>
> Hi,
>
> The new test fails at execution on arm and aarch64:
>
>   gfortran.dg/fmt_t_9.f   -O0  execution test
>   gfortran.dg/fmt_t_9.f   -O1  execution test
>   gfortran.dg/fmt_t_9.f   -O2  execution test
>   gfortran.dg/fmt_t_9.f   -O3 -fomit-frame-pointer -funroll-loops
> -fpeel-loops -ftracer -finline-functions  execution test
>   gfortran.dg/fmt_t_9.f   -O3 -g  execution test
>   gfortran.dg/fmt_t_9.f   -Os  execution test
>
> seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to
> larger granularity in validations.
>

I see it failing on trunk too.

> Christophe
>
>
>
>> Thomas
>>

Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently

2016-11-04 Thread Christophe Lyon

On 24 October 2016 at 12:55, Richard Biener  wrote:
> On Fri, Oct 21, 2016 at 11:46 PM, Bill Schmidt
>  wrote:
>> Hi,
>>
>> I've been meaning for some time now to do a better job handling strength
>> reduction candidates where the stride of the candidate and its basis
>> involves a cast (usually widening) from another type.  The two PRs
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71490 and
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71915 show that we miss
>> strength reduction opportunities in such cases.  Essentially the way I
>> was handling this before required a cast to go back to the original
>> type, which was illegal when this was a narrowing operation or a cast
>> from a wrapping type to a non-wrapping type.
>>
>> This patch improves matters by tracking the effective type of the stride
>> so that we perform arithmetic in that type directly.  This allows us to
>> strength-reduce some cases missed before.  We still have to be careful
>> not to ever perform a narrowing or wrap-to-nonwrap conversion, but these
>> cases don't come up nearly so often in practice with this patch.
>>
>> The patch is pretty straightforward but rather large, so I'd like to
>> leave it up for review for a week or so before committing it -- extra
>> eyes welcome!
>>
>> There is an existing test case that checks for the narrowing problem,
>> and now fails with this patch as it should.  That is, it's now legal to
>> insert the initializer that we formerly knew to be bad for this test.
>> Since the test no longer serves any purpose, I'm deleting it.
>>
>> gcc.dg/tree-ssa/slsr-8.c generates quite different intermediate code now
>> than when I first added it to the test suite.  As a result of various
>> optimization changes, it's no longer maintained as a single block;
>> instead, the optimizable computations are sunk into two legs of a
>> conditional.  This exposed another similar case, leading to the bug
>> reports.  This test now passes, but I had to adjust it for the new code
>> generation we get.  I also added some commentary to indicate what we
>> expect to happen in that test, since it isn't too obvious.
>>
>> I've bootstrapped this and tested it on powerpc64le-unknown-linux-gnu
>> with no regressions.  To avoid the Wrath of Octoploid ;), I've also
>> tested it against ffmpeg using an x86_64-pc-linux-gnu cross with -O3
>> -march=amdfam10, also with no failures.  I've also verified that we hit
>> this code about 35 times in the test suite, so it looks like we have
>> some decent additional test coverage.
>>
>> Thanks in advance to anyone willing to take a look.
>
> LGTM.
>

Hi,

Since this was committed, I'm seeing a failure in
  gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9
on aarch64 targets.

Christophe


> Richard.
>
>> Bill
>>
>>
>> [gcc]
>>
>> 2016-10-21  Bill Schmidt  
>>
>> PR tree-optimization/71915
>> * gimple-ssa-strength-reduction.c (struct slsr_cand_d): Add
>> stride_type field.
>> (find_basis_for_base_expr): Require stride types to match when
>> seeking a basis.
>> (alloc_cand_and_find_basis): Record the stride type.
>> (slsr_process_phi): Pass stride type to alloc_cand_and_find_basis.
>> (backtrace_base_for_ref): Pass types to legal_cast_p_1 rather than
>> the expressions having those types.
>> (slsr_process_ref): Pass stride type to alloc_cand_and_find_basis.
>> (create_mul_ssa_cand): Likewise.
>> (create_mul_imm_cand): Likewise.
>> (create_add_ssa_cand): Likewise.
>> (create_add_imm_cand): Likewise.
>> (legal_cast_p_1): Change interface to accept types rather than the
>> expressions having those types.
>> (legal_cast_p): Pass types to legal_cast_p_1.
>> (slsr_process_cast): Pass stride type to
>> alloc_cand_and_find_basis.
>> (slsr_process_copy): Likewise.
>> (dump_candidate): Display stride type when a cast exists.
>> (create_add_on_incoming_edge): Introduce a cast when necessary for
>> the stride type.
>> (analyze_increments): Change the code checking for invalid casts
>> to rely on the stride type, and update the documentation and
>> example.  Change the code checking for pointer multiplies to rely
>> on the stride type.
>> (insert_initializers): Introduce a cast when necessary for the
>> stride type.  Use the stride type for the type of the initializer.
>>
>> [gcc/testsuite]
>>
>> 2016-10-21  Bill Schmidt  
>>
>> PR tree-optimization/71915
>> * gcc.dg/tree-ssa/pr54245.c: Delete.
>> * gcc.dg/tree-ssa/slsr-8.c: Adjust for new optimization and
>> document why.
>>
>>
>> Index: gcc/gimple-ssa-strength-reduction.c
>> ===
>> --- gcc/gimple-ssa-strength-reduction.c (revision 241379)
>> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
>> @@ -246,6 +246,13 @@

Re: [PATCH] Five patches for std::experimental::filesystem

2016-11-04 Thread Christophe Lyon

On 2 November 2016 at 10:09, Christophe Lyon  wrote:
> On 27 October 2016 at 15:34, Jonathan Wakely  wrote:
>> On 26/10/16 09:24 +0200, Christophe Lyon wrote:
>>>
>>> Hi Jonathan,
>>>
>>> On 25 October 2016 at 17:32, Jonathan Wakely  wrote:

 Two more fixes for the filesystem TS, and improved tests.

   Handle negative times in filesystem::last_write_time
   * src/filesystem/ops.cc
(last_write_time(const path&, file_time_type, error_code&)):
 Handle
negative times correctly.
* testsuite/experimental/filesystem/operations/last_write_time.cc:
Test writing file times.

Fix error handling in copy_file and equivalent
   * src/filesystem/ops.cc (do_copy_file): Report an error if
 source
 or
destination is not a regular file (LWG 2712).
(equivalent): Fix error handling and result when only one file
 exists.
* testsuite/experimental/filesystem/operations/copy.cc: Remove
 files
created by tests. Test copying directories.
* testsuite/experimental/filesystem/operations/copy_file.cc:
 Remove
files created by tests.
* testsuite/experimental/filesystem/operations/equivalent.cc: New.
* testsuite/experimental/filesystem/operations/is_empty.cc: New.
* testsuite/experimental/filesystem/operations/read_symlink.cc:
 Remove
file created by test.
* testsuite/experimental/filesystem/operations/remove_all.cc: New.
* testsuite/util/testsuite_fs.h (~scoped_file): Only try to remove
file if path is non-empty, to support removal by other means.

 Tested x86_64-linux, committed to trunk.

>>> I can see failures in
>>> experimental/filesystem/operations/last_write_time.cc after your
>>> committed this patch:
>>>
>>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/experimental/filesystem/operations/last_write_time.cc:127:
>>> void test02(): Assertion 'last_write_time(f.path) == time' failed.
>>> on arm*linux* and aarch64*linux* targets.
>>
>>
>> That test will fail for targets where _GLIBCXX_USE_UTIMENSAT is not
>> defined, as they use utime() instead which only supports second
>> granularity.
>>
>> This should solve it, by only checking that the file times are within
>> one second of the expected value.
>>
>
> Hi Jonathan,
> Indeed your patch fixes the problem I reported.
> Sorry for the delay, I was on holidays.
>

 but experimental/filesystem/iterators/recursive_directory_iterator.cc
now fails at execution on "old" arm targets (when forcing -march=armv5t).

Christophe

> Thanks,
>
> Christophe
>
>
>>
>> Tested x86_64-linux, committed to trunk.

Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently

2016-11-04 Thread Bill Schmidt

On Nov 4, 2016, at 9:10 AM, Christophe Lyon  wrote:
> 
> 
> Hi,
> 
> Since this was committed, I'm seeing a failure in
>  gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9
> on aarch64 targets.

Hi Christophe,

Best open a bug report or I will likely lose this in the crush.  If you don't 
mind,
please include the generated assembly file so I can see why the number of *'s
is off, and provide the output from -fdump-tree-slsr-details.  I imagine this is
a minor issue.  Please CC me as wschm...@gcc.gnu.org.

Thanks!
Bill

Re: [libgcc] Protect __TMC_END - TMC_LIST__ == 0

2016-11-04 Thread Andrew Pinski

On Fri, Nov 4, 2016 at 7:08 AM, Marc Glisse  wrote:
> Ping https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02220.html

I think this is obvious.

Thanks,
Andrew

>
>
> On Thu, 27 Oct 2016, Marc Glisse wrote:
>
>> Hello,
>>
>> some optimization patch I was working on simplified __TMC_END__ -
>> __TMC_LIST__ == 0 to false, which is not wanted (I assume that's why it
>> wasn't written __TMC_END__ == __TMC_LIST__ in the first place).
>>
>> Bootstrap+regtest on powerpc64le-unknown-linux-gnu.
>>
>> 2016-10-27  Marc Glisse  
>>
>> PR libgcc/77813
>> * crtstuff.c (deregister_tm_clones, register_tm_clones): Hide
>> __TMC_END__ behind a passthrough asm.
>
>
> --
> Marc Glisse

Re: [PATCH, RFC] Fix PR71915, PR71490: Handle casts on strides consistently

2016-11-04 Thread Christophe Lyon

On 4 November 2016 at 15:36, Bill Schmidt  wrote:
> On Nov 4, 2016, at 9:10 AM, Christophe Lyon  
> wrote:
>>
>>
>> Hi,
>>
>> Since this was committed, I'm seeing a failure in
>>  gcc.dg/tree-ssa/slsr-8.c scan-tree-dump-times optimized " \\* " 9
>> on aarch64 targets.
>
> Hi Christophe,
>
> Best open a bug report or I will likely lose this in the crush.  If you don't 
> mind,
> please include the generated assembly file so I can see why the number of *'s
> is off, and provide the output from -fdump-tree-slsr-details.  I imagine this 
> is
> a minor issue.  Please CC me as wschm...@gcc.gnu.org.
>

OK, I've created pr78210 for this.

Christophe

> Thanks!
> Bill

[Ping] Re: [PATCH] Start adding target-specific selftests

2016-11-04 Thread David Malcolm

(ping)

https://gcc.gnu.org/ml/gcc-patches/2016-10/msg01785.html 

On Fri, 2016-10-21 at 13:45 -0400, David Malcolm wrote:
> On Fri, 2016-10-21 at 12:04 +0200, Bernd Schmidt wrote:
> > On 10/21/2016 02:36 AM, David Malcolm wrote:
> > > +  /* Test dumping of hard regs.  This is inherently target
> > > -specific due
> > > + to the name.  */
> > > +#ifdef I386_OPTS_H
> > > +  ASSERT_RTL_DUMP_EQ ("(reg:SI ax)", gen_raw_REG (SImode, 0));
> > > +#endif
> > 
> > Generally putting in target dependencies like this isn't something
> > we
> > like to do. The patch is OK without this part, and we can revisit
> > this,
> > but maybe there wants to be a target hook for running target
> > -specific
> > selftests.
> 
> Thanks.  I removed the above target-specific part, and committed it
> as r241405 (having reverified bootstrap®rtest on x86_64-pc-linux
> -gnu).
> 
> The following patch implements a target hook for running target
> -specific
> selftests.
> 
> It implements the above test for dumping of hard regs, putting it
> within i386.c.
> 
> It's rather trivial, but I have followups that add further
> target-specific tests, so hopefully this foundation is OK.
> 
> Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?
>  
> > > +  ASSERT_RTL_DUMP_EQ ("(cjump_insn (set (pc)\n"
> > > +   "(label_ref 0))\n"
> > > +   " (nil))\n",
> > > +   jump_insn);
> > >  }
> > 
> > I do wonder about the (nil)s and whether we can eliminate them.
> 
> I hope to.
> 
> gcc/ChangeLog:
>   * config/i386/i386.c: Include "selftest.h" and "selftest
> -rtl.h".
>   (selftest::ix86_test_dumping_hard_regs): New function.
>   (selftest::ix86_run_selftests): New function.
>   (TARGET_RUN_TARGET_SELFTESTS): When CHECKING_P, wire this up to
>   selftest::ix86_run_selftests.
>   * doc/tm.texi.in (TARGET_RUN_TARGET_SELFTESTS): New.
>   * doc/tm.texi: Regenerate
>   * rtl-tests.c: Include "selftest-rtl.h".
>   (selftest::assert_rtl_dump_eq): Make non-static.
>   (ASSERT_RTL_DUMP_EQ): Move to selftest-rtl.h.
>   (selftest::test_dumping_regs): Update comment.
>   * selftest-rtl.h: New file.
>   * selftest-run-tests.c: Include "target.h".
>   (selftest::run_tests): If non-NULL, call
>   targetm.run_target_selftests.
>   * target.def (run_target_selftests): New hook.
> ---
>  gcc/config/i386/i386.c   | 34 ++
>  gcc/doc/tm.texi  |  4 
>  gcc/doc/tm.texi.in   |  2 ++
>  gcc/rtl-tests.c  | 10 +++---
>  gcc/selftest-rtl.h   | 45
> +
>  gcc/selftest-run-tests.c |  5 +
>  gcc/target.def   |  6 ++
>  7 files changed, 99 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/selftest-rtl.h
> 
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 3e6f8fd..8f6ceb4 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -77,6 +77,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "case-cfn-macros.h"
>  #include "regrename.h"
>  #include "dojump.h"
> +#include "selftest.h"
> +#include "selftest-rtl.h"
>  
>  /* This file should be included last.  */
>  #include "target-def.h"
> @@ -50365,6 +50367,33 @@ ix86_addr_space_zero_address_valid
> (addr_space_t as)
>  #undef TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
>  #define TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID
> ix86_addr_space_zero_address_valid
>  
> +/* Target-specific selftests.  */
> +
> +#if CHECKING_P
> +
> +namespace selftest {
> +
> +/* Verify that hard regs are dumped as expected (in compact mode). 
>  */
> +
> +static void
> +ix86_test_dumping_hard_regs ()
> +{
> +  ASSERT_RTL_DUMP_EQ ("(reg:SI ax)", gen_raw_REG (SImode, 0));
> +  ASSERT_RTL_DUMP_EQ ("(reg:SI dx)", gen_raw_REG (SImode, 1));
> +}
> +
> +/* Run all target-specific selftests.  */
> +
> +static void
> +ix86_run_selftests (void)
> +{
> +  ix86_test_dumping_hard_regs ();
> +}
> +
> +} // namespace selftest
> +
> +#endif /* CHECKING_P */
> +
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_RETURN_IN_MEMORY
>  #define TARGET_RETURN_IN_MEMORY ix86_return_in_memory
> @@ -50840,6 +50869,11 @@ ix86_addr_space_zero_address_valid
> (addr_space_t as)
>  #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
>  #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
>  
> +#if CHECKING_P
> +#undef TARGET_RUN_TARGET_SELFTESTS
> +#define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests
> +#endif /* #if CHECKING_P */
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
>  #include "gt-i386.h"
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 29dc73b..7efcf57 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -11821,3 +11821,7 @@ All and all it does not take long to convert
> ports that the
>  maintainer is familiar with.
>  
>  @end defmac
> +
> +@deftypefn {Target Hook} void TARGET_RUN_TARGET_SELFTESTS (void)
> +If selftests are enabled, run

Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.

2016-11-04 Thread Martin Jambor

Hi Pekka,

On Sat, Oct 29, 2016 at 02:58:29PM +0300, Pekka Jääskeläinen wrote:
> Hi Martin,
> 
> Thanks for the comments and suggestions. Replies inline:
> 
> On Thu, Oct 20, 2016 at 6:10 PM, Martin Jambor  wrote:
> > - Still quite few things need to be documented better, e.g.:
> >   + brig_to_generic::get_mangled_name_tmpl and to a lesser extent
> > brig_to_generic::get_mangled_name.  It should be clear what is the
> > intended difference in usage of the two (specially since the
> > former is a template and so the parameter does not give that much
> > of a hint to the reader)
> 
> Added more comments.

thanks.

> 
> >   + the visitor classes need some description so that the first time
> > readers see them, they understand what they are for and what they
> > visit (i.e. what "visiting" even means).
> 
> This is an adaptation of the classic gang of four Visitor design pattern.
> I added a reference to it in a comment.

Thanks.  As you can see, I am not a C++ person and do not know names
of common patterns.  For me, "Gang of Four" is a reference to history
of China and not something one would want to emulate.

> 
> > - I know it was me who told you to use gcc_assert and gcc_unreachable
> >   instead of internal_error.  The thing is, often the error is clearly
> >   not an internal error but an error in input.  I think that we should
> >   plan to handle these cases differently and report the issues better
> >   to the user, give a meaningful error message together width the
> >   section and the offset there when it was encountered.  I am not
> >   asking you to audit all asserts now and convert those in this
> >   category but it would be nice to have a mechanism to easily do so
> >   (and convert a few obvious places), so that we can convert these as
> >   we bump into them.
> 
> I'm not sure about this. BRIG FE is a rather special case as we
> assume HSAILasm has been used to parse and error check the original
> HSAIL text to the binary BRIG format which it consumes.
> 
> Of course HSAILasm can have bugs, but how much we should produce human
> readable error messages to help debugging HSAILasm is another thing.
> In case the BRIG FE
> fails to consume the input, it means either the BRIG is corrupted for
> a reason or another,
> but typically is not a human error (those should be caught be HSAILasm).

I personally primarily want this for debugging purposes, and we should
try to eventually report errors in a more comprehensible way than
HSAILasm.  But more generally, and more importantly, if the input,
whether human readable or not, is incorrect, the compiler should not
produce an "internal error."  If that happens, users will file bugs
against gcc when in fact it is the generating tool that is broken.
One you maintain the FE, you personally will not want this :-)

> 
> "File format not recognized" error is one that might be useful though.
> I added a check for the BRIG magic number and the supported version (1.0).
> 
> Perhaps we should add error printouts later on case by case basis when we
> see which error cases can be useful and worth reporting in a human readable
> graceful manner? It can be as easy as converting the internal_error
> to fatal_error or similar in that case.

Yeah, that is what I meant, together with printing some information
about offsets in different sections where the error is.

> 
> > - A very minor suggestion: In GCC it is customary to write TODO as one
> >   word.  We generally do not use "TO OPTIMIZE", that is just a TODO
> >   (as opposed to a FIXME, which hints that something is at least a bit
> >   wrong here).  I think you can keep your way if you want but for
> >   example I do have emacs highlighting set up for the traditional
> >   formats.
> 
> Converted all to TODO.
> 
> > - You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it?
> >   That is fine, I don't think anybody else does that now anyway, I
> >   just got curious reading the ipa analysis.
> 
> Right.
> 
> > - brig-lang.c:
> >   + x_flag_whole_program = 0; - talk about this with Honza
> 
> I guess this is note to yourself? As we agreed I didn't try whole program
> optimizations yet.  I might later when optimizing for a target. It will
> be really useful, I agree. Now that it used the proper builtin way, it
> might work more easily.

Right, it was a note for myself.

> 
> >   + brig_langhook_type_for_size: has several issues.  Either always
> > return a type like go or return error_mark_node instead of NULL.
> > Also, do not count on long being 64-bit.  I would just copy what
> > go does.  Or lto.
> 
> Copied the go's version, it should work with BRIG too.

Great.

> 
> >   + convert: did you avoid using convert_to_vector deliberately?  The
> > size check seems genuinely useful.
> 
> BRIG/HSAIL is a bit special case due to its "untyped" variables (registers),
> I use bit level casts in a lot of places to avoid accidental type conversions
> or sign extensions.

I w

Re: [PATCH] Start adding target-specific selftests

2016-11-04 Thread Bernd Schmidt


On 10/21/2016 07:45 PM, David Malcolm wrote:


gcc/ChangeLog:
* config/i386/i386.c: Include "selftest.h" and "selftest-rtl.h".
(selftest::ix86_test_dumping_hard_regs): New function.
(selftest::ix86_run_selftests): New function.
(TARGET_RUN_TARGET_SELFTESTS): When CHECKING_P, wire this up to
selftest::ix86_run_selftests.
* doc/tm.texi.in (TARGET_RUN_TARGET_SELFTESTS): New.
* doc/tm.texi: Regenerate
* rtl-tests.c: Include "selftest-rtl.h".
(selftest::assert_rtl_dump_eq): Make non-static.
(ASSERT_RTL_DUMP_EQ): Move to selftest-rtl.h.
(selftest::test_dumping_regs): Update comment.
* selftest-rtl.h: New file.
* selftest-run-tests.c: Include "target.h".
(selftest::run_tests): If non-NULL, call
targetm.run_target_selftests.
* target.def (run_target_selftests): New hook.


Ok.


Bernd

Re: [PATCH, rs6000] Fold vector addition built-ins in GIMPLE

2016-11-04 Thread Bill Schmidt

Just a note that the "-std=gnu11" option in the test cases in this patch is a 
leftover
from a previous version of the patch, which I'll plan to remove.  Sorry for the
oversight!

Bill

> On Nov 1, 2016, at 9:05 PM, Bill Schmidt  wrote:
> 
> Hi,
> 
> As Jakub suggested in response to my *ahem* ornate patch for overloaded
> function built-ins, a much better approach is to use the existing
> machinery for overloading and then immediately fold the specific
> functions during gimplification.  There is a target hook available for
> this purpose that we have not previously used.  This patch demonstrates
> this functionality by implementing the target hook and folding vector
> addition built-ins within it.  Future patches will fold other such
> operations, improving the optimization available for many vector
> intrinsics.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?
> 
> Thanks,
> Bill
> 
> 
> [gcc]
> 
> 2016-11-01  Bill Schmidt  
> 
>   * config/rs6000/rs6000.c (gimple-ssa.h): New #include.
>   (TARGET_GIMPLE_FOLD_BUILTIN): Define as
>   rs6000_gimple_fold_builtin.
>   (rs6000_gimple_fold_builtin): New function.  Add handling for
>   early expansion of vector addition builtins.
> 
> 
> [gcc/testsuite]
> 
> 2016-11-01  Bill Schmidt  
> 
>   * gcc.target/powerpc/fold-vec-add-1.c: New.
>   * gcc.target/powerpc/fold-vec-add-2.c: New.
>   * gcc.target/powerpc/fold-vec-add-3.c: New.
>   * gcc.target/powerpc/fold-vec-add-4.c: New.
>   * gcc.target/powerpc/fold-vec-add-5.c: New.
>   * gcc.target/powerpc/fold-vec-add-6.c: New.
>   * gcc.target/powerpc/fold-vec-add-7.c: New.
> 
> 
> Index: gcc/config/rs6000/rs6000.c
> ===
> --- gcc/config/rs6000/rs6000.c(revision 241624)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -56,6 +56,7 @@
> #include "sched-int.h"
> #include "gimplify.h"
> #include "gimple-iterator.h"
> +#include "gimple-ssa.h"
> #include "gimple-walk.h"
> #include "intl.h"
> #include "params.h"
> @@ -1632,6 +1633,8 @@ static const struct attribute_spec rs6000_attribut
> 
> #undef TARGET_FOLD_BUILTIN
> #define TARGET_FOLD_BUILTIN rs6000_fold_builtin
> +#undef TARGET_GIMPLE_FOLD_BUILTIN
> +#define TARGET_GIMPLE_FOLD_BUILTIN rs6000_gimple_fold_builtin
> 
> #undef TARGET_EXPAND_BUILTIN
> #define TARGET_EXPAND_BUILTIN rs6000_expand_builtin
> @@ -16337,6 +16340,46 @@ rs6000_fold_builtin (tree fndecl, int n_args ATTRI
> #endif
> }
> 
> +/* Fold a machine-dependent built-in in GIMPLE.  (For folding into
> +   a constant, use rs6000_fold_builtin.)  */
> +
> +bool
> +rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  tree fndecl = gimple_call_fndecl (stmt);
> +  gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == 
> BUILT_IN_MD);
> +  enum rs6000_builtins fn_code
> += (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
> +  tree arg0, arg1, lhs;
> +
> +  switch (fn_code)
> +{
> +/* Flavors of vec_add.  We deliberately don't expand
> +   P8V_BUILTIN_VADDUQM as it gets lowered from V1TImode to
> +   TImode, resulting in much poorer code generation.  */
> +case ALTIVEC_BUILTIN_VADDUBM:
> +case ALTIVEC_BUILTIN_VADDUHM:
> +case ALTIVEC_BUILTIN_VADDUWM:
> +case P8V_BUILTIN_VADDUDM:
> +case ALTIVEC_BUILTIN_VADDFP:
> +case VSX_BUILTIN_XVADDDP:
> +  {
> + arg0 = gimple_call_arg (stmt, 0);
> + arg1 = gimple_call_arg (stmt, 1);
> + lhs = gimple_call_lhs (stmt);
> + gimple *g = gimple_build_assign (lhs, PLUS_EXPR, arg0, arg1);
> + gimple_set_location (g, gimple_location (stmt));
> + gsi_replace (gsi, g, true);
> + return true;
> +  }
> +default:
> +  break;
> +}
> +
> +  return false;
> +}
> +
> /* Expand an expression EXP that calls a built-in function,
>with result going to TARGET if that's convenient
>(and in mode MODE if that's convenient).
> Index: gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c
> ===
> --- gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/fold-vec-add-1.c (working copy)
> @@ -0,0 +1,46 @@
> +/* Verify that overloaded built-ins for vec_add with char
> +   inputs produce the right results.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-additional-options "-std=gnu11" } */
> +
> +#include 
> +
> +vector signed char
> +test1 (vector bool char x, vector signed char y)
> +{
> +  return vec_add (x, y);
> +}
> +
> +vector signed char
> +test2 (vector signed char x, vector bool char y)
> +{
> +  return vec_add (x, y);
> +}
> +
> +vector signed char
> +test3 (vector signed char x, vector signed char y)
> +{
> +  return vec_add (x, y);
> +}
> +
> +vector unsigned char
> +test4 (vector bool char x,

RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.

2016-11-04 Thread Toma Tabacu

> From: Matthew Fortune
> Sent: 03 November 2016 13:07
> To: Toma Tabacu; gcc-patches@gcc.gnu.org
> Cc: catherine_mo...@mentor.com
> Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need
> branch-likely instructions.
> 
> Toma Tabacu  writes:
> > The gcc.target/mips/wrap-delay.c test was failing on mips-img-*
> > toolchains because it was using -mbranch-likely with an R6 target, and
> > branch- likely instructions were removed in R6.
> >
> > This patch makes the testsuite downgrade to R5 if the -mbranch-likely
> > option is present and we're targeting R6.
> >
> > Tested with mips-img-elf and mips-img-linux-gnu.
> 
> Hi Toma,
> 
> Welcome to GCC development, thanks for your first patch. As you can see
> from Catherine's reply the change looks good. I'll just cover some
> housekeeping issues...
> 
> > gcc/testsuite/
> > * gcc.target/mips/mips.exp: Add check for -mbranch-likely in
> > condition for R5 downgrade.
> 
> Changelogs are an art form which will take some getting used to. This is
> almost there but needs to reference the function affected.
> 
>   * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
>   for -mbranch-likely and infer -mno-branch-likely for R6.
> 
> I have extended the comment as well as there is an additional change
> needed for this patch ideally.
> 
> > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > b/gcc/testsuite/gcc.target/mips/mips.exp
> > index 7c24140..382d69c 100644
> > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
> >|| [mips_have_test_option_p options "-mpaired-
> > single"]
> >|| [mips_have_test_option_p options "-
> > mnan=legacy"]
> >|| [mips_have_test_option_p options "-
> > mabs=legacy"]
> > -  || [mips_have_test_option_p options "!HAS_LSA"])
> > } {
> > +  || [mips_have_test_option_p options "!HAS_LSA"]
> > +  || [mips_have_test_option_p options "-mbranch-
> > likely"]) } {
> > if { $gp_size == 32 } {
> > mips_make_test_option options "-mips32r5"
> > } else {
> 
> Please can you make sure to retain the original patch formatting when
> posting. I suspect you have copied this out of a putty session or similar and
> have therefore lost the tabs.
> 
> The extra change is that in the post-arch option processing we will need to
> infer -mno-branch-likely for the $isa_rev > 5 case much like we infer -
> mnan=2008 and -mabs=2008. This is so that when running the testsuite using
> -mips32r5 or earlier, with -mbranch-likely as part of the user-supplied test
> flags, then a test which is upgraded to
> mips32r6 does not attempt to use -mbranch-likely.
> 
> Hope that wasn't too cryptic!
> 
> Thanks,
> Matthew

The updated patch below includes the improved ChangeLog comment, correct
formatting, and the post-arch enforcing of -mno-branch-likely for R6.

Regards,
Toma

gcc/testsuite/ChangeLog:

* gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
for -mbranch-likely and infer -mno-branch-likely for R6.

diff --git a/gcc/testsuite/gcc.target/mips/mips.exp 
b/gcc/testsuite/gcc.target/mips/mips.exp
index 7c24140..6b7c46f 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
   || [mips_have_test_option_p options "-mpaired-single"]
   || [mips_have_test_option_p options "-mnan=legacy"]
   || [mips_have_test_option_p options "-mabs=legacy"]
-  || [mips_have_test_option_p options "!HAS_LSA"]) } {
+  || [mips_have_test_option_p options "!HAS_LSA"])
+  || [mips_have_test_option_p options "-mbranch-likely"]) 
} {
if { $gp_size == 32 } {
mips_make_test_option options "-mips32r5"
} else {
@@ -1345,6 +1346,7 @@ proc mips-dg-options { args } {
mips_make_test_option options "-mno-paired-single"
mips_make_test_option options "-mnan=2008"
mips_make_test_option options "-mabs=2008"
+   mips_make_test_option options "-mno-branch-likely"
}
if { [regexp {^-march=(octeon|loongson)} $arch] } {
mips_make_test_option options "-mno-micromips"

[PATCH, committed] Fix PR78210

2016-11-04 Thread Bill Schmidt

Hi,

On some targets (AArch64 in particular) test gcc.dg/tree-ssa/slsr-8.c
fails because the scan of the SLSR dump is too strict.  On these targets,
a multiply may be a widening multiply, and that wasn't accounted for.
Now it is.  Committed as obvious.

Thanks,
Bill


2016-11-04  Bill Schmidt  

PR tree-optimization/78210
* gcc.dg/tree-ssa/slsr-8.c: Fix slsr scan to include the
possibility of widening multiplies.


Index: gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c  (revision 241844)
+++ gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c  (working copy)
@@ -24,5 +24,6 @@ f (int s, int *c, int *d)
initializer with a cast, so we'll keep it as is.  */
 
 /* There are 4 ' * ' instances in the decls (since "int * iftmp.0;" is
-   added), 2 parms, 3 in the code.  */
-/* { dg-final { scan-tree-dump-times " \\* " 9 "optimized" } } */
+   added), 2 parms, 3 in the code.  The second one in the code may
+   be a widening multiply (for example, on AArch64).  */
+/* { dg-final { scan-tree-dump-times " w?\\* " 9 "optimized" } } */

[PATCH v2] Add mcpu flag for Qualcomm falkor core

2016-11-04 Thread Siddhesh Poyarekar

This adds an mcpu option for the upcoming Qualcomm Falkor core.  This
is identical to the qdf24xx part that was added earlier and hence
retains the same tuning structure and continues to have the a57
pipeline for now.  The part number has also been changed and this
patch fixes this for both qdf24xx and falkor options.

Tested with aarch64 and armhf.

Siddhesh

* gcc/config/aarch64/aarch64-cores.def (qdf24xx): Update part
number.
(falkor): New core.
* gcc/config/aarch64/aarch64-tune.md: Regenerated.
* gcc/config/arm/arm-cores.def (falkor): New core.
* gcc/config/arm/arm-tables.opt: Regenerated.
* gcc/config/arm/arm-tune.md: Regenerated.
* gcc/config/arm/bpabi.h (BE8_LINK_SPEC): Add falkor support.
* gcc/config/arm/t-aprofile (MULTILIB_MATCHES): Add falkor
support.
* gcc/doc/invoke.texi (AArch64 Options/-mtune): Add falkor.
(ARM Options/-mtune): Add falkor.

---
 gcc/config/aarch64/aarch64-cores.def | 3 ++-
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/config/arm/arm-cores.def | 1 +
 gcc/config/arm/arm-tables.opt| 3 +++
 gcc/config/arm/arm-tune.md   | 2 +-
 gcc/config/arm/bpabi.h   | 2 ++
 gcc/config/arm/t-aprofile| 1 +
 gcc/doc/invoke.texi  | 9 +
 8 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index f9b7552..4b00f3f 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -54,7 +54,8 @@ AARCH64_CORE("cortex-a73",  cortexa73, cortexa57, 8A,  
AARCH64_FL_FOR_ARCH8 | AA
 AARCH64_CORE("exynos-m1",   exynosm1,  exynosm1,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, exynosm1,  0x53, 0x001)
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x51, 0x800)
+AARCH64_CORE("falkor",  falkor,cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x51, 0xC00)
+AARCH64_CORE("qdf24xx", qdf24xx,   cortexa57, 8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x51, 0xC00)
 
 /* Cavium ('C') cores. */
 AARCH64_CORE("thunderx",thunderx,  thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a1)
diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index 022b131..29afcdf 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,qdf24xx,thunderx,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,exynosm1,falkor,qdf24xx,thunderx,xgene1,vulcan,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def
index 2072e1e..1bfec9c 100644
--- a/gcc/config/arm/arm-cores.def
+++ b/gcc/config/arm/arm-cores.def
@@ -173,6 +173,7 @@ ARM_CORE("cortex-a57",  cortexa57, cortexa57,   8A, 
ARM_FSET_MAKE_CPU1 (FL_LDSCHED
 ARM_CORE("cortex-a72", cortexa72, cortexa57,   8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a57)
 ARM_CORE("cortex-a73", cortexa73, cortexa57,   8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), cortex_a73)
 ARM_CORE("exynos-m1",  exynosm1,  exynosm1,8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), exynosm1)
+ARM_CORE("falkor", falkor,cortexa57,   8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), qdf24xx)
 ARM_CORE("qdf24xx",qdf24xx,   cortexa57,   8A, ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_CRC32 | FL_FOR_ARCH8A), qdf24xx)
 ARM_CORE("xgene1",  xgene1,xgene1,  8A,ARM_FSET_MAKE_CPU1 
(FL_LDSCHED | FL_FOR_ARCH8A),xgene1)
 
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index ee9e3bb..7b15c8c 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -328,6 +328,9 @@ EnumValue
 Enum(processor_type) String(exynos-m1) Value(exynosm1)
 
 EnumValue
+Enum(processor_type) String(falkor) Value(falkor)
+
+EnumValue
 Enum(processor_type) String(qdf24xx) Value(qdf24xx)
 
 EnumValue
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 594ce9d..867da26 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -34,7 +34,7 @@
cortexm3,marvell_pj4,cortexa15cortexa7,
cortexa17cortexa7,cortexa32,cortexa35,
cortexa53,cortexa57,cortexa72,
-   cortexa7

ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)

2016-11-04 Thread Marek Polacek

This is a similar case to PR sanitizer/70342.  Here, we were generating 
expression
in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, then 
UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on.

On this testcase we were instrumention CALL_EXPR that is in fact operator<<.  I
think those always return a reference, so it cannot be NULL, so there's no
point in instrumenting those?

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-11-04  Marek Polacek  

PR sanitizer/78208
* cp-gimplify.c (cp_genericize_r): Don't instrument
CALL_EXPR_OPERATOR_SYNTAX.

* g++.dg/ubsan/null-8.C: New.

diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
index 9b9b511..f39e9d5 100644
--- gcc/cp/cp-gimplify.c
+++ gcc/cp/cp-gimplify.c
@@ -1495,7 +1495,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void 
*data)
= TREE_CODE (fn) == ADDR_EXPR
  && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL
  && DECL_CONSTRUCTOR_P (TREE_OPERAND (fn, 0));
- if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))
+ if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT)
+ && !CALL_EXPR_OPERATOR_SYNTAX (stmt))
ubsan_maybe_instrument_member_call (stmt, is_ctor);
  if ((flag_sanitize & SANITIZE_VPTR) && !is_ctor)
cp_ubsan_maybe_instrument_member_call (stmt);
diff --git gcc/testsuite/g++.dg/ubsan/null-8.C 
gcc/testsuite/g++.dg/ubsan/null-8.C
index e69de29..0600b93 100644
--- gcc/testsuite/g++.dg/ubsan/null-8.C
+++ gcc/testsuite/g++.dg/ubsan/null-8.C
@@ -0,0 +1,22 @@
+// PR sanitizer/78208
+// { dg-do compile }
+// { dg-options "-fsanitize=null" }
+
+class S
+{
+  virtual void foo () = 0;
+};
+
+struct T {
+  T &operator << (const char *s);
+};
+
+T t;
+
+void
+S::foo ()
+{
+  t << "a" << "b" << "c" << "d" << "e" << "f" << "g" << "h" << "i"
+<< "j" << "k" << "l" << "m" << "n" << "o" << "p" << "q" << "r"
+<< "s" << "t" << "u" << "v" << "w" << "z";
+}

Marek

Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)

2016-11-04 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 05:05:51PM +0100, Marek Polacek wrote:
> This is a similar case to PR sanitizer/70342.  Here, we were generating 
> expression
> in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, 
> then 
> UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on.
> 
> On this testcase we were instrumention CALL_EXPR that is in fact operator<<.  
> I
> think those always return a reference, so it cannot be NULL, so there's no
> point in instrumenting those?

How do you know what is the return type of a user defined overloaded
operator?
Even when it returns a reference, I thought the point of -fsanitize=null was
for all references to verify their addresses are non-null.

I must say I don't understand the issue, if it is the same SAVE_EXPR in both
lhs of COMPOUND_EXPR and UBSAN_NULL argument, shouldn't cp_genericize_r use
of hash table to avoid walking the same tree multiple times avoid the
exponential compile time/memory?

> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2016-11-04  Marek Polacek  
> 
>   PR sanitizer/78208
>   * cp-gimplify.c (cp_genericize_r): Don't instrument
>   CALL_EXPR_OPERATOR_SYNTAX.
> 
>   * g++.dg/ubsan/null-8.C: New.
> 
> diff --git gcc/cp/cp-gimplify.c gcc/cp/cp-gimplify.c
> index 9b9b511..f39e9d5 100644
> --- gcc/cp/cp-gimplify.c
> +++ gcc/cp/cp-gimplify.c
> @@ -1495,7 +1495,8 @@ cp_genericize_r (tree *stmt_p, int *walk_subtrees, void 
> *data)
>   = TREE_CODE (fn) == ADDR_EXPR
> && TREE_CODE (TREE_OPERAND (fn, 0)) == FUNCTION_DECL
> && DECL_CONSTRUCTOR_P (TREE_OPERAND (fn, 0));
> -   if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT))
> +   if (flag_sanitize & (SANITIZE_NULL | SANITIZE_ALIGNMENT)
> +   && !CALL_EXPR_OPERATOR_SYNTAX (stmt))
>   ubsan_maybe_instrument_member_call (stmt, is_ctor);
> if ((flag_sanitize & SANITIZE_VPTR) && !is_ctor)
>   cp_ubsan_maybe_instrument_member_call (stmt);

Jakub

RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.

2016-11-04 Thread Toma Tabacu

> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Toma Tabacu
> Sent: 04 November 2016 15:25
> To: Matthew Fortune; gcc-patches@gcc.gnu.org
> Cc: catherine_mo...@mentor.com
> Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-
> likely instructions.
> 
> > From: Matthew Fortune
> > Sent: 03 November 2016 13:07
> > To: Toma Tabacu; gcc-patches@gcc.gnu.org
> > Cc: catherine_mo...@mentor.com
> > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need
> > branch-likely instructions.
> >
> > Toma Tabacu  writes:
> > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-*
> > > toolchains because it was using -mbranch-likely with an R6 target, and
> > > branch- likely instructions were removed in R6.
> > >
> > > This patch makes the testsuite downgrade to R5 if the -mbranch-likely
> > > option is present and we're targeting R6.
> > >
> > > Tested with mips-img-elf and mips-img-linux-gnu.
> >
> > Hi Toma,
> >
> > Welcome to GCC development, thanks for your first patch. As you can see
> > from Catherine's reply the change looks good. I'll just cover some
> > housekeeping issues...
> >
> > > gcc/testsuite/
> > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely in
> > > condition for R5 downgrade.
> >
> > Changelogs are an art form which will take some getting used to. This is
> > almost there but needs to reference the function affected.
> >
> > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
> > for -mbranch-likely and infer -mno-branch-likely for R6.
> >
> > I have extended the comment as well as there is an additional change
> > needed for this patch ideally.
> >
> > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > > b/gcc/testsuite/gcc.target/mips/mips.exp
> > > index 7c24140..382d69c 100644
> > > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
> > >|| [mips_have_test_option_p options "-mpaired-
> > > single"]
> > >|| [mips_have_test_option_p options "-
> > > mnan=legacy"]
> > >|| [mips_have_test_option_p options "-
> > > mabs=legacy"]
> > > -  || [mips_have_test_option_p options "!HAS_LSA"])
> > > } {
> > > +  || [mips_have_test_option_p options "!HAS_LSA"]
> > > +  || [mips_have_test_option_p options "-mbranch-
> > > likely"]) } {
> > > if { $gp_size == 32 } {
> > > mips_make_test_option options "-mips32r5"
> > > } else {
> >
> > Please can you make sure to retain the original patch formatting when
> > posting. I suspect you have copied this out of a putty session or similar 
> > and
> > have therefore lost the tabs.
> >
> > The extra change is that in the post-arch option processing we will need to
> > infer -mno-branch-likely for the $isa_rev > 5 case much like we infer -
> > mnan=2008 and -mabs=2008. This is so that when running the testsuite using
> > -mips32r5 or earlier, with -mbranch-likely as part of the user-supplied test
> > flags, then a test which is upgraded to
> > mips32r6 does not attempt to use -mbranch-likely.
> >
> > Hope that wasn't too cryptic!
> >
> > Thanks,
> > Matthew
> 
> The updated patch below includes the improved ChangeLog comment, correct
> formatting, and the post-arch enforcing of -mno-branch-likely for R6.
> 
> Regards,
> Toma
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
>   for -mbranch-likely and infer -mno-branch-likely for R6.
> 
> diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> b/gcc/testsuite/gcc.target/mips/mips.exp
> index 7c24140..6b7c46f 100644
> --- a/gcc/testsuite/gcc.target/mips/mips.exp
> +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
>  || [mips_have_test_option_p options "-mpaired-single"]
>  || [mips_have_test_option_p options "-mnan=legacy"]
>  || [mips_have_test_option_p options "-mabs=legacy"]
> -|| [mips_have_test_option_p options "!HAS_LSA"]) } {
> +|| [mips_have_test_option_p options "!HAS_LSA"])
> +|| [mips_have_test_option_p options "-mbranch-likely"]) 
> } {
>   if { $gp_size == 32 } {
>   mips_make_test_option options "-mips32r5"
>   } else {
> @@ -1345,6 +1346,7 @@ proc mips-dg-options { args } {
>   mips_make_test_option options "-mno-paired-single"
>   mips_make_test_option options "-mnan=2008"
>   mips_make_test_option options "-mabs=2008"
> + mips_make_test_option options "-mno-branch-likely"
>   }
>   if { [regexp {^-march=(octeon|loongson)} $arch] } {
>   mips_make_test_option options "-mno-micromips"

An e

[PATCH] Add target hook to compute register pressure classes

2016-11-04 Thread Pat Haugen

While working to get -fsched-pressure profitable on PowerPC, one of the things 
I noticed causing problems was the selection of pressure classes. Understanding 
that getting the computation of pressure classes correct for all targets in the 
machine independent code is going to be extremely hard (or impossible), I 
propose the following target hook which lets the target decide the set of 
pressure classes to use. I will follow this up with an implementation of the 
hook for the rs6000 backend, but am currently benchmarking variations of it to 
see which is best.

Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?

Note that the ira.c changes are not as complicated as what it looks, I only 
added a few lines to invoke the target hook, the rest is just indentation 
changes to move the existing code into the else leg.

-Pat

2016-11-04  Pat Haugen  

* target.def (compute_pressure_classes): New target hook.
* doc/tm.texi.in: Document it.
* doc/tm.texi: Regenerate.
* ira.c (setup_pressure_classes): Call target hook if defined.
Index: gcc/target.def
===
--- gcc/target.def	(revision 241821)
+++ gcc/target.def	(working copy)
@@ -5039,6 +5039,16 @@ DEFHOOK
   machine_mode, (enum insn_code icode),
   default_cstore_mode)
 
+/* This target hook allows the backend to compute the register pressure
+   classes to use.  */
+DEFHOOK
+(compute_pressure_classes,
+ "A target hook which lets a backend compute the set of pressure classes to\
+  be used by those optimization passes which take register pressure into\
+  account, as opposed to letting IRA compute them.  It returns the number of\
+  register classes stored in the array @var{pressure_classes}.",
+ int, (enum reg_class *pressure_classes), NULL)
+
 /* True if a structure, union or array with MODE containing FIELD should
be accessed using BLKmode.  */
 DEFHOOK
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in	(revision 241821)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -2509,6 +2509,8 @@ value that the middle-end intended.
 
 @hook TARGET_CSTORE_MODE
 
+@hook TARGET_COMPUTE_PRESSURE_CLASSES
+
 @node Stack and Calling
 @section Stack Layout and Calling Conventions
 @cindex calling conventions
Index: gcc/doc/tm.texi
===
--- gcc/doc/tm.texi	(revision 241821)
+++ gcc/doc/tm.texi	(working copy)
@@ -2903,6 +2903,10 @@ This hook defines a class of registers w
 This hook defines the machine mode to use for the boolean result of  conditional store patterns.  The ICODE argument is the instruction code  for the cstore being performed.  Not definiting this hook is the same  as accepting the mode encoded into operand 0 of the cstore expander  patterns.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_COMPUTE_PRESSURE_CLASSES (enum reg_class *@var{pressure_classes})
+A target hook which lets a backend compute the set of pressure classes to  be used by those optimization passes which take register pressure into  account, as opposed to letting IRA compute them.  It returns the number of  register classes stored in the array @var{pressure_classes}.
+@end deftypefn
+
 @node Stack and Calling
 @section Stack Layout and Calling Conventions
 @cindex calling conventions
Index: gcc/ira.c
===
--- gcc/ira.c	(revision 241821)
+++ gcc/ira.c	(working copy)
@@ -792,78 +792,85 @@ setup_pressure_classes (void)
   HARD_REG_SET temp_hard_regset2;
   bool insert_p;
 
-  n = 0;
-  for (cl = 0; cl < N_REG_CLASSES; cl++)
-{
-  if (ira_class_hard_regs_num[cl] == 0)
-	continue;
-  if (ira_class_hard_regs_num[cl] != 1
-	  /* A register class without subclasses may contain a few
-	 hard registers and movement between them is costly
-	 (e.g. SPARC FPCC registers).  We still should consider it
-	 as a candidate for a pressure class.  */
-	  && alloc_reg_class_subclasses[cl][0] < cl)
+  if (targetm.compute_pressure_classes)
+n = targetm.compute_pressure_classes (pressure_classes);
+  else
+{ 
+  n = 0;
+  for (cl = 0; cl < N_REG_CLASSES; cl++)
 	{
-	  /* Check that the moves between any hard registers of the
-	 current class are not more expensive for a legal mode
-	 than load/store of the hard registers of the current
-	 class.  Such class is a potential candidate to be a
-	 register pressure class.  */
-	  for (m = 0; m < NUM_MACHINE_MODES; m++)
+	  if (ira_class_hard_regs_num[cl] == 0)
+	continue;
+	  if (ira_class_hard_regs_num[cl] != 1
+	  /* A register class without subclasses may contain a few
+		 hard registers and movement between them is costly
+		 (e.g. SPARC FPCC registers).  We still should consider it
+		 as a candidate for a pressure class.  */
+	  && alloc_reg_class_subclasses[cl][0] < cl)
 	{
-	  COPY_HAR

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Bernd Schmidt


On 11/04/2016 01:26 PM, Jakub Jelinek wrote:


In any case, the second hunk fixes an important DSE bug that just got
revealed by the former change.


Can you elaborate on that (what situation does it occur in, is it 
covered by the testsuite)?



Bernd

Re: [PATCH 1/2, i386] cmpstrnsi needs string length

2016-11-04 Thread Aaron Sawdey

ChangeLog for this patch:

2016-11-03  Aaron Sawdey  

* config/i386/i386.md (cmpstrnsi): New test to bail out if neither
string input is a string constant.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Re: [PATCH 2/2, expand] make expand_builtin_strncmp more general

2016-11-04 Thread Aaron Sawdey

ChangeLog for this patch:

2016-11-03  Aaron Sawdey  

* builtins.c (expand_builtin_strncmp): Attempt expansion of strncmp
via cmpstrnsi even if neither string is constant.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly

2016-11-04 Thread Jerry DeLisle


On 11/04/2016 07:11 AM, Christophe Lyon wrote:

On 4 November 2016 at 14:48, Christophe Lyon  wrote:

On 30 October 2016 at 21:40, Thomas Koenig  wrote:

Hi Jerry,


--- snip ---


The new test fails at execution on arm and aarch64:

  gfortran.dg/fmt_t_9.f   -O0  execution test
  gfortran.dg/fmt_t_9.f   -O1  execution test
  gfortran.dg/fmt_t_9.f   -O2  execution test
  gfortran.dg/fmt_t_9.f   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  execution test
  gfortran.dg/fmt_t_9.f   -O3 -g  execution test
  gfortran.dg/fmt_t_9.f   -Os  execution test

seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to
larger granularity in validations.


Hi Christophe,

The patch is very non target specific. All I can think at the moment is its 
related to buffering and flushing details, or an error in the test case.


When I get a chance, I would like to send you a test program to maybe help debug 
this.  Or, is there a machine available I cant ssh into and work on it there?


Jerry

[PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 2)

2016-11-04 Thread Jakub Jelinek

Hi!

On Wed, Nov 02, 2016 at 10:46:40AM +0100, Richard Biener wrote:
> Yeah, plus if a followup test would have disambiguated things (the
> dispatch to the tree oracle for example).

After discussing this on IRC that the dse.c and sched-deps.c (call (mem ...) )
changes are probably not safe, I'm proposing following patch which should be
safe to backport to release branches too, the only occurrences of
!DECL_RTL_SET_P where DECL_RTL worked fine in my bootstraps/regtests were
FUNCTION_DECLs, so the patch should only turn ICEs into returning the safe
return value that the expressions might overlap.

If/once this is in, I'm planning to test/submit a patch adding
  /* If one decl is known to be a function or label in a function and
 the other is some kind of data, they can't overlap.  */
  if ((TREE_CODE (exprx) == FUNCTION_DECL
   || TREE_CODE (exprx) == LABEL_DECL)
  != (TREE_CODE (expry) == FUNCTION_DECL
  || TREE_CODE (expry) == LABEL_DECL))
return 1;
before that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-04  Jakub Jelinek  

PR target/77834
* alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry
doesn't have rtl set.

* gcc.dg/pr77834.c: New test.

--- gcc/alias.c.jj  2016-10-21 17:06:27.0 +0200
+++ gcc/alias.c 2016-10-31 11:38:29.448031590 +0100
@@ -2755,6 +2755,13 @@ nonoverlapping_memrefs_p (const_rtx x, c
   || TREE_CODE (expry) == CONST_DECL)
 return 1;

+  /* If either of the decls doesn't have DECL_RTL set (e.g. marked as
+ living in multiple places), we can't tell anything.  Exception
+ are FUNCTION_DECLs for which we can create DECL_RTL on demand.  */
+  if ((!DECL_RTL_SET_P (exprx) && TREE_CODE (exprx) != FUNCTION_DECL)
+  || (!DECL_RTL_SET_P (expry) && TREE_CODE (expry) != FUNCTION_DECL))
+return 0;
+
   rtlx = DECL_RTL (exprx);
   rtly = DECL_RTL (expry);

--- gcc/testsuite/gcc.dg/pr77834.c.jj   2016-10-31 11:41:46.290521464 +0100
+++ gcc/testsuite/gcc.dg/pr77834.c  2016-10-31 11:41:24.0 +0100
@@ -0,0 +1,18 @@
+/* PR target/77834 */
+/* { dg-do compile } */
+/* { dg-options "-O -ftree-pre -Wno-psabi" } */
+/* { dg-additional-options "-mstringop-strategy=libcall" { target i?86-*-* 
x86_64-*-* } } */
+
+typedef int V __attribute__ ((vector_size (64)));
+
+V
+foo (V u, V v, int w)
+{
+  do
+{
+  if (u[0]) v ^= u[w];
+}
+  while ((V) { 0, u[w] }[1]);
+  u = (V) { v[v[0]], u[u[0]] };
+  return v + u;
+}

Jakub

Re: [PATCH] Fix DSE not to consider calls as reads from function's body (PR target/77834)

2016-11-04 Thread Jakub Jelinek

On Fri, Nov 04, 2016 at 05:18:13PM +0100, Bernd Schmidt wrote:
> On 11/04/2016 01:26 PM, Jakub Jelinek wrote:
> >
> >In any case, the second hunk fixes an important DSE bug that just got
> >revealed by the former change.
> 
> Can you elaborate on that (what situation does it occur in, is it covered by
> the testsuite)?

As discussed earlier, I'm proposing to leave the first hunk out and have
bootstrapped/regtested the following on x86_64-linux and i686-linux, ok for
trunk?

I don't have a testcase except for
FAIL: gcc.dg/torture/pr67690.c   -Os  execution test
that fails on i686-linux (or x86_64-linux with -m32) with the other hunk,
the details are mentioned in PR77834 comments.  We don't remove the store of
an argument to sibling constant call during the local DSE phase, and
the const sibling call is marked insn_info->frame_read, because of the
need of the arguments.  During the propagation it is handled right, but then
when doing the last step dse_step5 wouldn't call scan_reads on the sibcall,
because it didn't have non-NULL insn_info->rec nor
insn_info->non_frame_wild_read (the latter is used just for non-const calls
that may read anything).  scan_reads does pretty much:
  read_info_t read_info = insn_info->read_rec;
  if (insn_info->non_frame_wild_read)
do_something;
  if (insn_info->frame_read)
do_something2;
  while (read_info)
do_something3;
so not calling it for the insn_info->frame_read case is clearly wrong.

2016-11-04  Jakub Jelinek  

PR target/77834
* dse.c (dse_step5): Call scan_reads even if just
insn_info->frame_read.  Improve and fix dump file messages.

--- gcc/dse.c.jj2016-11-03 15:52:12.521965058 +0100
+++ gcc/dse.c   2016-11-04 09:42:27.640098125 +0100
@@ -3298,12 +3298,19 @@ dse_step5 (void)
  bitmap_clear (v);
}
  else if (insn_info->read_rec
-   || insn_info->non_frame_wild_read)
+  || insn_info->non_frame_wild_read
+  || insn_info->frame_read)
{
- if (dump_file && !insn_info->non_frame_wild_read)
-   fprintf (dump_file, "regular read\n");
-  else if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "non-frame wild read\n");
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ if (!insn_info->non_frame_wild_read
+ && !insn_info->frame_read)
+   fprintf (dump_file, "regular read\n");
+ if (insn_info->non_frame_wild_read)
+   fprintf (dump_file, "non-frame wild read\n");
+ if (insn_info->frame_read)
+   fprintf (dump_file, "frame read\n");
+   }
  scan_reads (insn_info, v, NULL);
}
}

Jakub

Re: [patch, libgfortran] Bug 78123 - Short reads with T edit descriptor not padding correctly

2016-11-04 Thread Christophe Lyon

On 4 November 2016 at 17:28, Jerry DeLisle  wrote:
> On 11/04/2016 07:11 AM, Christophe Lyon wrote:
>>
>> On 4 November 2016 at 14:48, Christophe Lyon 
>> wrote:
>>>
>>> On 30 October 2016 at 21:40, Thomas Koenig  wrote:

 Hi Jerry,

> --- snip ---
>>>
>>>
>>> The new test fails at execution on arm and aarch64:
>>>
>>>   gfortran.dg/fmt_t_9.f   -O0  execution test
>>>   gfortran.dg/fmt_t_9.f   -O1  execution test
>>>   gfortran.dg/fmt_t_9.f   -O2  execution test
>>>   gfortran.dg/fmt_t_9.f   -O3 -fomit-frame-pointer -funroll-loops
>>> -fpeel-loops -ftracer -finline-functions  execution test
>>>   gfortran.dg/fmt_t_9.f   -O3 -g  execution test
>>>   gfortran.dg/fmt_t_9.f   -Os  execution test
>>>
>>> seen on gcc-5 and gcc-6 branches. Maybe I missed it on trunk due to
>>> larger granularity in validations.
>
>
> Hi Christophe,
>
> The patch is very non target specific. All I can think at the moment is its
> related to buffering and flushing details, or an error in the test case.
>
> When I get a chance, I would like to send you a test program to maybe help
> debug this.  Or, is there a machine available I cant ssh into and work on it
> there?
>

I am building cross-compilers, testing with qemu.

However, there are aarch64 machines available in the GCC farm, maybe
you have access to them?

Thanks,

Christophe

> Jerry
>

RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.

2016-11-04 Thread Matthew Fortune

Toma Tabacu  writes:
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Toma Tabacu
> > Sent: 04 November 2016 15:25
> > To: Matthew Fortune; gcc-patches@gcc.gnu.org
> > Cc: catherine_mo...@mentor.com
> > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need
> > branch- likely instructions.
> >
> > > From: Matthew Fortune
> > > Sent: 03 November 2016 13:07
> > > To: Toma Tabacu; gcc-patches@gcc.gnu.org
> > > Cc: catherine_mo...@mentor.com
> > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests
> > > need branch-likely instructions.
> > >
> > > Toma Tabacu  writes:
> > > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-*
> > > > toolchains because it was using -mbranch-likely with an R6 target,
> > > > and
> > > > branch- likely instructions were removed in R6.
> > > >
> > > > This patch makes the testsuite downgrade to R5 if the
> > > > -mbranch-likely option is present and we're targeting R6.
> > > >
> > > > Tested with mips-img-elf and mips-img-linux-gnu.
> > >
> > > Hi Toma,
> > >
> > > Welcome to GCC development, thanks for your first patch. As you can
> > > see from Catherine's reply the change looks good. I'll just cover
> > > some housekeeping issues...
> > >
> > > > gcc/testsuite/
> > > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely
> in
> > > > condition for R5 downgrade.
> > >
> > > Changelogs are an art form which will take some getting used to.
> > > This is almost there but needs to reference the function affected.
> > >
> > >   * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
> > >   for -mbranch-likely and infer -mno-branch-likely for R6.
> > >
> > > I have extended the comment as well as there is an additional change
> > > needed for this patch ideally.
> > >
> > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > > > b/gcc/testsuite/gcc.target/mips/mips.exp
> > > > index 7c24140..382d69c 100644
> > > > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
> > > >|| [mips_have_test_option_p options
> > > > "-mpaired- single"]
> > > >|| [mips_have_test_option_p options "-
> > > > mnan=legacy"]
> > > >|| [mips_have_test_option_p options "-
> > > > mabs=legacy"]
> > > > -  || [mips_have_test_option_p options
> "!HAS_LSA"])
> > > > } {
> > > > +  || [mips_have_test_option_p options
> "!HAS_LSA"]
> > > > +  || [mips_have_test_option_p options
> > > > + "-mbranch-
> > > > likely"]) } {
> > > > if { $gp_size == 32 } {
> > > > mips_make_test_option options "-mips32r5"
> > > > } else {
> > >
> > > Please can you make sure to retain the original patch formatting
> > > when posting. I suspect you have copied this out of a putty session
> > > or similar and have therefore lost the tabs.
> > >
> > > The extra change is that in the post-arch option processing we will
> > > need to infer -mno-branch-likely for the $isa_rev > 5 case much like
> > > we infer -
> > > mnan=2008 and -mabs=2008. This is so that when running the testsuite
> > > using
> > > -mips32r5 or earlier, with -mbranch-likely as part of the
> > > user-supplied test flags, then a test which is upgraded to
> > > mips32r6 does not attempt to use -mbranch-likely.
> > >
> > > Hope that wasn't too cryptic!
> > >
> > > Thanks,
> > > Matthew
> >
> > The updated patch below includes the improved ChangeLog comment,
> > correct formatting, and the post-arch enforcing of -mno-branch-likely
> for R6.

Thanks, committed as r241850.

Matthew

[PATCH] Remove a FIXME from verify_type_variant

2016-11-04 Thread Martin Jambor

Hi,

now that we removed java, let us check whether the FIXME this removes
(and that was added by Honza) is actually true.  It only affects
checking runs and if it turns out to be false, we can easily revert
this.

Bootstrapped, LTO-bootstrapped and tested on x86_64-linux and
aarch64-linux, OK for trunk?

Thanks,

Martin



2016-11-03  Martin Jambor  

* tree.c (verify_type_variant): Use pointer comparison to check that
TYPE_SIZE_UNIT match.
---
 gcc/tree.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/tree.c b/gcc/tree.c
index 56cc653..10fc70d 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13272,12 +13272,10 @@ verify_type_variant (const_tree t, tree tv)
verify_variant_match (TYPE_SIZE);
   if (TREE_CODE (TYPE_SIZE_UNIT (t)) != PLACEHOLDER_EXPR
  && TREE_CODE (TYPE_SIZE_UNIT (tv)) != PLACEHOLDER_EXPR
- && TYPE_SIZE_UNIT (t) != TYPE_SIZE_UNIT (tv)
- /* FIXME: ideally we should compare pointer equality, but java FE
-produce variants where size is INTEGER_CST of different type (int
-wrt size_type) during libjava biuld.  */
- && !operand_equal_p (TYPE_SIZE_UNIT (t), TYPE_SIZE_UNIT (tv), 0))
+ && TYPE_SIZE_UNIT (t) != TYPE_SIZE_UNIT (tv))
{
+ gcc_assert (!operand_equal_p (TYPE_SIZE_UNIT (t),
+   TYPE_SIZE_UNIT (tv), 0));
  error ("type variant has different TYPE_SIZE_UNIT");
  debug_tree (tv);
  error ("type variant's TYPE_SIZE_UNIT");
-- 
2.10.1

[PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-04 Thread Thomas Preudhomme


Hi there,

This patch document the newly added support in GCC 7 for Cortex-M23 and 
Cortex-M33 processors [1][2].


[1] http://www.arm.com/products/processors/cortex-m/cortex-m23-processor.php
[2] http://www.arm.com/products/processors/cortex-m/cortex-m33-processor.php

Is this ok for ?

Best regards,

Thomas
Index: htdocs/gcc-7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
retrieving revision 1.20
diff -u -r1.20 changes.html
--- htdocs/gcc-7/changes.html	25 Oct 2016 02:04:00 -	1.20
+++ htdocs/gcc-7/changes.html	26 Oct 2016 14:05:59 -
@@ -316,6 +316,16 @@
-mcpu=cortex-a73 and -mtune=cortex-a73
options as well as the equivalent target attributes and pragmas.
  
+ 
+   The ARM Cortex-M23 processor is now supported via the
+   -mcpu=cortex-m23 and -mtune=cortex-m23
+   options.
+ 
+ 
+   The ARM Cortex-M33 processor is now supported via the
+   -mcpu=cortex-m33 and -mtune=cortex-m33
+   options.
+ 

 
 AVR

Re: [PATCH, gcc/ARM, ping] Add support for Cortex-M33

2016-11-04 Thread Thomas Preudhomme




On 02/11/16 10:14, Kyrill Tkachov wrote:


On 02/11/16 10:07, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 26/10/16 17:42, Thomas Preudhomme wrote:

Hi,

This patch adds support for the Cortex-M33 processor launched by ARM [1]. The
patch adds support for the name and wires it up to the ARMv8-M Mainline with DSP
extensions architecture and arm_v7m_tune tuning parameters for the time being.
It also updates documentation to mention this new processor.

[1] http://www.arm.com/products/processors/cortex-m/cortex-m33-processor.php

ChangeLog entry is as follows:

*** gcc/Changelog ***

2016-10-26  Thomas Preud'homme 

* config/arm/arm-arches.def (armv8-m.main+dsp): Set Cortex-M33 as
representative core for this architecture.
* config/arm/arm-cores.def (cortex-m33): Define new processor.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Likewise.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add Cortex-M33 to the list of
valid -mcpu options.
* doc/invoke.texi (ARM Options): Document new Cortex-M33 processor.


Tested by building libgcc and libstdc++ for Cortex-M33 and running a hello world
compiled for it.

Is this ok for master?



Ok.
Thanks,
Kyrill


Committed, thanks.

Best regards,

Thomas

Re: [PATCH, gcc/ARM, ping] Add support for Cortex-M23

2016-11-04 Thread Thomas Preudhomme


On 02/11/16 10:13, Kyrill Tkachov wrote:


On 02/11/16 10:07, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On 26/10/16 17:42, Thomas Preudhomme wrote:

Hi,

This patch adds support for the Cortex-M23 processor launched by ARM [1]. The
patch adds support for the name and wires it up to the ARMv8-M Baseline
architecture and arm_v6m_tune tuning parameters for the time being. It also
updates documentation to mention this new processor.

[1] http://www.arm.com/products/processors/cortex-m/cortex-m23-processor.php

ChangeLog entry is as follows:

*** gcc/Changelog ***

2016-10-26  Thomas Preud'homme 

* config/arm/arm-arches.def (armv8-m.base): Set Cortex-M23 as
representative core for this architecture.
* config/arm/arm-cores.def (cortex-m23): Define new processor.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Likewise.
* config/arm/arm.c (arm_v6m_tune): Add Cortex-M23 to the list of cores
this tuning parameters apply to in the comment.
* config/arm/bpabi.h (BE8_LINK_SPEC): Add Cortex-M23 to the list of
valid -mcpu options.
* doc/invoke.texi (ARM Options): Document new Cortex-M23 processor.


Tested by building libgcc and libstdc++ for Cortex-M23 and running a hello world
compiled for it.

Is this ok for master?



Ok.
Thanks,
Kyrill



Committed, thanks.

Best regards,

Thomas

[PATCH] rtx_writer: avoid printing trailing nils

2016-11-04 Thread David Malcolm

On Fri, 2016-10-21 at 12:04 +0200, Bernd Schmidt wrote:
> On 10/21/2016 02:36 AM, David Malcolm wrote:
> > +  ASSERT_RTL_DUMP_EQ ("(cjump_insn (set (pc)\n"
> > +   "(label_ref 0))\n"
> > +  " (nil))\n",
> > +  jump_insn);
> >  }
> 
> I do wonder about the (nil)s and whether we can eliminate them.

The following patch does so, by adding a flag param to
rtx_writer::print_rtx allowing omitting the (nil) when in compact mode.

Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/ChangeLog:
* print-rtl.c (rtx_writer::print_rtx_operand_code_0): Don't print
trailing "(nil)" for NOTE_VAR_LOCATION and for ENTRY_VALUE_EXP.
(rtx_writer::print_rtx_operand_code_e): Likewise for
CALL_INSN_FUNCTION_USAGE and for REG_NOTES.
(rtx_writer::print_rtx_operand_codes_E_and_V): Always print nils
within a vec.
(rtx_writer::print_rtx): Add param "print_nil" to support not
printing some trailing "(nil)" values when in compact mode.
Don't print them for PAT_VAR_LOCATION_LOC.
(print_inline_rtx): Update for new param.
(debug_rtx): Likewise.
(rtx_writer::print_rtl): Likewise.
(rtx_writer::print_rtl_single_with_indent): Likewise.
* print-rtl.h (rtx_writer::print_rtx): Add "print_nil" param.
* rtl-tests.c (selftests::test_uncond_jump): Remove trailing
"(nil)" from expected output.
---
 gcc/print-rtl.c | 44 +---
 gcc/print-rtl.h |  2 +-
 gcc/rtl-tests.c |  2 +-
 3 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index 341ecdf..643eb57 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -158,7 +158,7 @@ rtx_writer::print_rtx_operand_code_0 (const_rtx in_rtx 
ATTRIBUTE_UNUSED,
case NOTE_INSN_VAR_LOCATION:
case NOTE_INSN_CALL_ARG_LOCATION:
  fputc (' ', m_outfile);
- print_rtx (NOTE_VAR_LOCATION (in_rtx));
+ print_rtx (NOTE_VAR_LOCATION (in_rtx), false);
  break;
 
case NOTE_INSN_CFI:
@@ -201,7 +201,7 @@ rtx_writer::print_rtx_operand_code_0 (const_rtx in_rtx 
ATTRIBUTE_UNUSED,
   m_indent += 2;
   if (!m_sawclose)
fprintf (m_outfile, " ");
-  print_rtx (ENTRY_VALUE_EXP (in_rtx));
+  print_rtx (ENTRY_VALUE_EXP (in_rtx), false);
   m_indent -= 2;
 }
 #endif
@@ -224,11 +224,17 @@ rtx_writer::print_rtx_operand_code_e (const_rtx in_rtx, 
int idx)
   if (idx == 7 && CALL_P (in_rtx))
 {
   m_in_call_function_usage = true;
-  print_rtx (XEXP (in_rtx, idx));
+  print_rtx (XEXP (in_rtx, idx), false);
   m_in_call_function_usage = false;
 }
   else
-print_rtx (XEXP (in_rtx, idx));
+{
+  bool print_nil = true;
+  /* Avoid printing trailing "(nil)" for REG_NOTES.  */
+  if (INSN_CHAIN_CODE_P (GET_CODE (in_rtx)) && idx == 6)
+   print_nil = false;
+  print_rtx (XEXP (in_rtx, idx), print_nil);
+}
   m_indent -= 2;
 }
 
@@ -252,7 +258,7 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx 
in_rtx, int idx)
m_sawclose = 1;
 
   for (int j = 0; j < XVECLEN (in_rtx, idx); j++)
-   print_rtx (XVECEXP (in_rtx, idx, j));
+   print_rtx (XVECEXP (in_rtx, idx, j), true);
 
   m_indent -= 2;
 }
@@ -564,10 +570,15 @@ rtx_writer::print_rtx_operand (const_rtx in_rtx, int idx)
 }
 }
 
-/* Print IN_RTX onto m_outfile.  This is the recursive part of printing.  */
+/* Print IN_RTX onto m_outfile.  This is the recursive part of printing.
+   PRINT_NIL controls the printing of "(nil)" in compact mode.
+   In compact mode, if IN_RTX is NULL, then "(nil)" is printed if PRINT_NIL
+   is true, and nothing is printed if PRINT_NIL is false.
+   In non-compact mode, "(nil)" is always printed for NULL, and PRINT_NIL
+   is ignored.  */
 
 void
-rtx_writer::print_rtx (const_rtx in_rtx)
+rtx_writer::print_rtx (const_rtx in_rtx, bool print_nil)
 {
   int idx = 0;
 
@@ -582,8 +593,11 @@ rtx_writer::print_rtx (const_rtx in_rtx)
 
   if (in_rtx == 0)
 {
-  fputs ("(nil)", m_outfile);
-  m_sawclose = 1;
+  if (print_nil || !m_compact)
+   {
+ fputs ("(nil)", m_outfile);
+ m_sawclose = 1;
+   }
   return;
 }
   else if (GET_CODE (in_rtx) > NUM_RTX_CODE)
@@ -657,7 +671,7 @@ rtx_writer::print_rtx (const_rtx in_rtx)
  else
print_mem_expr (m_outfile, PAT_VAR_LOCATION_DECL (in_rtx));
  fputc (' ', m_outfile);
- print_rtx (PAT_VAR_LOCATION_LOC (in_rtx));
+ print_rtx (PAT_VAR_LOCATION_LOC (in_rtx), false);
  if (PAT_VAR_LOCATION_STATUS (in_rtx)
  == VAR_INIT_STATUS_UNINITIALIZED)
fprintf (m_outfile, " [uninit]");
@@ -765,7 +779,7 @@ void
 print_inline_rtx (FILE *outf, const_rtx x, int ind)
 {
   rtx_writer w (outf, ind, false, false);
-

Re: ubsan PATCH to fix compile-time hog with operator overloading (PR sanitizer/78208)

2016-11-04 Thread Jason Merrill

On Fri, Nov 4, 2016 at 12:16 PM, Jakub Jelinek  wrote:
> On Fri, Nov 04, 2016 at 05:05:51PM +0100, Marek Polacek wrote:
>> This is a similar case to PR sanitizer/70342.  Here, we were generating 
>> expression
>> in a quadratic fashion because of the initializer--we create SAVE_EXPR <>, 
>> then
>> UBSAN_NULL >, and then COMPOUND_EXPR of these two and so on.
>>
>> On this testcase we were instrumention CALL_EXPR that is in fact operator<<. 
>>  I
>> think those always return a reference, so it cannot be NULL, so there's no
>> point in instrumenting those?
>
> How do you know what is the return type of a user defined overloaded
> operator?

Indeed, there are no constraints on the return type of overloaded operator<<.

Jason

RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.

2016-11-04 Thread Toma Tabacu

> From: Matthew Fortune
> Sent: 04 November 2016 16:49
> To: Toma Tabacu; gcc-patches@gcc.gnu.org
> Cc: catherine_mo...@mentor.com
> Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-
> likely instructions.
> 
> Toma Tabacu  writes:
> > > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > > ow...@gcc.gnu.org] On Behalf Of Toma Tabacu
> > > Sent: 04 November 2016 15:25
> > > To: Matthew Fortune; gcc-patches@gcc.gnu.org
> > > Cc: catherine_mo...@mentor.com
> > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need
> > > branch- likely instructions.
> > >
> > > > From: Matthew Fortune
> > > > Sent: 03 November 2016 13:07
> > > > To: Toma Tabacu; gcc-patches@gcc.gnu.org
> > > > Cc: catherine_mo...@mentor.com
> > > > Subject: RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests
> > > > need branch-likely instructions.
> > > >
> > > > Toma Tabacu  writes:
> > > > > The gcc.target/mips/wrap-delay.c test was failing on mips-img-*
> > > > > toolchains because it was using -mbranch-likely with an R6 target,
> > > > > and
> > > > > branch- likely instructions were removed in R6.
> > > > >
> > > > > This patch makes the testsuite downgrade to R5 if the
> > > > > -mbranch-likely option is present and we're targeting R6.
> > > > >
> > > > > Tested with mips-img-elf and mips-img-linux-gnu.
> > > >
> > > > Hi Toma,
> > > >
> > > > Welcome to GCC development, thanks for your first patch. As you can
> > > > see from Catherine's reply the change looks good. I'll just cover
> > > > some housekeeping issues...
> > > >
> > > > > gcc/testsuite/
> > > > > * gcc.target/mips/mips.exp: Add check for -mbranch-likely
> > in
> > > > > condition for R5 downgrade.
> > > >
> > > > Changelogs are an art form which will take some getting used to.
> > > > This is almost there but needs to reference the function affected.
> > > >
> > > > * gcc.target/mips/mips.exp (mips-dg-options): Downgrade to R5
> > > > for -mbranch-likely and infer -mno-branch-likely for R6.
> > > >
> > > > I have extended the comment as well as there is an additional change
> > > > needed for this patch ideally.
> > > >
> > > > > diff --git a/gcc/testsuite/gcc.target/mips/mips.exp
> > > > > b/gcc/testsuite/gcc.target/mips/mips.exp
> > > > > index 7c24140..382d69c 100644
> > > > > --- a/gcc/testsuite/gcc.target/mips/mips.exp
> > > > > +++ b/gcc/testsuite/gcc.target/mips/mips.exp
> > > > > @@ -1176,7 +1176,8 @@ proc mips-dg-options { args } {
> > > > >|| [mips_have_test_option_p options
> > > > > "-mpaired- single"]
> > > > >|| [mips_have_test_option_p options "-
> > > > > mnan=legacy"]
> > > > >|| [mips_have_test_option_p options "-
> > > > > mabs=legacy"]
> > > > > -  || [mips_have_test_option_p options
> > "!HAS_LSA"])
> > > > > } {
> > > > > +  || [mips_have_test_option_p options
> > "!HAS_LSA"]
> > > > > +  || [mips_have_test_option_p options
> > > > > + "-mbranch-
> > > > > likely"]) } {
> > > > > if { $gp_size == 32 } {
> > > > > mips_make_test_option options "-mips32r5"
> > > > > } else {
> > > >
> > > > Please can you make sure to retain the original patch formatting
> > > > when posting. I suspect you have copied this out of a putty session
> > > > or similar and have therefore lost the tabs.
> > > >
> > > > The extra change is that in the post-arch option processing we will
> > > > need to infer -mno-branch-likely for the $isa_rev > 5 case much like
> > > > we infer -
> > > > mnan=2008 and -mabs=2008. This is so that when running the testsuite
> > > > using
> > > > -mips32r5 or earlier, with -mbranch-likely as part of the
> > > > user-supplied test flags, then a test which is upgraded to
> > > > mips32r6 does not attempt to use -mbranch-likely.
> > > >
> > > > Hope that wasn't too cryptic!
> > > >
> > > > Thanks,
> > > > Matthew
> > >
> > > The updated patch below includes the improved ChangeLog comment,
> > > correct formatting, and the post-arch enforcing of -mno-branch-likely
> > for R6.
> 
> Thanks, committed as r241850.
> 
> Matthew

I've noticed that there is a typo in my surname in the ChangeLog entry (in the
name and in the email address).

Regards,
Toma Tabacu

RE: [PATCH,testsuite] MIPS: Downgrade R6 to R5 if tests need branch-likely instructions.

2016-11-04 Thread Matthew Fortune

Toma Tabacu  writes:
> I've noticed that there is a typo in my surname in the ChangeLog entry
> (in the name and in the email address).

Apologies, corrected.

Matthew

Re: [PATCH] rtx_writer: avoid printing trailing nils

2016-11-04 Thread Bernd Schmidt


On 11/04/2016 06:32 PM, David Malcolm wrote:

   ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n"
  "(label_ref 0))\n"
- " (nil))\n",
+ " )\n",
  jump_insn);


This looks like the patch doesn't go the whole way - we'd want the space 
savings from avoiding the newline.


You could consider giving the print_nil thing a default argument.


Bernd

Privatize copy_blkmode_from_reg

2016-11-04 Thread Eric Botcazou

Tested on x86-64/Linux, applied on the mainline as obvious.


2016-11-04  Eric Botcazou  

* expr.h (copy_blkmode_from_reg): Delete.
* expr.c (copy_blkmode_from_reg): Make static.

-- 
Eric BotcazouIndex: expr.c
===
--- expr.c	(revision 241808)
+++ expr.c	(working copy)
@@ -2591,7 +2591,7 @@ maybe_emit_group_store (rtx x, tree type
 
This is used on targets that return BLKmode values in registers.  */
 
-void
+static void
 copy_blkmode_from_reg (rtx target, rtx srcreg, tree type)
 {
   unsigned HOST_WIDE_INT bytes = int_size_in_bytes (type);
Index: expr.h
===
--- expr.h	(revision 241808)
+++ expr.h	(working copy)
@@ -146,9 +146,6 @@ extern void emit_group_store (rtx, rtx,
 
 extern rtx maybe_emit_group_store (rtx, tree);
 
-/* Copy BLKmode object from a set of registers.  */
-extern void copy_blkmode_from_reg (rtx, rtx, tree);
-
 /* Mark REG as holding a parameter for the next CALL_INSN.
Mode is TYPE_MODE of the non-promoted parameter, or VOIDmode.  */
 extern void use_reg_mode (rtx *, rtx, machine_mode);

Re: [PATCH, rs6000] Fold vector addition built-ins in GIMPLE

2016-11-04 Thread Segher Boessenkool

Hi Bill,

On Tue, Nov 01, 2016 at 09:05:03PM -0500, Bill Schmidt wrote:
> As Jakub suggested in response to my *ahem* ornate patch for overloaded
> function built-ins, a much better approach is to use the existing
> machinery for overloading and then immediately fold the specific
> functions during gimplification.  There is a target hook available for
> this purpose that we have not previously used.  This patch demonstrates
> this functionality by implementing the target hook and folding vector
> addition built-ins within it.  Future patches will fold other such
> operations, improving the optimization available for many vector
> intrinsics.
> 
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?

This looks fine, please apply.

Thanks,


Segher

Re: [PATCH] rtx_writer: avoid printing trailing nils

2016-11-04 Thread Bernd Schmidt

Here's something simpler. Only very lightly tested, but isn't something 
like this all that's needed? Could decide whether to apply it to 
expr_list etc. as well.


Index: print-rtl.c
===
--- print-rtl.c (revision 241233)
+++ print-rtl.c (working copy)
@@ -697,7 +697,12 @@ print_rtx (const_rtx in_rtx)

   /* Get the format string and skip the first elements if we have handled
  them already.  */
-  for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++)
+  int limit = GET_RTX_LENGTH (GET_CODE (in_rtx));
+  if (flag_compact
+  && INSN_CHAIN_CODE_P (GET_CODE (in_rtx))
+  && XEXP (in_rtx, limit - 1) == NULL_RTX)
+limit--;
+  for (; idx < limit; idx++)
 print_rtx_operand (in_rtx, idx);

   switch (GET_CODE (in_rtx))


Bernd

Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions

2016-11-04 Thread Steve Kargl

On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote:
> 
> The associate construct does not readily permit the identification of
> the associate name as an array during parsing. However, this can be
> done whilst matching rvalues within the associate block. This patch
> extends this identification in gfc_match_varspec, whilst excluding
> character types from the present test. This latter change prevents
> scalar substrings from being pegged as array references, which was the
> origin of this bug.
> 
> Bootstraps and regtests on FC21/x86_64.
> 
> OK for trunk?
> 

Looks good to me.

-- 
Steve

Re: [PATCH] Print repeated rtl vector elements in a nicer way

2016-11-04 Thread Martin Jambor

On Thu, Nov 03, 2016 at 05:43:50PM +0100, Bernd Schmidt wrote:
> On 11/03/2016 05:35 PM, Martin Jambor wrote:
> > 
> > * print-rtl.c (print_rtx_operand_codes_E_and_V): Print how many times
> > an element is repeated istead of printing each repeated element.
> 
> "instead"

Will fix.

> 
> > ---
> >  gcc/print-rtl.c | 15 ++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
> > index 341ecdf..9752c85 100644
> > --- a/gcc/print-rtl.c
> > +++ b/gcc/print-rtl.c
> > @@ -252,7 +252,20 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx 
> > in_rtx, int idx)
> > m_sawclose = 1;
> > 
> >for (int j = 0; j < XVECLEN (in_rtx, idx); j++)
> > -   print_rtx (XVECEXP (in_rtx, idx, j));
> > +   {
> > + int j1;
> > +
> > + print_rtx (XVECEXP (in_rtx, idx, j));
> > + for (j1 = j + 1; j1 < XVECLEN (in_rtx, idx); j1++)
> > +   if (XVECEXP (in_rtx, idx, j) != XVECEXP (in_rtx, idx, j1))
> > + break;
> > +
> > + if (j1 != j + 1)
> > +   {
> > + fprintf (m_outfile, " repeated %ix", j1 - j);
> > + j = j1;
> > +   }
> > +   }
> 
> Good idea, but can you give an example of how this looks in
> practice?

For example, after the patch dumps of constant vectors look like this:

(insn 27 26 28 2 (set (reg:V64SI 450)
(vec_merge:V64SI (ashift:V64SI (reg/v:V64SI 433 [ workitem_id ])
(const_vector:V64SI [
(const_int 2 [0x2]) repeated 64x
]))
(unspec:V64SI [
(const_int 0 [0])
] UNSPEC_VECTOR)
(reg:DI 425 [ exec.1_4 ]))) "kernel.c":35 -1
 (nil))

Without it, the very same insn is printed as following:

(insn 27 26 28 2 (set (reg:V64SI 450)
(vec_merge:V64SI (ashift:V64SI (reg/v:V64SI 433 [ workitem_id ])
(const_vector:V64SI [
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
(const_int 2 [0x2])
]))
(unspec:V64SI [
(const_int 0 [0])
] UNSPEC_VECTOR)
(reg:DI 425 [ exec.1_4 ]))) "kernel.c":35 -1
 (nil))

> Also,
> it would be nice (and necessary for David's rtl-testing) to

[PATCH] rtx_writer: avoid printing trailing default values

2016-11-04 Thread David Malcolm

On Fri, 2016-11-04 at 18:51 +0100, Bernd Schmidt wrote:
> Here's something simpler. Only very lightly tested, but isn't
> something
> like this all that's needed? Could decide whether to apply it to
> expr_list etc. as well.
> 
> Index: print-rtl.c
> ===
> --- print-rtl.c   (revision 241233)
> +++ print-rtl.c   (working copy)
> @@ -697,7 +697,12 @@ print_rtx (const_rtx in_rtx)
> 
> /* Get the format string and skip the first elements if we have
> handled
>them already.  */
> -  for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++)
> +  int limit = GET_RTX_LENGTH (GET_CODE (in_rtx));
> +  if (flag_compact
(as of r241586 this is now "m_compact")

> +  && INSN_CHAIN_CODE_P (GET_CODE (in_rtx))
> +  && XEXP (in_rtx, limit - 1) == NULL_RTX)
> +limit--;
> +  for (; idx < limit; idx++)
>   print_rtx_operand (in_rtx, idx);
> 
> switch (GET_CODE (in_rtx))

Thanks.

This seems to assume that the final code in the fmt string can be
accessed via XEXP (in_rtx, limit - 1), which if RTL checking is
enabled requires that the code be either 'e' or 'u'.

This isn't the case for JUMP_INSN, JUMP_TABLE_DATA, BARRIER
(all code '0'), CODE_LABEL (code 's') and NOTE (code 'i').

Also, we might want to omit the REG_NOTES from, say a JUMP_INSN, which
is the *penultimate* operand - for example, it doesn't omit the
trailing (nil) from the cjump_insn in test_uncond_jump.

That said, this is much simpler than my patch, so I used it as the
basis for the following: it uses the same approach as your patch, but
loops backwards from the end of the format string (rather than just once)
until it finds a non-default value, using a new function
"operand_has_default_value_p" to handle the logic for that.

With this, compact dumps omit the trailing (nil) for both regular insns
and for JUMP_INSNs, and omits the surplus newline seen in my earlier patch.

It also appears removes the trailing (nil) from expr_list.

Bootstrap®rtest in progress.

OK for trunk if it passes?

gcc/ChangeLog:
* print-rtl.c (operand_has_default_value_p): New function.
(rtx_writer::print_rtx): In compact mode, omit trailing operands
that have the default values.
* rtl-tests.c (selftest::test_dumping_insns): Remove empty
label string from expected dump.
(seltest::test_uncond_jump): Remove trailing "(nil)" for REG_NOTES
from expected dump.
---
 gcc/print-rtl.c | 45 -
 gcc/rtl-tests.c |  5 ++---
 2 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index 341ecdf..493aa68 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -564,6 +564,40 @@ rtx_writer::print_rtx_operand (const_rtx in_rtx, int idx)
 }
 }
 
+/* Subroutine of rtx_writer::print_rtx.
+   In compact mode, determine if operand IDX of IN_RTX is interesting
+   to dump, or (if in a trailing position) it can be omitted.  */
+
+static bool
+operand_has_default_value_p (const_rtx in_rtx, int idx)
+{
+  const char *format_ptr = GET_RTX_FORMAT (GET_CODE (in_rtx));
+
+  switch (format_ptr[idx])
+{
+case 'e':
+case 'u':
+  return XEXP (in_rtx, idx) == NULL_RTX;
+
+case 's':
+  return XSTR (in_rtx, idx) == NULL;
+
+case '0':
+  switch (GET_CODE (in_rtx))
+   {
+   case JUMP_INSN:
+ return JUMP_LABEL (in_rtx) == NULL_RTX;
+
+   default:
+ return false;
+
+   }
+
+default:
+  return false;
+}
+}
+
 /* Print IN_RTX onto m_outfile.  This is the recursive part of printing.  */
 
 void
@@ -681,9 +715,18 @@ rtx_writer::print_rtx (const_rtx in_rtx)
fprintf (m_outfile, " %d", INSN_UID (in_rtx));
 }
 
+  /* Determine which is the final operand to print.
+ In compact mode, skip trailing operands that have the default values
+ e.g. trailing "(nil)" values.  */
+  int limit = GET_RTX_LENGTH (GET_CODE (in_rtx));
+  if (m_compact)
+while (limit > idx && operand_has_default_value_p (in_rtx, limit - 1))
+  limit--;
+
   /* Get the format string and skip the first elements if we have handled
  them already.  */
-  for (; idx < GET_RTX_LENGTH (GET_CODE (in_rtx)); idx++)
+
+  for (; idx < limit; idx++)
 print_rtx_operand (in_rtx, idx);
 
   switch (GET_CODE (in_rtx))
diff --git a/gcc/rtl-tests.c b/gcc/rtl-tests.c
index cf5239f..228226b 100644
--- a/gcc/rtl-tests.c
+++ b/gcc/rtl-tests.c
@@ -122,7 +122,7 @@ test_dumping_insns ()
   /* Labels.  */
   rtx_insn *label = gen_label_rtx ();
   CODE_LABEL_NUMBER (label) = 42;
-  ASSERT_RTL_DUMP_EQ ("(clabel 0 42 \"\")\n", label);
+  ASSERT_RTL_DUMP_EQ ("(clabel 0 42)\n", label);
 
   LABEL_NAME (label)= "some_label";
   ASSERT_RTL_DUMP_EQ ("(clabel 0 42 (\"some_label\"))\n", label);
@@ -176,8 +176,7 @@ test_uncond_jump ()
   ASSERT_TRUE (control_flow_insn_p (jump_insn));
 
   ASSERT_RTL_DUMP_EQ ("(cjump_insn 1 (set (pc)\n"
- "

Re: [PATCH] rtx_writer: avoid printing trailing default values

2016-11-04 Thread Bernd Schmidt


On 11/04/2016 08:14 PM, David Malcolm wrote:


With this, compact dumps omit the trailing (nil) for both regular insns
and for JUMP_INSNs, and omits the surplus newline seen in my earlier patch.

It also appears removes the trailing (nil) from expr_list.

Bootstrap®rtest in progress.

OK for trunk if it passes?


This seems OK, with something left to be addressed perhaps in a followup:


+   case JUMP_INSN:
+ return JUMP_LABEL (in_rtx) == NULL_RTX;


Weren't we going to omit JUMP_LABELs in compact mode? If so, we'll need 
a m_compact test here eventually as well.



Bernd

Re: [PATCH] Fix nonoverlapping_memrefs_p ICE (PR target/77834, take 2)

2016-11-04 Thread Richard Biener

On November 4, 2016 5:27:11 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>On Wed, Nov 02, 2016 at 10:46:40AM +0100, Richard Biener wrote:
>> Yeah, plus if a followup test would have disambiguated things (the
>> dispatch to the tree oracle for example).
>
>After discussing this on IRC that the dse.c and sched-deps.c (call (mem
>...) )
>changes are probably not safe, I'm proposing following patch which
>should be
>safe to backport to release branches too, the only occurrences of
>!DECL_RTL_SET_P where DECL_RTL worked fine in my bootstraps/regtests
>were
>FUNCTION_DECLs, so the patch should only turn ICEs into returning the
>safe
>return value that the expressions might overlap.
>
>If/once this is in, I'm planning to test/submit a patch adding
>  /* If one decl is known to be a function or label in a function and
> the other is some kind of data, they can't overlap.  */
>  if ((TREE_CODE (exprx) == FUNCTION_DECL
>   || TREE_CODE (exprx) == LABEL_DECL)
>  != (TREE_CODE (expry) == FUNCTION_DECL
> || TREE_CODE (expry) == LABEL_DECL))
>return 1;
>before that.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK for trunk and branches (if appropriate)

Richard.

>2016-11-04  Jakub Jelinek  
>
>   PR target/77834
>   * alias.c (nonoverlapping_memrefs_p): Return 0 if exprx or expry
>   doesn't have rtl set.
>
>   * gcc.dg/pr77834.c: New test.
>
>--- gcc/alias.c.jj 2016-10-21 17:06:27.0 +0200
>+++ gcc/alias.c2016-10-31 11:38:29.448031590 +0100
>@@ -2755,6 +2755,13 @@ nonoverlapping_memrefs_p (const_rtx x, c
>   || TREE_CODE (expry) == CONST_DECL)
> return 1;
> 
>+  /* If either of the decls doesn't have DECL_RTL set (e.g. marked as
>+ living in multiple places), we can't tell anything.  Exception
>+ are FUNCTION_DECLs for which we can create DECL_RTL on demand. 
>*/
>+  if ((!DECL_RTL_SET_P (exprx) && TREE_CODE (exprx) != FUNCTION_DECL)
>+  || (!DECL_RTL_SET_P (expry) && TREE_CODE (expry) !=
>FUNCTION_DECL))
>+return 0;
>+
>   rtlx = DECL_RTL (exprx);
>   rtly = DECL_RTL (expry);
> 
>--- gcc/testsuite/gcc.dg/pr77834.c.jj  2016-10-31 11:41:46.290521464
>+0100
>+++ gcc/testsuite/gcc.dg/pr77834.c 2016-10-31 11:41:24.0 +0100
>@@ -0,0 +1,18 @@
>+/* PR target/77834 */
>+/* { dg-do compile } */
>+/* { dg-options "-O -ftree-pre -Wno-psabi" } */
>+/* { dg-additional-options "-mstringop-strategy=libcall" { target
>i?86-*-* x86_64-*-* } } */
>+
>+typedef int V __attribute__ ((vector_size (64)));
>+
>+V
>+foo (V u, V v, int w)
>+{
>+  do
>+{
>+  if (u[0]) v ^= u[w];
>+}
>+  while ((V) { 0, u[w] }[1]);
>+  u = (V) { v[v[0]], u[u[0]] };
>+  return v + u;
>+}
>
>
>   Jakub

Re: [PATCH] Add target hook to compute register pressure classes

2016-11-04 Thread Vladimir N Makarov




On 11/04/2016 12:18 PM, Pat Haugen wrote:

While working to get -fsched-pressure profitable on PowerPC, one of the things 
I noticed causing problems was the selection of pressure classes. Understanding 
that getting the computation of pressure classes correct for all targets in the 
machine independent code is going to be extremely hard (or impossible), I 
propose the following target hook which lets the target decide the set of 
pressure classes to use. I will follow this up with an implementation of the 
hook for the rs6000 backend, but am currently benchmarking variations of it to 
see which is best.

Bootstrap/regtest on powerpc64le with no new regressions. Ok for trunk?

Yes.  It is OK, Pat.  Thanks for working on powerpc insn-scheduling issues.

Note that the ira.c changes are not as complicated as what it looks, I only 
added a few lines to invoke the target hook, the rest is just indentation 
changes to move the existing code into the else leg.

Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions

2016-11-04 Thread Paul Richard Thomas

Hi Steve,

Thanks. At least it cannot do any harm :-)

I will get on with it.

Paul


On 4 November 2016 at 19:21, Steve Kargl
 wrote:
> On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote:
>>
>> The associate construct does not readily permit the identification of
>> the associate name as an array during parsing. However, this can be
>> done whilst matching rvalues within the associate block. This patch
>> extends this identification in gfc_match_varspec, whilst excluding
>> character types from the present test. This latter change prevents
>> scalar substrings from being pegged as array references, which was the
>> origin of this bug.
>>
>> Bootstraps and regtests on FC21/x86_64.
>>
>> OK for trunk?
>>
>
> Looks good to me.
>
> --
> Steve



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: [Patch, fortran] PR64933 - ASSOCIATE on a character variable does not allow substring expressions

2016-11-04 Thread Paul Richard Thomas

Committed as revision 241860.

Thanks for the review.

Paul

On 4 November 2016 at 19:21, Steve Kargl
 wrote:
> On Fri, Nov 04, 2016 at 12:43:37PM +0100, Paul Richard Thomas wrote:
>>
>> The associate construct does not readily permit the identification of
>> the associate name as an array during parsing. However, this can be
>> done whilst matching rvalues within the associate block. This patch
>> extends this identification in gfc_match_varspec, whilst excluding
>> character types from the present test. This latter change prevents
>> scalar substrings from being pegged as array references, which was the
>> origin of this bug.
>>
>> Bootstraps and regtests on FC21/x86_64.
>>
>> OK for trunk?
>>
>
> Looks good to me.
>
> --
> Steve



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

Re: [PATCH] rtx_writer: avoid printing trailing default values

2016-11-04 Thread David Malcolm

On Fri, 2016-11-04 at 19:53 +0100, Bernd Schmidt wrote:
> On 11/04/2016 08:14 PM, David Malcolm wrote:
> > 
> > With this, compact dumps omit the trailing (nil) for both regular
> > insns
> > and for JUMP_INSNs, and omits the surplus newline seen in my
> > earlier patch.
> > 
> > It also appears removes the trailing (nil) from expr_list.
> > 
> > Bootstrap®rtest in progress.
> > 
> > OK for trunk if it passes?
> 
> This seems OK, with something left to be addressed perhaps in a
> followup:
> 
> > +   case JUMP_INSN:
> > + return JUMP_LABEL (in_rtx) == NULL_RTX;
> 
> Weren't we going to omit JUMP_LABELs in compact mode? If so, we'll
> need 
> a m_compact test here eventually as well.

We already omit JUMP_LABELs in compact mode, in
rtx_writer::print_rtx_operand_code_0, but we need this test to pass so
that we can potentially omit earlier (nil) operands in a JUMP_INSN e.g.
REG_NOTES.

Given that, perhaps this test should read:

case JUMP_INSN: 

  /* JUMP_LABELs are always omitted in compact mode, so treat
 any value here as omittable, so that earlier operands can
 potentially be omitted also.  */
  return true;

or:

  return m_compact;

if we want to support calling operand_has_default_value_p in non
-compact mode.

OK with one of those changes?  (assuming successful testing)

Re: [PATCH] rtx_writer: avoid printing trailing default values

2016-11-04 Thread Bernd Schmidt


On 11/04/2016 08:25 PM, David Malcolm wrote:


  return m_compact;


Ok with this one plus a comment.


Bernd

Re: [PATCH] enhance buffer overflow warnings (and c/53562)

2016-11-04 Thread Martin Sebor


Attached is an update to the patch that takes into consideration
the feedback I got.  It goes back to adding just one option,
-Wstringop-overflow, as in the original, while keeping the Object
Size type as an argument.  It uses type-1 as the default setting
for string functions (strcpy et al.) and, unconditionally, type-0
for raw memory functions (memcpy, etc.)

I retested Binutils 2.27 and the Linux kernel again with this patch
and also added Glibc, and it doesn't complain about anything (both
Binutils and the kernel also build cleanly with an unpatched GCC
with_FORTIFY_SOURCE=2 or its rough equivalent for the kernel).
The emit-rtl.c warning (bug 78174) has also been suppressed by
the change to bos type-0 for memcpy.

While the patch doesn't trigger any false positives (AFAIK) it is
subject to a fair number of false negatives due to the limitations
of the tree-object-size pass, and due to transformations done by
other passes that prevent it from detecting some otherwise obvious
overflows.  Although unfortunate, I believe the warnings that are
emitted are useful as the first line of defense in software that
doesn't use _FORTIFY_SOURCE (such as GCC itself).   And this can
of course be improved if some of the limitations are removed over
time.

Martin
PR c/53562 - Add -Werror= support for -D_FORTIFY_SOURCE / __builtin___memcpy_chk
PR middle-end/78149 - missing warning on strncpy buffer overflow due to an excessive bound
PR middle-end/78138 - missing warnings on buffer overflow with non-constant source length

gcc/c-family/ChangeLog:
	PR c/53562
	* c.opt (-Wstringop-overflow): New option.

gcc/ChangeLog:
	PR c/53562
	PR middle-end/78149
	PR middle-end/78138
	* builtins.c (expand_builtin_strcat, expand_builtin_strncat): New
	functions.
	(compute_dest_size, get_size_range, check_sizes, check_strncat_sizes)
	(check_memop_sizes): Same.
	(expand_builtin_memcpy): Call check memop_sizes.
	(expand_builtin_mempcpy): Same.
	(expand_builtin_memset): Same,
	(expand_builtin_bzero): Same.
	(expand_builtin_memory_chk): Call check_sizes.
	(expand_builtin_strcpy): Same.
	(expand_builtin_strncpy): Same.
	(maybe_emit_sprintf_chk_warning): Same.
	(expand_builtin): Handle strcat and strncat.
	(fini_object_sizes): Reset pointers.
	(compute_object_size): New function.
	* doc/invoke.texi (Warning Options): Document -Wstringop-overflow.

gcc/testsuite/ChangeLog:
	PR c/53562
	PR middle-end/78149
	PR middle-end/78138
	* c-c++-common/Wsizeof-pointer-memaccess2.c: Adjust expected diagnostic.
	* g++.dg/ext/builtin-object-size3.C (bar): Same.
	* g++.dg/ext/strncpy-chk1.C: Same.
	* g++.dg/opt/memcpy1.C: Same.
	* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Same.
	* gcc.c-torture/compile/pr55569.c: Disable -Wstringop-overflow.
	* gcc.dg/Wobjsize-1.c: Adjust expected diagnostic.
	* gcc.dg/attr-alloc_size.c: Same.
	* gcc.dg/builtin-stringop-chk-1.c: Adjust expected diagnostic.
	* gcc.dg/builtin-stringop-chk-2.c: Same.
	* gcc.dg/builtin-stringop-chk-4.c: New test.
	* gcc.dg/builtin-strncat-chk-1.c: Adjust expected diagnostic.
	* gcc.dg/memcpy-2.c: Same.
	* gcc.dg/pr40340-1.c: Same.
	* gcc.dg/pr40340-2.c (main): Same.
	* gcc.dg/pr40340-5.c (main): Same.
	* gcc.dg/torture/Wsizeof-pointer-memaccess1.c: Same.
	* gcc.dg/torture/pr71132.c: Disable -Wstringop-overflow.
	* gcc.dg/fstack-protector-strong.c: Add expected diagnostic.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index cc711a0..e2cf20a 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -67,7 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "case-cfn-macros.h"
 #include "gimple-fold.h"
-
+#include "intl.h"
 
 struct target_builtins default_target_builtins;
 #if SWITCHABLE_TARGET
@@ -125,9 +125,11 @@ static rtx expand_builtin_mempcpy (tree, rtx, machine_mode);
 static rtx expand_builtin_mempcpy_with_bounds (tree, rtx, machine_mode);
 static rtx expand_builtin_mempcpy_args (tree, tree, tree, rtx,
 	machine_mode, int, tree);
+static rtx expand_builtin_strcat (tree, rtx);
 static rtx expand_builtin_strcpy (tree, rtx);
 static rtx expand_builtin_strcpy_args (tree, tree, rtx);
 static rtx expand_builtin_stpcpy (tree, rtx, machine_mode);
+static rtx expand_builtin_strncat (tree, rtx);
 static rtx expand_builtin_strncpy (tree, rtx);
 static rtx builtin_memset_gen_str (void *, HOST_WIDE_INT, machine_mode);
 static rtx expand_builtin_memset (tree, rtx, machine_mode);
@@ -3011,6 +3013,292 @@ expand_builtin_memcpy_args (tree dest, tree src, tree len, rtx target, tree exp)
   return dest_addr;
 }
 
+/* Fill the 2-element RANGE array with the minimum and maximum values
+   EXP is known to have and return true, otherwise null and return
+   false.  */
+
+static bool
+get_size_range (tree exp, tree range[2])
+{
+  if (tree_fits_uhwi_p (exp))
+{
+  range[0] = range[1] = exp;
+  return true;
+}
+
+  if (TREE_CODE (exp) == SSA_NAME)
+{
+  wide_int min, max;
+  enum value_range_type range_type = get_range_info (exp, &min, &max);
+

Simplify X / X, 0 / X and X % X

2016-11-04 Thread Marc Glisse


Hello,

since we were discussing this recently...

The condition is copied from the existing 0 % X case, visible in the 
context of the diff.


As far as I understand, the main case where we do not want to optimize is 
during constexpr evaluation in the C++ front-end (it wants to detect the 
undefined behavior), and with late folding I think this means we only need 
to care about an explicit 0/0, not about X/X where X would become 0 after 
the simplification.


And later, if we do have something like X/0, we could handle it the same 
way as we currently handle *(char*)0, insert a trap after that instruction 
and clear the following code, which likely gives better code than 
replacing 0/0 with 1.


Bootstrap+regtest on powerpc64le-unknown-linux-gnu.

2016-11-07  Marc Glisse  

gcc/
* match.pd (0 / X, X / X, X % X): New simplifications.

gcc/testsuite/
* gcc.dg/tree-ssa/divide-5.c: New file.

--
Marc GlisseIndex: gcc/match.pd
===
--- gcc/match.pd	(revision 241856)
+++ gcc/match.pd	(working copy)
@@ -133,33 +133,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (non_lvalue @0)))
 
 /* Transform x * -1.0 into -x.  */
 (simplify
  (mult @0 real_minus_onep)
   (if (!HONOR_SNANS (type)
&& (!HONOR_SIGNED_ZEROS (type)
|| !COMPLEX_FLOAT_TYPE_P (type)))
(negate @0)))
 
-/* Make sure to preserve divisions by zero.  This is the reason why
-   we don't simplify x / x to 1 or 0 / x to 0.  */
+/* X * 1, X / 1 -> X.  */
 (for op (mult trunc_div ceil_div floor_div round_div exact_div)
   (simplify
 (op @0 integer_onep)
 (non_lvalue @0)))
 
+/* Preserve explicit divisions by 0: the C++ front-end wants to detect
+   undefined behavior in constexpr evaluation, and assuming that the division
+   traps enables better optimizations than these anyway.  */
 (for div (trunc_div ceil_div floor_div round_div exact_div)
+ /* 0 / X is always zero.  */
+ (simplify
+  (div integer_zerop@0 @1)
+  /* But not for 0 / 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@1))
+   @0))
   /* X / -1 is -X.  */
  (simplify
(div @0 integer_minus_onep@1)
(if (!TYPE_UNSIGNED (type))
 (negate @0)))
+ /* X / X is one.  */
+ (simplify
+  (div @0 @0)
+  /* But not for 0 / 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@0))
+   { build_one_cst (type); }))
  /* X / abs (X) is X < 0 ? -1 : 1.  */ 
  (simplify
(div:C @0 (abs @0))
(if (INTEGRAL_TYPE_P (type)
 	&& TYPE_OVERFLOW_UNDEFINED (type))
 (cond (lt @0 { build_zero_cst (type); })
   { build_minus_one_cst (type); } { build_one_cst (type); })))
  /* X / -X is -1.  */
  (simplify
(div:C @0 (negate @0))
@@ -269,38 +283,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 	&& !real_zerop (@1))
 (with
  { tree tem = const_binop (RDIV_EXPR, type, build_one_cst (type), @1); }
  (if (tem)
   (mult @0 { tem; } )))
 (if (cst != COMPLEX_CST)
  (with { tree inverse = exact_inverse (type, @1); }
   (if (inverse)
(mult @0 { inverse; } 
 
-/* Same applies to modulo operations, but fold is inconsistent here
-   and simplifies 0 % x to 0, only preserving literal 0 % 0.  */
 (for mod (ceil_mod floor_mod round_mod trunc_mod)
  /* 0 % X is always zero.  */
  (simplify
   (mod integer_zerop@0 @1)
   /* But not for 0 % 0 so that we can get the proper warnings and errors.  */
   (if (!integer_zerop (@1))
@0))
  /* X % 1 is always zero.  */
  (simplify
   (mod @0 integer_onep)
   { build_zero_cst (type); })
  /* X % -1 is zero.  */
  (simplify
   (mod @0 integer_minus_onep@1)
   (if (!TYPE_UNSIGNED (type))
{ build_zero_cst (type); }))
+ /* X % X is zero.  */
+ (simplify
+  (mod @0 @0)
+  /* But not for 0 % 0 so that we can get the proper warnings and errors.  */
+  (if (!integer_zerop (@0))
+   { build_zero_cst (type); }))
  /* (X % Y) % Y is just X % Y.  */
  (simplify
   (mod (mod@2 @0 @1) @1)
   @2)
  /* From extract_muldiv_1: (X * C1) % C2 is zero if C1 is a multiple of C2.  */
  (simplify
   (mod (mult @0 INTEGER_CST@1) INTEGER_CST@2)
   (if (ANY_INTEGRAL_TYPE_P (type)
&& TYPE_OVERFLOW_UNDEFINED (type)
&& wi::multiple_of_p (@1, @2, TYPE_SIGN (type)))
Index: gcc/testsuite/gcc.dg/tree-ssa/divide-5.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/divide-5.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/divide-5.c	(working copy)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+
+int f(int x){
+  int y = x;
+  int z = 0;
+  return x / y - x % y + z / y;
+}
+
+/* { dg-final { scan-tree-dump "return 1;" "optimized"} } */

Re: [patch,testsuite] Support dg-require-effective-target label_offsets.

2016-11-04 Thread Mike Stump

On Nov 4, 2016, at 2:35 AM, Georg-Johann Lay  wrote:
> 
>> .word   .L3-(.L2)
>> 
>> which seems wrong, given all the other code-gen.  I think this has to be 
>> gs(.L3)-(gs(.L2)), no?  I tried to get binutils to do that for me directly 
>> with a .word and it seemed resistant; which is unfortunate.

> Using gs, the expression would be gs(.L3)-(gs(.L2)) as gs implements word 
> addresses, not byte addresses.

As I said, I tried that and could not get it to work.

>> The compile only test cases work fine now, so nothing needs to be done for 
>> them.
> 
> ICE is not a nice approach

I never saw an ICE with current release binutils and top of tree gcc.  If you 
want me to see an ICE and comment on it, I would need the command line 
apparently.

> , IMO.  I'd prefer some more graceful error using code locations of the 
> labels (which are not available).

My preference is code-gen.  :-)  Anyway, the change I posted creates nice 
graceful errors, if one wants.

Re: [PATCH, GCC, wwwdocs] Document new Cortex-M23 and Cortex-M33 processors support in ARM backend

2016-11-04 Thread Gerald Pfeifer


On Fri, 4 Nov 2016, Thomas Preudhomme wrote:

This patch document the newly added support in GCC 7 for Cortex-M23 and
Cortex-M33 processors [1][2].

:

Is this ok for ?


Surely so for me.

Gerald

PR 78188 & 71848

2016-11-04 Thread David Edelsohn

Something in cgraph appears to be relying on COMDAT support if
DECL_ONE_ONLY is available.  This is the cause of the AIX bootstrap
failure due to recent tree-vrp.c change, earlier tree-ssa-sccvn.c
change.  Also a recent C++ front-end change caused new libstdc++
testsuite failures that seems to have the same root cause.

This patch disables the ipa-comdats pass on targets that do not
support COMDAT groups.

Pre-approved by Richi.

PR bootstrap/78188
PR c++/71848
* ipa-comdats.c (pass_ipa_comdats::gate): Require HAVE_COMDAT_GROUP.

Index: ipa-comdats.c
===
--- ipa-comdats.c   (revision 241862)
+++ ipa-comdats.c   (working copy)
@@ -416,7 +416,7 @@ class pass_ipa_comdats : public ipa_opt_pass_d
 bool
 pass_ipa_comdats::gate (function *)
 {
-  return optimize;
+  return HAVE_COMDAT_GROUP && optimize;
 }

 } // anon namespace

1 2 >

1 - 100 of 104 matches

Mail list logo