[PATCH] Add x | 0 -> x pattern

2016-08-19 Thread Richard Biener

For some reason it was missing.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-08-19  Richard Biener  

* match.pd (x | 0 -> x): Add.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 239570)
+++ gcc/match.pd(working copy)
@@ -541,13 +541,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 /* x | ~0 -> ~0  */
 (simplify
-  (bit_ior @0 integer_all_onesp@1)
-  @1)
+ (bit_ior @0 integer_all_onesp@1)
+ @1)
+
+/* x | 0 -> x  */
+(simplify
+ (bit_ior @0 integer_zerop)
+ @0)
 
 /* x & 0 -> 0  */
 (simplify
-  (bit_and @0 integer_zerop@1)
-  @1)
+ (bit_and @0 integer_zerop@1)
+ @1)
 
 /* ~x | x -> -1 */
 /* ~x ^ x -> -1 */
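
For readers less familiar with match.pd, here is a minimal C sketch of the three identities touched by this hunk. The function names are made up for illustration and are not part of GCC; each function computes the unsimplified left-hand side of a pattern.

```c
#include <stdint.h>

/* x | ~0 -> ~0 (existing pattern) */
uint32_t ior_all_ones (uint32_t x) { return x | ~UINT32_C (0); }

/* x | 0 -> x (the pattern added by this patch) */
uint32_t ior_zero (uint32_t x) { return x | UINT32_C (0); }

/* x & 0 -> 0 (existing pattern) */
uint32_t and_zero (uint32_t x) { return x & UINT32_C (0); }
```

Calling ior_zero with any value should return that value unchanged, which is what the new simplify rule expresses at the GIMPLE/GENERIC level.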


[PATCH] Fix PR77290

2016-08-19 Thread Richard Biener

This fixes PR77290.  My previous patch to PRE, which limits insertion
when flag_tree_parallelize_loops != 0, was wrong because
flag_tree_parallelize_loops is the number of threads to parallelize
for (so the default of 1 means do not parallelize).

Fixed as obvious.

Richard.

2016-08-19  Richard Biener  

PR tree-optimization/77290
* tree-ssa-pre.c (eliminate_dom_walker::before_dom_children):
Fix flag_tree_parallelize_loops check.

Index: gcc/tree-ssa-pre.c
===
--- gcc/tree-ssa-pre.c  (revision 239606)
+++ gcc/tree-ssa-pre.c  (working copy)
@@ -4270,7 +4270,7 @@ eliminate_dom_walker::before_dom_childre
  if (sprime
  && TREE_CODE (sprime) == SSA_NAME
  && do_pre
- && (flag_tree_loop_vectorize || flag_tree_parallelize_loops)
+ && (flag_tree_loop_vectorize || flag_tree_parallelize_loops > 1)
  && loop_outer (b->loop_father)
  && has_zero_uses (sprime)
  && bitmap_bit_p (inserted_exprs, SSA_NAME_VERSION (sprime))
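
The difference between the two checks can be sketched in plain C. The struct and function names below are hypothetical stand-ins for the option variables, assuming (as the message says) that the parallelize flag is a thread count defaulting to 1.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for flag_tree_loop_vectorize and
   flag_tree_parallelize_loops.  */
struct opts { int loop_vectorize; int parallelize_loops; };

/* Old check: a thread count of 1 is nonzero, so this is true even
   with default options.  */
bool old_check (struct opts o)
{
  return o.loop_vectorize || o.parallelize_loops;
}

/* Fixed check: only a count above 1 requests parallelization.  */
bool new_check (struct opts o)
{
  return o.loop_vectorize || o.parallelize_loops > 1;
}
```

With the defaults { 0, 1 } the old check fires and the new one does not; with { 0, 4 } both fire.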


Re: [TREE-SSA-CCP] Issue warning when folding condition

2016-08-19 Thread Richard Biener
On Fri, 19 Aug 2016, Kugan Vivekanandarajah wrote:

> On 19 August 2016 at 12:09, Kugan Vivekanandarajah
>  wrote:
> > The warning testcase pr33738.C fails with the early-vrp patch. The
> > reason is that, with early-vrp, ccp2 folds the comparison that used to
> > be folded in simplify_stmt_for_jump_threading. Since early-vrp does
> > not perform jump threading, it is not optimized there.
> >
> > Attached patch adds this warning to tree-ssa-ccp.c. We might also run
> > into some other similar issues in the future.
> 
> Sorry, I attached the wrong patch (with typo). Here is the correct one.

I think emitting this warning from GIMPLE optimizations is fundamentally
flawed and the warning should be removed there and put next to
the cases we already handle in c/c-common.c:shorten_compare (or in
FE specific code).  I see no reason why only VRP or CCP would
do the simplification for -fstrict-enums enums (thus it seems to be
missing from the generic comparison folders).

Richard.

> Thanks,
> Kugan
> 
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu with no new 
> > regressions.
> >
> > Is this OK for trunk?
> >
> > Thanks,
> > Kugan
> >
> > gcc/ChangeLog:
> >
> > 2016-08-18  Kugan Vivekanandarajah  
> >
> > * tree-ssa-ccp.c (ccp_fold_stmt): If the comparison is being folded
> > and the operand on the LHS is being compared against a constant
> > value that is outside of type limit, issue warning.
> 


Re: [RFC][PR61839]Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

2016-08-19 Thread Kugan Vivekanandarajah
Ping?

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00987.html

Thanks,
Kugan

On 12 August 2016 at 13:19, kugan  wrote:
> Hi Richard,
>
>
> On 11/08/16 20:04, Richard Biener wrote:
>>
>> On Thu, Aug 11, 2016 at 6:11 AM, kugan
>>  wrote:
>
>
> [SNIP]
>
>>
>> +two_valued_val_range_p (tree var, tree *a, tree *b)
>> +{
>> +  value_range *vr = get_value_range (var);
>> +  if ((vr->type != VR_RANGE
>> +   && vr->type != VR_ANTI_RANGE)
>> +  || !range_int_cst_p (vr))
>> +return false;
>>
>> range_int_cst_p checks for vr->type == VR_RANGE so the anti-range handling
>> doesn't ever trigger - which means you should add a testcase for it as
>> well.
>
>
> Fixed it. I have also added a testcase.
>
>>
>>
>> +{
>> +  *a = TYPE_MIN_VALUE (TREE_TYPE (var));
>> +  *b = TYPE_MAX_VALUE (TREE_TYPE (var));
>>
>> note that for pointer types this doesn't work, please also use
>> vrp_val_min/max for
>> consistency.  I think you want to add a INTEGRAL_TYPE_P (TREE_TYPE (var))
>> to the guard of two_valued_val_range_p.
>>
>> +  /* First canonicalize to simplify tests.  */
>> +  if (commutative_tree_code (rhs_code)
>> + && TREE_CODE (rhs2) == INTEGER_CST)
>> +   std::swap (rhs1, rhs2);
>>
>> note that this doesn't really address my comment - if you just want to
>> handle
>> commutative ops then simply look for the constant in the second place
>> rather
>> than the first which is the canonical operand order.  But even for
>> non-commutative
>> operations we might want to apply this optimization - and then for both
>> cases,
>> rhs1 or rhs2 being constant.  Like x / 5 and 5 / x.
>>
>> Note that you can rely on int_const_binop returning NULL_TREE for
>> "invalid"
>> ops like x % 0 or x / 0, so no need to explicitly guard this here.
>
>
> Sorry, I misunderstood you. I have changed it now. I also added test-case to
> check this.
>
> Bootstrapped and regression tested on x86_64-linux-gnu with no new
> regressions. Is this OK for trunk now?
>
> Thanks,
> Kugan
>
> gcc/testsuite/ChangeLog:
>
> 2016-08-12  Kugan Vivekanandarajah  
>
> PR tree-optimization/61839
> * gcc.dg/tree-ssa/pr61839_1.c: New test.
> * gcc.dg/tree-ssa/pr61839_2.c: New test.
> * gcc.dg/tree-ssa/pr61839_3.c: New test.
> * gcc.dg/tree-ssa/pr61839_4.c: New test.
>
> gcc/ChangeLog:
>
> 2016-08-12  Kugan Vivekanandarajah  
>
> PR tree-optimization/61839
> * tree-vrp.c (two_valued_val_range_p): New.
> (simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
> two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).
> Also Convert VAR BINOP CST where VAR is two-valued to
> VAR == VAL1 ? (VAL1 BINOP CST) : (VAL2 BINOP CST).
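
A small C illustration of the transformation described in the ChangeLog, assuming a variable known to be either 2 or 3. The constants and function names are chosen only for this example.

```c
/* Original form: CST BINOP VAR, here 72 / x.  */
int div_direct (int x) { return 72 / x; }

/* Transformed form, valid only when x is known to be 2 or 3:
   VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2), with both
   arms folded to constants (72 / 2 = 36, 72 / 3 = 24).  */
int div_selected (int x) { return x == 2 ? 36 : 24; }
```

For x in {2, 3} both functions return the same value; outside that set div_selected is not meaningful, which is why the range information is required.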


Re: [Patch] Implement std::experimental::variant

2016-08-19 Thread Jonathan Wakely

On 18/08/16 13:32 -0700, Tim Shen wrote:

Tested on x86_64-linux-gnu and checked in as r239590.


This updates the status at
https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.201z

Committed to trunk.


commit 6c46b6c04ac234b6443208e922bb3177344e90a8
Author: Jonathan Wakely 
Date:   Fri Aug 19 09:15:36 2016 +0100

Update C++17 library status table

	* doc/xml/manual/status_cxx2017.xml: Update status of make_from_tuple
	and variant.
	* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 99f2bbf..35d6b6b 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -92,14 +92,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
Variant: a type-safe union for C++17 
   
 	http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0088r3.html";>
 	P0088R3
 	
   
-   No 
+   7 
__has_include() 
 
 
@@ -206,14 +205,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
-   make_from_tuple: apply for construction 
+   make_from_tuple: apply for construction 
   
 	http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0209r2.pdf";>
 	P0209R2
 	
   
-   No 
+   7 
__cpp_lib_make_from_tuple >= 201606 
 
 


Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-08-19 Thread Kugan Vivekanandarajah
Ping?

https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00872.html

Thanks,
Kugan

On 11 August 2016 at 09:09, kugan  wrote:
> Hi,
>
>
> On 10/08/16 20:28, Richard Biener wrote:
>>
>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek  wrote:
>>>
>>> On Wed, Aug 10, 2016 at 08:51:32AM +1000, kugan wrote:

>>>> I see it now. The problem is we are just looking at (-1) being in the
>>>> ops list for passing changed to rewrite_expr_tree in the case of
>>>> multiplication by negate.  If we have combined (-1), as in the testcase,
>>>> we will not have the (-1) and will pass changed=false to
>>>> rewrite_expr_tree.
>>>>
>>>> We should set changed based on what happens in try_special_add_to_ops.
>>>> Attached patch does this. Bootstrap and regression testing are ongoing.
>>>> Is this OK for trunk if there is no regression.
>>>
>>>
>>> I think the bug is elsewhere.  In particular in
>>> undistribute_ops_list/zero_one_operation/decrement_power.
>>> All those look problematic in this regard, they change RHS of statements
>>> to something that holds a different value, while keeping the LHS.
>>> So, generally you should instead just add a new stmt next to the old one,
>>> and adjust data structures (replace the old SSA_NAME in some ->op with
>>> the new one).  decrement_power might be a problem here, dunno if all the
>>> builtins are const in all cases that DSE would kill the old one,
>>> Richard, any preferences for that?  reset flow sensitive info + reset
>>> debug
>>> stmt uses, or something different?  Though, replacing the LHS with a new
>>> anonymous SSA_NAME might be needed too, in case it is before SSA_NAME of
>>> a
>>> user var that doesn't yet have any debug stmts.
>>
>>
>> I'd say replacing the LHS is the way to go, with calling the appropriate
>> helper
>> on the old stmt to generate a debug stmt for it / its uses (would need
>> to look it
>> up here).
>>
>
> Here is an attempt to fix it. The problem arises when, in
> undistribute_ops_list, we linearize_expr_tree such that NEGATE_EXPR is added
> as (-1) MULT_EXPR (OP). The real problem starts when we handle this in
> zero_one_operation. Unlike what was done earlier, we now change the stmt
> (with propagate_op_to_single_use or directly) such that the value
> computed by stmt is no longer what it used to be. Because of this, what is
> computed in undistribute_ops_list and rewrite_expr_tree are also changed.
>
> undistribute_ops_list already expects this but rewrite_expr_tree will not if
> we don't pass the changed as an argument.
>
> The way I am fixing this now is, in linearize_expr_tree, I set ops_changed
> to true if we change NEGATE_EXPR to (-1) MULT_EXPR (OP). Then when we call
> zero_one_operation with ops_changed = true, I replace all the LHS in
> zero_one_operation with the new SSA and replace all the uses. I also call
> the rewrite_expr_tree with changed = false in this case.
>
> Does this make sense? Bootstrapped and regression tested for
> x86_64-linux-gnu without any new regressions.
>
> Thanks,
> Kugan
>
>
> gcc/testsuite/ChangeLog:
>
> 2016-08-10  Kugan Vivekanandarajah  
>
> PR tree-optimization/72835
> * gcc.dg/tree-ssa/pr72835.c: New test.
>
> gcc/ChangeLog:
>
> 2016-08-10  Kugan Vivekanandarajah  
>
> PR tree-optimization/72835
> * tree-ssa-reassoc.c (zero_one_operation): In case of NEGATE_EXPR
> create and use new SSA_NAME.
> (try_special_add_to_ops): Return true if we changed the value in
> operands.
> (linearize_expr_tree): Return true if try_special_add_to_ops set
> ops_changed to true.
> (undistribute_ops_list): Likewise.
> (reassociate_bb): Pass ops_changed returned by linearize_expr_tree
> to rewrite_expr_tree.
>
>
>


[libcpp] append "evaluates to 0" for Wundef diagnostic

2016-08-19 Thread Prathamesh Kulkarni
Hi David,
This trivial patch appends "evaluates to 0" to the Wundef diagnostic,
similar to clang, which prints the following diagnostic for an undefined macro:
undef.c:1:5: warning: 'FOO' is not defined, evaluates to 0 [-Wundef]
#if FOO
^
Bootstrapped+tested on x86_64-unknown-linux-gnu.
OK to commit ?

Thanks,
Prathamesh
2016-08-19  Prathamesh Kulkarni  

libcpp/
* expr.c (eval_token): Append "evaluates to 0" to Wundef diagnostic.

testsuite/
* gcc.dg/cpp/warn-undef.c: Append "evaluates to 0" to dg-error.
* gcc.dg/cpp/warn-undef-2.c: Likewise.

diff --git a/gcc/testsuite/gcc.dg/cpp/warn-undef-2.c b/gcc/testsuite/gcc.dg/cpp/warn-undef-2.c
index 15fdde9..e71aeba 100644
--- a/gcc/testsuite/gcc.dg/cpp/warn-undef-2.c
+++ b/gcc/testsuite/gcc.dg/cpp/warn-undef-2.c
@@ -1,5 +1,5 @@
 // { dg-do preprocess }
 // { dg-options "-std=gnu99 -fdiagnostics-show-option -Werror=undef" }
 /* { dg-message "some warnings being treated as errors" "" {target "*-*-*"} 0 } */
-#if x  // { dg-error "\"x\" is not defined .-Werror=undef." }
+#if x  // { dg-error "\"x\" is not defined, evaluates to 0 .-Werror=undef." }
 #endif
diff --git a/gcc/testsuite/gcc.dg/cpp/warn-undef.c b/gcc/testsuite/gcc.dg/cpp/warn-undef.c
index dd4524d..2c2c421 100644
--- a/gcc/testsuite/gcc.dg/cpp/warn-undef.c
+++ b/gcc/testsuite/gcc.dg/cpp/warn-undef.c
@@ -1,5 +1,5 @@
 // { dg-do preprocess }
 // { dg-options "-std=gnu99 -fdiagnostics-show-option -Wundef" }
 
-#if x  // { dg-warning "\"x\" is not defined .-Wundef." }
+#if x  // { dg-warning "\"x\" is not defined, evaluates to 0 .-Wundef." }
 #endif
diff --git a/libcpp/expr.c b/libcpp/expr.c
index 5cdca6f..d32f5a9 100644
--- a/libcpp/expr.c
+++ b/libcpp/expr.c
@@ -1073,7 +1073,7 @@ eval_token (cpp_reader *pfile, const cpp_token *token,
  result.low = 0;
  if (CPP_OPTION (pfile, warn_undef) && !pfile->state.skip_eval)
cpp_warning_with_line (pfile, CPP_W_UNDEF, virtual_location, 0,
-  "\"%s\" is not defined",
+  "\"%s\" is not defined, evaluates to 0",
   NODE_NAME (token->val.node.node));
}
   break;
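
As a reminder of why the extra wording is accurate, here is a short C sketch. The macro name HYPOTHETICAL_FEATURE is invented for this example and is deliberately never defined; in an #if expression such an identifier is replaced by 0, so the #else branch is the one that gets compiled. Building with -Wundef should produce the warning on the #if line.

```c
/* HYPOTHETICAL_FEATURE is intentionally left undefined.  */
#if HYPOTHETICAL_FEATURE
#define FEATURE_ENABLED 1
#else
#define FEATURE_ENABLED 0
#endif

/* Reports which branch of the conditional was compiled.  */
int feature_enabled (void) { return FEATURE_ENABLED; }
```

Without -Wundef the code compiles silently and feature_enabled returns 0, which is exactly the situation the diagnostic is meant to point out.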


Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-19 Thread Richard Biener
On Thu, Aug 18, 2016 at 5:16 PM, H.J. Lu  wrote:
> On Thu, Aug 18, 2016 at 1:18 AM, Richard Biener
>  wrote:
>> On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
>>> builtin_memset_gen_str returns a register used for memset, which only
>>> supports integer registers.  But a target may use vector registers in
>>> memset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
>>> QImode value to mode derived from STORE_MAX_PIECES, which can be used
>>> with vector instructions.  The default hook is the same as the original
>>> builtin_memset_gen_str.  A target can override it to support vector
>>> instructions for STORE_MAX_PIECES.
>>>
>>> Tested on x86-64 and i686.  Any comments?
>>>
>>> H.J.
>>> ---
>>> gcc/
>>>
>>> * builtins.c (builtin_memset_gen_str): Call 
>>> targetm.gen_memset_value.
>>> (default_gen_memset_value): New function.
>>> * target.def (gen_memset_value): New hook.
>>> * targhooks.c: Include "expmed.h" and "builtins.h".
>>> (default_gen_memset_value): New function.
>>
>> I see default_gen_memset_value in builtins.c but it belongs here.
>>
>>> * targhooks.h (default_gen_memset_value): New prototype.
>>> * config/i386/i386.c (ix86_gen_memset_value): New function.
>>> (TARGET_GEN_MEMSET_VALUE): New.
>>> * config/i386/i386.h (STORE_MAX_PIECES): Likewise.
>>> * doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
>>> * doc/tm.texi: Updated.
>>>
>
> Like this?

Aww, ok - I see builtins.c is a better place - sorry for the extra work.

Richard.

> H.J.
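
To illustrate what "duplicate a QImode value to a wider mode" means for the integer case, here is a minimal C sketch of one common way to do it, multiplying by a constant whose bytes are all 0x01. The function name is made up, and this is only an analogy for the default integer behaviour, not the hook's actual implementation.

```c
#include <stdint.h>

/* Replicate the byte C into all eight bytes of a 64-bit word.  */
uint64_t broadcast_byte (unsigned char c)
{
  return (uint64_t) c * UINT64_C (0x0101010101010101);
}
```

A target that overrides the hook would presumably produce the equivalent broadcast in a vector register instead of an integer register.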


Re: Fwd: [PATCH] genmultilib: improve error reporting for MULTILIB_REUSE

2016-08-19 Thread Thomas Preudhomme



On 18/08/16 17:39, Jeff Law wrote:

On 08/10/2016 09:51 AM, Thomas Preudhomme wrote:



*** gcc/ChangeLog ***

2016-08-01  Thomas Preud'homme  

* doc/fragments.texi (MULTILIB_REUSE): Mention that only options in
MULTILIB_OPTIONS should be used.  Small wording fixes.
* genmultilib: Memorize set of all option combinations in
combination_space.  Detect if RHS of MULTILIB_REUSE uses an
option not
found in MULTILIB_OPTIONS by checking if option set is listed in
combination_space.  Output new and existing error message to
stderr.

[ snip ]



+A reuse rule is comprised of two parts connected by equality sign.  The left
+part is the option set used to build multilib and the right part is the option
+set that will reuse this multilib.  Both part should only use options specified

"Both part" -> "Both parts" I think here.

OK with that change.


Noted. I'm waiting for https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00814.html
before committing this, since otherwise builds using
--with-multilib-list=aprofile would stop working.


Best regards,

Thomas


Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread Richard Biener
On Fri, Aug 19, 2016 at 1:06 AM, Patrick Palka  wrote:
> On Thu, 18 Aug 2016, Richard Biener wrote:
>
>> On August 18, 2016 8:25:18 PM GMT+02:00, Patrick Palka 
>>  wrote:
>> >In comment #5 Yuri reports that r235653 introduces a runtime failure
>> >for
>> >176.gcc which I guess is caused by the combining step in
>> >simplify_control_stmt_condition_1() not behaving properly on operands
>> >of
>> >type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
>> >VECTOR_TYPEs because the logic should be generic enough to support them
>> >as well.  But it was confirmed that restricting the combining step to
>> >operands of scalar type fixes the runtime failure so here is a patch
>> >that does this.  Does this look OK to commit after bootstrap +
>> >regtesting on x86_64-pc-linux-gnu?
>>
>> Hum, I'd rather understand what is going wrong.  Can you at least isolate a 
>> testcase?
>>
>> Richard.
>
> I don't have access to the SPEC benchmarks unfortunately.  Maybe Yuri
> can isolate a test case?
>
> But I think I found a theoretical bug which may or may not coincide with
> the bug that Yuri is observing.  The part of the combining step that may
> provide wrong results for VECTOR_TYPEs is the one that simplifies the
> conditional (A & B) != 0 to true when given that A != 0 and B != 0 and
> given that their TYPE_PRECISION is 1.
>
> The TYPE_PRECISION test was intended to succeed only on scalars, but
> IIUC it accidentally succeeds on one-dimensional vectors too.  So we may
> be wrongly simplifying X & Y != <0> to true given that e.g.  X == <8>
> and Y == <2>.  So this simplification should probably be restricted to
> integral types like so:
>
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index 170e456..b8c8b70 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -648,14 +648,17 @@ simplify_control_stmt_condition_1 (edge e,
>   if (res1 != NULL_TREE && res2 != NULL_TREE)
> {
>   if (rhs_code == BIT_AND_EXPR
> + && INTEGRAL_TYPE_P (TREE_TYPE (op0))
>   && TYPE_PRECISION (TREE_TYPE (op0)) == 1

you can use element_precision (op0) == 1 instead.

Richard.

>   && integer_nonzerop (res1)
>   && integer_nonzerop (res2))
> --
> 2.9.3.650.g20ba99f
>
> Hope this makes sense.
>
>>
>> >gcc/ChangeLog:
>> >
>> > PR tree-optimization/71077
>> > * tree-ssa-threadedge.c (simplify_control_stmt_condition_1):
>> > Perform the combining step only if the operands have an integral
>> > or a pointer type.
>> >---
>> > gcc/tree-ssa-threadedge.c | 3 +++
>> > 1 file changed, 3 insertions(+)
>> >
>> >diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>> >index 170e456..a97c00c 100644
>> >--- a/gcc/tree-ssa-threadedge.c
>> >+++ b/gcc/tree-ssa-threadedge.c
>> >@@ -577,6 +577,9 @@ simplify_control_stmt_condition_1 (edge e,
>> >   if (handle_dominating_asserts
>> >   && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
>> >   && TREE_CODE (op0) == SSA_NAME
>> >+  /* ??? Vector types are mishandled here.  */
>> >+  && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
>> >+  || POINTER_TYPE_P (TREE_TYPE (op0)))
>> >   && integer_zerop (op1))
>> > {
>> >   gimple *def_stmt = SSA_NAME_DEF_STMT (op0);
>>
>>
>>
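
The precision issue discussed above can be checked with ordinary integers. This is a minimal sketch with a made-up function name; it only demonstrates why "A != 0 and B != 0 implies (A & B) != 0" holds for 1-bit values and fails for wider ones.

```c
#include <stdbool.h>

/* Returns whether (a & b) != 0.  */
bool and_is_nonzero (unsigned a, unsigned b) { return (a & b) != 0; }
```

With precision 1 the only nonzero value is 1, and 1 & 1 is nonzero, so the inference is safe. With wider operands, 8 and 2 are both nonzero but 8 & 2 is 0, so and_is_nonzero (8, 2) is false and the simplification would be wrong.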


Re: [RFC][PR61839]Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

2016-08-19 Thread Richard Biener
On Fri, Aug 19, 2016 at 10:17 AM, Kugan Vivekanandarajah
 wrote:
> Ping?
>
> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00987.html

Sorry for the delay.



+  /* ~[TYPE_MIN + 1, TYPE_MAX - 1] */
+  if (vr->type == VR_ANTI_RANGE

  && INTEGRAL_TYPE_P (TREE_TYPE (var))

+  && wi::sub (vr->min, wi::min_value (TREE_TYPE (var))) == 1
+  && wi::sub (wi::max_value (TREE_TYPE (var)), vr->max) == 1)

use vrp_val_min/max instead of wi::max/min_value

+{
+  *a = vrp_val_min (TREE_TYPE (var));
+  *b = vrp_val_max (TREE_TYPE (var));

Ok with that change.

Thanks,
Richard.

> Thanks,
> Kugan
>
> On 12 August 2016 at 13:19, kugan  wrote:
>> Hi Richard,
>>
>>
>> On 11/08/16 20:04, Richard Biener wrote:
>>>
>>> On Thu, Aug 11, 2016 at 6:11 AM, kugan
>>>  wrote:
>>
>>
>> [SNIP]
>>
>>>
>>> +two_valued_val_range_p (tree var, tree *a, tree *b)
>>> +{
>>> +  value_range *vr = get_value_range (var);
>>> +  if ((vr->type != VR_RANGE
>>> +   && vr->type != VR_ANTI_RANGE)
>>> +  || !range_int_cst_p (vr))
>>> +return false;
>>>
>>> range_int_cst_p checks for vr->type == VR_RANGE so the anti-range handling
>>> doesn't ever trigger - which means you should add a testcase for it as
>>> well.
>>
>>
>> Fixed it. I have also added a testcase.
>>
>>>
>>>
>>> +{
>>> +  *a = TYPE_MIN_VALUE (TREE_TYPE (var));
>>> +  *b = TYPE_MAX_VALUE (TREE_TYPE (var));
>>>
>>> note that for pointer types this doesn't work, please also use
>>> vrp_val_min/max for
>>> consistency.  I think you want to add a INTEGRAL_TYPE_P (TREE_TYPE (var))
>>> to the guard of two_valued_val_range_p.
>>>
>>> +  /* First canonicalize to simplify tests.  */
>>> +  if (commutative_tree_code (rhs_code)
>>> + && TREE_CODE (rhs2) == INTEGER_CST)
>>> +   std::swap (rhs1, rhs2);
>>>
>>> note that this doesn't really address my comment - if you just want to
>>> handle
>>> commutative ops then simply look for the constant in the second place
>>> rather
>>> than the first which is the canonical operand order.  But even for
>>> non-commutative
>>> operations we might want to apply this optimization - and then for both
>>> cases,
>>> rhs1 or rhs2 being constant.  Like x / 5 and 5 / x.
>>>
>>> Note that you can rely on int_const_binop returning NULL_TREE for
>>> "invalid"
>>> ops like x % 0 or x / 0, so no need to explicitly guard this here.
>>
>>
>> Sorry, I misunderstood you. I have changed it now. I also added test-case to
>> check this.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>> regressions. Is this OK for trunk now?
>>
>> Thanks,
>> Kugan
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2016-08-12  Kugan Vivekanandarajah  
>>
>> PR tree-optimization/61839
>> * gcc.dg/tree-ssa/pr61839_1.c: New test.
>> * gcc.dg/tree-ssa/pr61839_2.c: New test.
>> * gcc.dg/tree-ssa/pr61839_3.c: New test.
>> * gcc.dg/tree-ssa/pr61839_4.c: New test.
>>
>> gcc/ChangeLog:
>>
>> 2016-08-12  Kugan Vivekanandarajah  
>>
>> PR tree-optimization/61839
>> * tree-vrp.c (two_valued_val_range_p): New.
>> (simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
>> two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).
>> Also Convert VAR BINOP CST where VAR is two-valued to
>> VAR == VAL1 ? (VAL1 BINOP CST) : (VAL2 BINOP CST).
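
A simplified sketch of the two-valued test discussed in this thread, written against a toy range type over unsigned char rather than GCC's value_range. All names here are hypothetical, and the [N, N + 1] case is an assumption about what the helper accepts in addition to the anti-range case shown in the review.

```c
#include <limits.h>
#include <stdbool.h>

enum range_kind { RANGE, ANTI_RANGE };

/* Hypothetical, simplified value range over unsigned char.  */
struct vrange { enum range_kind kind; unsigned char min, max; };

/* Return true and store the two possible values in *A and *B if VR
   admits exactly two values.  */
bool two_valued_p (struct vrange vr, unsigned char *a, unsigned char *b)
{
  /* [N, N + 1]: the bounds themselves are the two values.  */
  if (vr.kind == RANGE && vr.max - vr.min == 1)
    {
      *a = vr.min;
      *b = vr.max;
      return true;
    }

  /* ~[TYPE_MIN + 1, TYPE_MAX - 1]: only the type's extremes remain.  */
  if (vr.kind == ANTI_RANGE && vr.min == 1 && vr.max == UCHAR_MAX - 1)
    {
      *a = 0;
      *b = UCHAR_MAX;
      return true;
    }

  return false;
}
```

In the real patch the extremes come from vrp_val_min/vrp_val_max, as requested in the review, rather than from fixed constants.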


[PATCH] Avoid return in x86 intrins returning void

2016-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2016 at 03:21:03AM -0500, Segher Boessenkool wrote:
> On Fri, Aug 19, 2016 at 08:45:49AM +0200, Jakub Jelinek wrote:
> > On Fri, Aug 19, 2016 at 01:50:52PM +0800, lhmouse wrote:
> > > Given the `_fxsave()` function returning `void`, it is invalid C but 
> > > valid C++:
> > 
> > It is also a GNU C extension.
> 
> And GCC warns with -Wpedantic (but not without).  It does the "correct"
> thing in either case.

That said, unlike the long long extension used pretty much everywhere, in
the intrinsic headers this extension doesn't buy us anything.

So ok for trunk to remove those if testing succeeds?

2016-08-19  Jakub Jelinek  

* config/i386/fxsrintrin.h (_fxsave): Remove return keyword in inlines
returning void.
(_fxrstor, _fxsave64, _fxrstor64): Likewise.
* config/i386/xsaveintrin.h (_xsave, _xrstor, _xsave64, _xrstor64):
Likewise.
* config/i386/xsaveoptintrin.h (_xsaveopt, _xsaveopt64): Likewise.
* config/i386/pkuintrin.h (_wrpkru): Likewise.  Add space after
function name.
(_rdpkru_u32): Add space after function name.

--- gcc/config/i386/fxsrintrin.h.jj 2016-01-04 14:55:55.0 +0100
+++ gcc/config/i386/fxsrintrin.h2016-08-19 11:23:16.214265208 +0200
@@ -38,14 +38,14 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _fxsave (void *__P)
 {
-  return __builtin_ia32_fxsave (__P);
+  __builtin_ia32_fxsave (__P);
 }
 
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _fxrstor (void *__P)
 {
-  return __builtin_ia32_fxrstor (__P);
+  __builtin_ia32_fxrstor (__P);
 }
 
 #ifdef __x86_64__
@@ -53,14 +53,14 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _fxsave64 (void *__P)
 {
-return __builtin_ia32_fxsave64 (__P);
+  __builtin_ia32_fxsave64 (__P);
 }
 
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _fxrstor64 (void *__P)
 {
-return __builtin_ia32_fxrstor64 (__P);
+  __builtin_ia32_fxrstor64 (__P);
 }
 #endif
 
--- gcc/config/i386/xsaveintrin.h.jj2016-01-04 14:55:56.0 +0100
+++ gcc/config/i386/xsaveintrin.h   2016-08-19 11:23:46.626878520 +0200
@@ -38,14 +38,14 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsave (void *__P, long long __M)
 {
-  return __builtin_ia32_xsave (__P, __M);
+  __builtin_ia32_xsave (__P, __M);
 }
 
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xrstor (void *__P, long long __M)
 {
-  return __builtin_ia32_xrstor (__P, __M);
+  __builtin_ia32_xrstor (__P, __M);
 }
 
 #ifdef __x86_64__
@@ -53,14 +53,14 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsave64 (void *__P, long long __M)
 {
-  return __builtin_ia32_xsave64 (__P, __M);
+  __builtin_ia32_xsave64 (__P, __M);
 }
 
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xrstor64 (void *__P, long long __M)
 {
-  return __builtin_ia32_xrstor64 (__P, __M);
+  __builtin_ia32_xrstor64 (__P, __M);
 }
 #endif
 
--- gcc/config/i386/xsaveoptintrin.h.jj 2016-01-04 14:55:55.0 +0100
+++ gcc/config/i386/xsaveoptintrin.h2016-08-19 11:24:05.664636461 +0200
@@ -38,7 +38,7 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsaveopt (void *__P, long long __M)
 {
-  return __builtin_ia32_xsaveopt (__P, __M);
+  __builtin_ia32_xsaveopt (__P, __M);
 }
 
 #ifdef __x86_64__
@@ -46,7 +46,7 @@ extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _xsaveopt64 (void *__P, long long __M)
 {
-  return __builtin_ia32_xsaveopt64 (__P, __M);
+  __builtin_ia32_xsaveopt64 (__P, __M);
 }
 #endif
 
--- gcc/config/i386/pkuintrin.h.jj  2016-01-04 14:55:56.0 +0100
+++ gcc/config/i386/pkuintrin.h 2016-08-19 11:24:24.274399843 +0200
@@ -36,16 +36,16 @@
 
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_rdpkru_u32(void)
+_rdpkru_u32 (void)
 {
   return __builtin_ia32_rdpkru ();
 }
 
 extern __inline void
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_wrpkru(unsigned int key)
+_wrpkru (unsigned int key)
 {
-  return __builtin_ia32_wrpkru (key);
+  __builtin_ia32_wrpkru (key);
 }
 
 #ifdef __DISABLE_PKU__


Jakub
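
For reference, a minimal C sketch of the pattern being removed. The helper do_save is a made-up stand-in for a builtin returning void; the old form is kept in a comment so the block stays valid ISO C.

```c
static int calls;

/* Hypothetical stand-in for a builtin that returns void.  */
static void do_save (void *p) { (void) p; calls++; }

/* Form removed by the patch: returning a void expression.  GCC accepts
   this in C as an extension and warns under -Wpedantic:

     void save_old (void *p) { return do_save (p); }  */

/* Form after the patch: a plain call statement.  */
void save_new (void *p) { do_save (p); }

/* Number of times the stand-in builtin has been invoked.  */
int save_calls (void) { return calls; }
```

Both forms call the helper exactly once; only the spelling changes, so no behaviour difference is expected.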


Re: [PATCH] Avoid return in x86 intrins returning void

2016-08-19 Thread Uros Bizjak
On Fri, Aug 19, 2016 at 11:41 AM, Jakub Jelinek  wrote:
> On Fri, Aug 19, 2016 at 03:21:03AM -0500, Segher Boessenkool wrote:
>> On Fri, Aug 19, 2016 at 08:45:49AM +0200, Jakub Jelinek wrote:
>> > On Fri, Aug 19, 2016 at 01:50:52PM +0800, lhmouse wrote:
>> > > Given the `_fxsave()` function returning `void`, it is invalid C but 
>> > > valid C++:
>> >
>> > It is also a GNU C extension.
>>
>> And GCC warns with -Wpedantic (but not without).  It does the "correct"
>> thing in either case.
>
> That said, unlike the long long extension used pretty much everywhere, in
> the intrinsic headers this extension doesn't buy us anything.
>
> So ok for trunk to remove those if testing succeeds?
>
> 2016-08-19  Jakub Jelinek  
>
> * config/i386/fxsrintrin.h (_fxsave): Remove return keyword in inlines
> returning void.
> (_fxrstor, _fxsave64, _fxrstor64): Likewise.
> * config/i386/xsaveintrin.h (_xsave, _xrstor, _xsave64, _xrstor64):
> Likewise.
> * config/i386/xsaveoptintrin.h (_xsaveopt, _xsaveopt64): Likewise.
> * config/i386/pkuintrin.h (_wrpkru): Likewise.  Add space after
> function name.
> (_rdpkru_u32): Add space after function name.

OK as obvious patch.

Thanks,
Uros.

> --- gcc/config/i386/fxsrintrin.h.jj 2016-01-04 14:55:55.0 +0100
> +++ gcc/config/i386/fxsrintrin.h2016-08-19 11:23:16.214265208 +0200
> @@ -38,14 +38,14 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _fxsave (void *__P)
>  {
> -  return __builtin_ia32_fxsave (__P);
> +  __builtin_ia32_fxsave (__P);
>  }
>
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _fxrstor (void *__P)
>  {
> -  return __builtin_ia32_fxrstor (__P);
> +  __builtin_ia32_fxrstor (__P);
>  }
>
>  #ifdef __x86_64__
> @@ -53,14 +53,14 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _fxsave64 (void *__P)
>  {
> -return __builtin_ia32_fxsave64 (__P);
> +  __builtin_ia32_fxsave64 (__P);
>  }
>
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _fxrstor64 (void *__P)
>  {
> -return __builtin_ia32_fxrstor64 (__P);
> +  __builtin_ia32_fxrstor64 (__P);
>  }
>  #endif
>
> --- gcc/config/i386/xsaveintrin.h.jj2016-01-04 14:55:56.0 +0100
> +++ gcc/config/i386/xsaveintrin.h   2016-08-19 11:23:46.626878520 +0200
> @@ -38,14 +38,14 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xsave (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xsave (__P, __M);
> +  __builtin_ia32_xsave (__P, __M);
>  }
>
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xrstor (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xrstor (__P, __M);
> +  __builtin_ia32_xrstor (__P, __M);
>  }
>
>  #ifdef __x86_64__
> @@ -53,14 +53,14 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xsave64 (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xsave64 (__P, __M);
> +  __builtin_ia32_xsave64 (__P, __M);
>  }
>
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xrstor64 (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xrstor64 (__P, __M);
> +  __builtin_ia32_xrstor64 (__P, __M);
>  }
>  #endif
>
> --- gcc/config/i386/xsaveoptintrin.h.jj 2016-01-04 14:55:55.0 +0100
> +++ gcc/config/i386/xsaveoptintrin.h2016-08-19 11:24:05.664636461 +0200
> @@ -38,7 +38,7 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xsaveopt (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xsaveopt (__P, __M);
> +  __builtin_ia32_xsaveopt (__P, __M);
>  }
>
>  #ifdef __x86_64__
> @@ -46,7 +46,7 @@ extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _xsaveopt64 (void *__P, long long __M)
>  {
> -  return __builtin_ia32_xsaveopt64 (__P, __M);
> +  __builtin_ia32_xsaveopt64 (__P, __M);
>  }
>  #endif
>
> --- gcc/config/i386/pkuintrin.h.jj  2016-01-04 14:55:56.0 +0100
> +++ gcc/config/i386/pkuintrin.h 2016-08-19 11:24:24.274399843 +0200
> @@ -36,16 +36,16 @@
>
>  extern __inline unsigned int
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_rdpkru_u32(void)
> +_rdpkru_u32 (void)
>  {
>return __builtin_ia32_rdpkru ();
>  }
>
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_wrpkru(unsigned int key)
> +_wrpkru (unsigned int key)
>  {
> -  return __builtin_ia32_wrpkru (key);
> +  __builtin_ia32_wrpkru (key);
>  }
>
>  #ifdef __DISABLE_PKU__
>
>
> Jakub


Re: [PR tree-optimization/71691] Fix unswitching in presence of maybe-undefined SSA_NAMEs

2016-08-19 Thread Richard Biener
On Thu, Aug 18, 2016 at 7:29 PM, Jeff Law  wrote:
>
> So to recap, the problem with this BZ is we have a maybe-undefined SSA_NAME
> in the IL.  That maybe-undefined name is used as a condition inside a loop
> and we unswitch that condition.
>
> tree-ssa-loop-unswitch.c already tries to avoid doing that, but uses the
> optimistic ssa_undefined_value_p function.
>
> Essentially ssa_undefined_value_p just checks to see if the SSA_NAME is set
> from a real statement.  It doesn't look at the operands of that statement to
> see if they're undefined, nor does it walk the use-def chains of the RHS
> operands.  In short, it's totally inappropriate for unswitching's needs.
>
> This patch introduces a new class where we can ask if a particular SSA_NAME
> is defined or may be undefined.  Only the latter interface is currently used
> and I wouldn't object if we wanted to avoid the former interface until we
> needed it (it's just a trivial bitmap test, so we're not losing any real
> knowledge of how to implement it).
>
> We walk the CFG in dominator order.  For each block we walk the PHIs and
> mark the LHS as defined IFF all the RHS arguments are defined.  Then we walk
> the statements and mark their LHSs as defined IFF all the RHS arguments are
> defined.
>
> This gives us a conservative solution to the may be undefined question. We
> do not try to keep this information up-to-date as statements or CFG are
> updated -- queries on newly added SSA_NAMEs always return may-be-undefined.
>
> This information is ephemeral and not kept up-to-date.  We perform the
> analysis in the class's constructor and tear down the resulting bitmap in
> the class's destructor.
>
>
> Bootstrapped and regression tested on x86_64-linux-gnu.
>
> Ok for the trunk?

A few comments from reading only parts of tree-ssa-defined-or-undefined.c.

Usually all SSA names will be defined thus it looks more efficient to
record maybe_undefined names (to reduce bitmap cardinality).

Also you are iterating over DOM children, which doesn't reliably visit
the merge block of an if-then-else after both the then and else parts.
I think you want to iterate in RPO order instead, which guarantees that
predecessors have been visited (apart from those reachable over
backedges).

I think you can spare yourself recording anything for virtual SSA names
(just use SSA_OP_DEFS instead of ALL_DEFS and skip virtual PHIs).
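
As a rough illustration of the propagation rule described in the quoted
patch (a name counts as defined only if all of its operands are),
combined with the suggestion above to record the maybe-undefined names
rather than the defined ones, here is a minimal toy sketch.  It is not
GCC code: struct toy_def, NO_OPERAND and compute_maybe_undefined are
hypothetical names, and the array order merely stands in for an RPO walk
over an acyclic region.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IL, not a GCC data structure: each SSA name is
   defined by one statement with at most two operands.  */
enum { NO_OPERAND = -1 };

struct toy_def
{
  bool has_real_def;  /* false for an uninitialized default definition */
  int op[2];          /* indices of operand names, or NO_OPERAND */
};

/* Fill MAYBE_UNDEF[i] for N names.  DEFS must be ordered so that every
   operand index is smaller than the index of the name using it; this
   ordering is an assumption standing in for an RPO walk without
   backedges.  */
static void
compute_maybe_undefined (const struct toy_def *defs, size_t n,
                         bool *maybe_undef)
{
  for (size_t i = 0; i < n; i++)
    {
      /* A name without a real definition is maybe-undefined.  */
      bool undef = !defs[i].has_real_def;

      /* Otherwise it inherits maybe-undefinedness from any operand.  */
      for (int k = 0; k < 2 && !undef; k++)
        {
          int op = defs[i].op[k];
          if (op == NO_OPERAND)
            continue;
          assert (op >= 0 && (size_t) op < i);
          undef = maybe_undef[op];
        }
      maybe_undef[i] = undef;
    }
}
```

With four names where name 0 has no real definition, name 2 uses names
0 and 1, and name 3 uses only name 1, the sketch marks names 0 and 2 as
maybe-undefined and leaves names 1 and 3 alone.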

Note that unswitching is not really interested in definedness -- you are
treating parameters as "defined", but inlining may expose that they are not.
Thus you are _not_ computing a conservative "defined" either.

Unswitching is interested in
defined-or-used-before-the-point-I-want-to-insert-a-use.

This means that what you can pre-compute for the whole function is the
dominance frontier that specifies either the must-def or all (possibly
unused) must-uses (uses in PHIs are not uses, obviously).

It might actually be cheaper to simply iterate over all immediate uses
of the COND ops unswitching wants to insert and perform dominator checks
(plus of course do the check for a param default def and the name's
def).  Like add a

dominated_by_definition_or_use_p (basic_block bb, tree name)

helper for this.

Thanks,
Richard.

>
>
> Jeff
>
>
> PR tree-optimization/71691
> * Makefile.in (OBJS): Add tree-ssa-defined-or-undefined.
> * tree-ssa-defined-or-undefined.h: New file.
> * tree-ssa-defined-or-undefined.c: New file.
> * tree-ssa-loop-unswitch.c: Include tree-ssa-defined-or-undefined.h
> (tree_ssa_unswitch_loops): Create a defined_or_undefined class
> instance
> and pass it to tree_unswitch_single_loop.
> (tree_unswitch_single_loop): Pass instance down through recursive
> calls
> and into tree_may_unswitch_on.
> (tree_may_unswitch_on): Use defined_or_undefined instance rather
> than
> ssa_undefined_value_p.
>
> PR tree-optimization/71691
> * gcc.c-torture/execute/pr71691.c: New test.
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 7a0160f..9e881b8 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1491,6 +1491,7 @@ OBJS = \
> tree-ssa-coalesce.o \
> tree-ssa-copy.o \
> tree-ssa-dce.o \
> +   tree-ssa-defined-or-undefined.o \
> tree-ssa-dom.o \
> tree-ssa-dse.o \
> tree-ssa-forwprop.o \
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr71691.c
> b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
> new file mode 100644
> index 000..2c5dbb6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr71691.c
> @@ -0,0 +1,45 @@
> +char b;
> +short f;
> +unsigned e;
> +int g = 20;
> +
> +void
> +foo ()
> +{
> +  int l, h;
> +  for (l = 0; l <= 7; l++)
> +{
> +  int j = 38;
> +  if (g)
> +   h = 0;
> +  for (; h <= 7; h++)
> +   {
> + int i, k = b % (j % 4);
> + g = f;
> + for (;;)
> +   {
> + j = 6 || b;
> + if (e)
> + 

[PATCH] Uglify inline argument and local var names in x86 intrinsics

2016-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2016 at 11:48:16AM +0200, Uros Bizjak wrote:
> > * config/i386/pkuintrin.h (_wrpkru): Likewise.  Add space after
> > function name.
> > (_rdpkru_u32): Add space after function name.
> 
> OK as obvious patch.

Thanks.  When changing this part, I've noticed the argument name key.
While most of the inline function arguments and local variables are properly
uglified (using __* names), some of them aren't, so if one e.g. does
#define key ({ ... })
#include 
or similar, it will fail to compile.
E.g. the glibc and libstdc++ headers try hard to uglify everything that
isn't part of the namespace reserved for what the header provides or for
the implementation, and IMNSHO the intrinsics headers should do the same.
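
The collision described above can be sketched in a few lines.  This is
only an illustration under stated assumptions: toy_wrpkru is a
hypothetical stand-in for an intrinsic wrapper, not the real _wrpkru,
and the __key spelling is used here solely to mimic what an
implementation header does (ordinary user code should not declare names
in the reserved namespace).

```c
#include <assert.h>

/* A user macro that is already defined when the "header" is read.  */
#define key (40 + 2)

/* Hypothetical stand-in for an intrinsic wrapper.  Had the parameter
   been spelled `key', the macro above would expand inside the parameter
   list and the definition would fail to compile.  The uglified __key is
   unaffected by the user macro.  */
static inline unsigned int
toy_wrpkru (unsigned int __key)
{
  return __key;
}

/* User code can still pass its own `key' macro as the argument.  */
static unsigned int
call_with_user_macro (void)
{
  return toy_wrpkru (key);
}

static void
check_macro_collision (void)
{
  assert (call_with_user_macro () == 42);
}
```

Renaming the parameter back to key in this sketch is a quick way to see
the kind of error a non-uglified header would produce.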

I've used -g -dA -fno-merge-debug-strings -S -fkeep-inline-functions 
-D__always_inline__= -D__artificial__=
to compile sse-13.c (in addition to its dg-options) and then
grep -A3 'DW_TAG_formal_parameter\|DW_TAG_variable' sse-13.s
and searched in there for "[^_  ] regexp.

Ok for trunk if testing succeeds?

2016-08-19  Jakub Jelinek  

* config/i386/rdseedintrin.h (_rdseed16_step, _rdseed32_step,
_rdseed64_step): Uglify argument names and/or local variable names
in inline functions.
* config/i386/rtmintrin.h (_xabort): Likewise.
* config/i386/avx512vlintrin.h (_mm256_ternarylogic_epi64,
_mm256_mask_ternarylogic_epi64, _mm256_maskz_ternarylogic_epi64,
_mm256_ternarylogic_epi32, _mm256_mask_ternarylogic_epi32,
_mm256_maskz_ternarylogic_epi32, _mm_ternarylogic_epi64,
_mm_mask_ternarylogic_epi64, _mm_maskz_ternarylogic_epi64,
_mm_ternarylogic_epi32, _mm_mask_ternarylogic_epi32,
_mm_maskz_ternarylogic_epi32): Likewise.
* config/i386/lwpintrin.h (__llwpcb, __lwpval32, __lwpval64,
__lwpins32, __lwpins64): Likewise.
* config/i386/avx2intrin.h (_mm_i32gather_pd, _mm_mask_i32gather_pd,
_mm256_i32gather_pd, _mm256_mask_i32gather_pd, _mm_i64gather_pd,
_mm_mask_i64gather_pd, _mm256_i64gather_pd, _mm256_mask_i64gather_pd,
_mm_i32gather_ps, _mm_mask_i32gather_ps, _mm256_i32gather_ps,
_mm256_mask_i32gather_ps, _mm_i64gather_ps, _mm_mask_i64gather_ps,
_mm256_i64gather_ps, _mm256_mask_i64gather_ps, _mm_i32gather_epi64,
_mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
_mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
_mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
_mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
_mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
_mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
_mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
_mm256_mask_i64gather_epi32): Likewise.
* config/i386/pmm_malloc.h (_mm_malloc, _mm_free): Likewise.
* config/i386/ia32intrin.h (__writeeflags): Likewise.
* config/i386/pkuintrin.h (_wrpkru): Likewise.
* config/i386/avx512pfintrin.h (_mm512_mask_prefetch_i32gather_pd,
_mm512_mask_prefetch_i32gather_ps, _mm512_mask_prefetch_i64gather_pd,
_mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
_mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
_mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
_mm512_mask_prefetch_i64scatter_ps): Likewise.
* config/i386/gmm_malloc.h (_mm_malloc, _mm_free): Likewise.
* config/i386/avx512fintrin.h (_mm512_ternarylogic_epi64,
_mm512_mask_ternarylogic_epi64, _mm512_maskz_ternarylogic_epi64,
_mm512_ternarylogic_epi32, _mm512_mask_ternarylogic_epi32,
_mm512_maskz_ternarylogic_epi32, _mm512_i32gather_ps,
_mm512_mask_i32gather_ps, _mm512_i32gather_pd, _mm512_i64gather_ps,
_mm512_i64gather_pd, _mm512_i32gather_epi32, _mm512_i32gather_epi64,
_mm512_i64gather_epi32, _mm512_i64gather_epi64): Likewise.

--- gcc/config/i386/rdseedintrin.h.jj   2016-01-04 14:55:56.0 +0100
+++ gcc/config/i386/rdseedintrin.h  2016-08-19 11:55:35.603707812 +0200
@@ -37,24 +37,24 @@
 
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_rdseed16_step (unsigned short *p)
+_rdseed16_step (unsigned short *__p)
 {
-return __builtin_ia32_rdseed_hi_step (p);
+return __builtin_ia32_rdseed_hi_step (__p);
 }
 
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_rdseed32_step (unsigned int *p)
+_rdseed32_step (unsigned int *__p)
 {
-return __builtin_ia32_rdseed_si_step (p);
+return __builtin_ia32_rdseed_si_step (__p);
 }
 
 #ifdef __x86_64__
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
-_rdseed64_step (unsigned long long *p)
+_rdseed64_step (unsigned long long *__p)
 {
-return __builtin_ia32_rdseed_di_step (p);
+return __builtin_ia32_rdseed_di_step

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Joseph Myers
On Fri, 19 Aug 2016, Richard Biener wrote:

> Ok if libcpp maintainers do not object.

I consider the libcpp changes something I can self-approve - but, any 
comments?

> Can you quickly verify if LTO works with the new types?  I don't see anything
> that would prevent it but having new global trees and backends initializing 
> them
> might come up with surprises (see tree-streamer.c:preload_common_nodes)

Well, the execution tests are in gcc.dg/torture, which is run with various 
options including -flto (and I've checked the testsuite logs to confirm 
these tests are indeed run with such options).  Is there something else 
you think should be tested?

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Uglify inline argument and local var names in x86 intrinsics

2016-08-19 Thread Uros Bizjak
On Fri, Aug 19, 2016 at 12:56 PM, Jakub Jelinek  wrote:
> On Fri, Aug 19, 2016 at 11:48:16AM +0200, Uros Bizjak wrote:
>> > * config/i386/pkuintrin.h (_wrpkru): Likewise.  Add space after
>> > function name.
>> > (_rdpkru_u32): Add space after function name.
>>
>> OK as obvious patch.
>
> Thanks.  When changing this part, I've noticed the argument name key.
> While most of the inline function arguments and local variables are properly
> uglified (using __* names), some of them aren't, so if one e.g. does
> #define key ({ ... })
> #include 
> or similar, it will fail to compile.
> E.g. the glibc and libstdc++ headers try hard to uglify everything that
> isn't part of the namespace reserved for what the header provides and
> implementation and IMNSHO so should do the intrinsics.
>
> I've used -g -dA -fno-merge-debug-strings -S -fkeep-inline-functions 
> -D__always_inline__= -D__artificial__=
> to compile sse-13.c (in addition to its dg-options) and then
> grep -A3 'DW_TAG_formal_parameter\|DW_TAG_variable' sse-13.s
> and searched in there for "[^_  ] regexp.
>
> Ok for trunk if testing succeeds?

Yes, also OK.

Thanks,
Uros.

>
> 2016-08-19  Jakub Jelinek  
>
> * config/i386/rdseedintrin.h (_rdseed16_step, _rdseed32_step,
> _rdseed64_step): Uglify argument names and/or local variable names
> in inline functions.
> * config/i386/rtmintrin.h (_xabort): Likewise.
> * config/i386/avx512vlintrin.h (_mm256_ternarylogic_epi64,
> _mm256_mask_ternarylogic_epi64, _mm256_maskz_ternarylogic_epi64,
> _mm256_ternarylogic_epi32, _mm256_mask_ternarylogic_epi32,
> _mm256_maskz_ternarylogic_epi32, _mm_ternarylogic_epi64,
> _mm_mask_ternarylogic_epi64, _mm_maskz_ternarylogic_epi64,
> _mm_ternarylogic_epi32, _mm_mask_ternarylogic_epi32,
> _mm_maskz_ternarylogic_epi32): Likewise.
> * config/i386/lwpintrin.h (__llwpcb, __lwpval32, __lwpval64,
> __lwpins32, __lwpins64): Likewise.
> * config/i386/avx2intrin.h (_mm_i32gather_pd, _mm_mask_i32gather_pd,
> _mm256_i32gather_pd, _mm256_mask_i32gather_pd, _mm_i64gather_pd,
> _mm_mask_i64gather_pd, _mm256_i64gather_pd, _mm256_mask_i64gather_pd,
> _mm_i32gather_ps, _mm_mask_i32gather_ps, _mm256_i32gather_ps,
> _mm256_mask_i32gather_ps, _mm_i64gather_ps, _mm_mask_i64gather_ps,
> _mm256_i64gather_ps, _mm256_mask_i64gather_ps, _mm_i32gather_epi64,
> _mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
> _mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
> _mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
> _mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
> _mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
> _mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
> _mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
> _mm256_mask_i64gather_epi32): Likewise.
> * config/i386/pmm_malloc.h (_mm_malloc, _mm_free): Likewise.
> * config/i386/ia32intrin.h (__writeeflags): Likewise.
> * config/i386/pkuintrin.h (_wrpkru): Likewise.
> * config/i386/avx512pfintrin.h (_mm512_mask_prefetch_i32gather_pd,
> _mm512_mask_prefetch_i32gather_ps, _mm512_mask_prefetch_i64gather_pd,
> _mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
> _mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
> _mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
> _mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
> _mm512_mask_prefetch_i64scatter_ps): Likewise.
> * config/i386/gmm_malloc.h (_mm_malloc, _mm_free): Likewise.
> * config/i386/avx512fintrin.h (_mm512_ternarylogic_epi64,
> _mm512_mask_ternarylogic_epi64, _mm512_maskz_ternarylogic_epi64,
> _mm512_ternarylogic_epi32, _mm512_mask_ternarylogic_epi32,
> _mm512_maskz_ternarylogic_epi32, _mm512_i32gather_ps,
> _mm512_mask_i32gather_ps, _mm512_i32gather_pd, _mm512_i64gather_ps,
> _mm512_i64gather_pd, _mm512_i32gather_epi32, _mm512_i32gather_epi64,
> _mm512_i64gather_epi32, _mm512_i64gather_epi64): Likewise.
>
> --- gcc/config/i386/rdseedintrin.h.jj   2016-01-04 14:55:56.0 +0100
> +++ gcc/config/i386/rdseedintrin.h  2016-08-19 11:55:35.603707812 +0200
> @@ -37,24 +37,24 @@
>
>  extern __inline int
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_rdseed16_step (unsigned short *p)
> +_rdseed16_step (unsigned short *__p)
>  {
> -return __builtin_ia32_rdseed_hi_step (p);
> +return __builtin_ia32_rdseed_hi_step (__p);
>  }
>
>  extern __inline int
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
> -_rdseed32_step (unsigned int *p)
> +_rdseed32_step (unsigned int *__p)
>  {
> -return __builtin_ia32_rdseed_si_step (p);
> +return __builtin_ia32_rdseed_si_step (__p);
>  }
>
>  #ifdef __x86_64__
>

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Richard Biener
On Fri, Aug 19, 2016 at 1:05 PM, Joseph Myers  wrote:
> On Fri, 19 Aug 2016, Richard Biener wrote:
>
>> Ok if libcpp maintainers do not object.
>
> I consider the libcpp changes something I can self-approve - but, any
> comments?

No comments on that part.

>> Can you quickly verify if LTO works with the new types?  I don't see anything
>> that would prevent it but having new global trees and backends initializing 
>> them
>> might come up with surprises (see tree-streamer.c:preload_common_nodes)
>
> Well, the execution tests are in gcc.dg/torture, which is run with various
> options including -flto (and I've checked the testsuite logs to confirm
> these tests are indeed run with such options).  Is there something else
> you think should be tested?

No, I think that's enough.

Thanks,
Richard.

>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Marek Polacek
On Thu, Aug 18, 2016 at 07:31:12PM +0100, Manuel López-Ibáñez wrote:
> Does this warning make sense if !(lang_GNU_C() || lang_GNU_CXX()) ?

I don't think so, it's meant for C/C++ only.  I added a better check.
Thanks,

Marek


Re: [RFC][IPA-VRP] Early VRP Implementation

2016-08-19 Thread Richard Biener
On Tue, Aug 16, 2016 at 9:45 AM, kugan
 wrote:
> Hi Richard,
>
> On 12/08/16 20:43, Richard Biener wrote:
>>
>> On Wed, Aug 3, 2016 at 3:17 AM, kugan 
>> wrote:
>
>
> [SNIP]
>
>>
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index 8a292ed..7028cd4 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -2482,6 +2482,10 @@ ftree-vrp
>>  Common Report Var(flag_tree_vrp) Init(0) Optimization
>>  Perform Value Range Propagation on trees.
>>
>> +fdisable-tree-evrp
>> +Common Report Var(flag_disable_early_vrp) Init(0) Optimization
>> +Disable Early Value Range Propagation on trees.
>> +
>>
>> no please, this is automatically supported via -fdisable-
>
>
> I am now having -ftree-evrp which is enabled all the time. But this will
> only be used for disabling the early-vrp. That is, early-vrp will be run
> when ftree-vrp is enabled and ftree-evrp is not explicitly disabled. Is this
> OK?

Why would one want to disable early-vrp?  I see you do this in the testsuite
for non-early VRP unit-tests but using -fdisable-tree-evrp1 there
would be ok as well.

>>
>> @@ -1728,11 +1736,12 @@ extract_range_from_assert (value_range *vr_p, tree
>> expr)
>>  always false.  */
>>
>>  static void
>> -extract_range_from_ssa_name (value_range *vr, tree var)
>> +extract_range_from_ssa_name (value_range *vr, bool dom_p, tree var)
>>  {
>>value_range *var_vr = get_value_range (var);
>>
>> -  if (var_vr->type != VR_VARYING)
>> +  if (var_vr->type != VR_VARYING
>> +  && (!dom_p || var_vr->type != VR_UNDEFINED))
>>  copy_value_range (vr, var_vr);
>>else
>>  set_value_range (vr, VR_RANGE, var, var, NULL);
>>
>> why do you need these changes?  I think I already told you you need to
>> initialize the lattice to something other than VR_UNDEFINED and that you can't
>> fully re-use update_value_range.  If you don't want to do that then
>> instead
>> of doing changes all over the place do it in get_value_range and have a
>> global flag.
>
>
> I have now added a global early_vrp_p and use this to initialize
> VR_INITIALIZER and get_value_range default to VR_VARYING.

ICK.  Ok, I see that this works, but it is quite ugly, so (see below)

>>
>>
>> @@ -3594,7 +3643,8 @@ extract_range_from_cond_expr (value_range *vr,
>> gassign *stmt)
>> on the range of its operand and the expression code.  */
>>
>>  static void
>> -extract_range_from_comparison (value_range *vr, enum tree_code code,
>> +extract_range_from_comparison (value_range *vr,
>> +  enum tree_code code,
>>tree type, tree op0, tree op1)
>>  {
>>bool sop = false;
>>
>> remove these kind of no-op changes.
>
>
> Done.
>
>
>>
>> +/* Initialize local data structures for VRP.  If DOM_P is true,
>> +   we will be calling this from early_vrp where value range propagation
>> +   is done by visiting stmts in dominator tree.  ssa_propagate engine
>> +   is not used in this case and that part of the initialization will
>> +   be skipped.  */
>>
>>  static void
>> -vrp_initialize (void)
>> +vrp_initialize (bool dom_p)
>>  {
>>basic_block bb;
>>
>> @@ -6949,6 +7010,9 @@ vrp_initialize (void)
>>vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names);
>>bitmap_obstack_initialize (&vrp_equiv_obstack);
>>
>> +  if (dom_p)
>> +return;
>> +
>>
>> split the function instead.
>>
>> @@ -7926,7 +7992,8 @@ vrp_visit_switch_stmt (gswitch *stmt, edge
>> *taken_edge_p)
>> If STMT produces a varying value, return SSA_PROP_VARYING.  */
>>
>>  static enum ssa_prop_result
>> -vrp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
>> +vrp_visit_stmt_worker (gimple *stmt, bool dom_p,  edge *taken_edge_p,
>> +  tree *output_p)
>>  {
>>tree def;
>>ssa_op_iter iter;
>> @@ -7940,7 +8007,7 @@ vrp_visit_stmt (gimple *stmt, edge
>> *taken_edge_p, tree *output_p)
>>if (!stmt_interesting_for_vrp (stmt))
>>  gcc_assert (stmt_ends_bb_p (stmt));
>>else if (is_gimple_assign (stmt) || is_gimple_call (stmt))
>> -return vrp_visit_assignment_or_call (stmt, output_p);
>> +return vrp_visit_assignment_or_call (stmt, dom_p, output_p);
>>else if (gimple_code (stmt) == GIMPLE_COND)
>>  return vrp_visit_cond_stmt (as_a <gcond *> (stmt), taken_edge_p);
>>else if (gimple_code (stmt) == GIMPLE_SWITCH)
>> @@ -7954,6 +8021,12 @@ vrp_visit_stmt (gimple *stmt, edge
>> *taken_edge_p, tree *output_p)
>>return SSA_PROP_VARYING;
>>  }
>>
>> +static enum ssa_prop_result
>> +vrp_visit_stmt (gimple *stmt, edge *taken_edge_p, tree *output_p)
>> +{
>> +  return vrp_visit_stmt_worker (stmt, false, taken_edge_p, output_p);
>> +}
>>
>> as said the refactoring that would be appreciated is to split out the
>> update_value_range calls
>> from the worker functions so you can call the respective functions
>> from the DOM implementations.
>> That they are globbed in vrp_visit_stmt currently is due to the API of
>> the SSA propagator.
>
> Sent this as separate patch for easy reviewing and testing.

Thank

[ARM][PR target/77281] Fix an invalid check for vectors of the same floating-point constants.

2016-08-19 Thread Matthew Wahab

Hello,

Test gcc.c-torture/execute/ieee/pr72824-2.c fails for arm targets
because the code generated to move a vector mixing positive and negative
zeros treats it as a vector of positive zeros.

That is, an assignment x = { 0.f, -0.f, 0.f, -0.f } is treated as the
assignment x = { 0.f, 0.f, 0.f, 0.f }.

This is due to neon_valid_immediate in config/arm/arm.c using real_equal
to compare the vector elements. This patch replaces the check, using
const_vec_duplicate_p instead. It doesn't add a new test because
pr72824-2.c is enough to check the behaviour.
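
The distinction between comparing values and comparing representations
can be shown in plain C.  The sketch below is only an analogy for the
check being replaced, assuming IEEE 754 floats; value_equal, bits_equal
and all_same_bits are hypothetical helpers and do not correspond to
GCC's real_equal or const_vec_duplicate_p implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Value comparison: under IEEE 754, 0.0 and -0.0 compare equal.  */
static bool
value_equal (float a, float b)
{
  return a == b;
}

/* Representation comparison: the sign bit makes the two zeros differ.  */
static bool
bits_equal (float a, float b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

/* Hypothetical helper: true if all N elements of V share one
   representation, i.e. the vector really is a duplicate of V[0].  */
static bool
all_same_bits (const float *v, int n)
{
  for (int i = 1; i < n; i++)
    if (!bits_equal (v[0], v[i]))
      return false;
  return true;
}

static void
check_signed_zeros (void)
{
  const float mixed[4] = { 0.f, -0.f, 0.f, -0.f };
  const float plain[4] = { 0.f, 0.f, 0.f, 0.f };

  /* A value comparison cannot tell the mixed vector's elements apart.  */
  assert (value_equal (mixed[0], mixed[1]));
  /* A representation comparison can.  */
  assert (!all_same_bits (mixed, 4));
  assert (all_same_bits (plain, 4));
}
```

A check built on value equality would accept the mixed vector as a
duplicated constant, which matches the failure described above.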

Tested for arm-none-linux-gnueabihf with native bootstrap and make check
and for arm-none-eabi with cross-compiled check-gcc.

2016-08-19  Matthew Wahab  

PR target/77281
* config/arm/arm.c (neon_valid_immediate): Delete declaration.
Use const_vec_duplicate_p to check for duplicated elements.

Ok for trunk?
Matthew
From 90c1c86b7a3d8bc6ac07363aea5fba8f29ef3e96 Mon Sep 17 00:00:00 2001
From: Matthew Wahab 
Date: Wed, 17 Aug 2016 14:43:48 +0100
Subject: [PATCH] [ARM] Fix a wrong test for vectors of the same constants.

---
 gcc/config/arm/arm.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a6afdcc..c1d010c 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12471,7 +12471,6 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
 {
   rtx el0 = CONST_VECTOR_ELT (op, 0);
-  const REAL_VALUE_TYPE *r0;
 
   if (!vfp3_const_double_rtx (el0) && el0 != CONST0_RTX (GET_MODE (el0)))
 return -1;
@@ -12480,14 +12479,10 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
   if (GET_MODE_INNER (mode) == HFmode)
 	return -1;
 
-  r0 = CONST_DOUBLE_REAL_VALUE (el0);
-
-  for (i = 1; i < n_elts; i++)
-{
-  rtx elt = CONST_VECTOR_ELT (op, i);
-  if (!real_equal (r0, CONST_DOUBLE_REAL_VALUE (elt)))
-return -1;
-}
+  /* All elements in the vector must be the same.  Note that 0.0 and -0.0
+	 are distinct in this context.  */
+  if (!const_vec_duplicate_p (op))
+	return -1;
 
   if (modconst)
 *modconst = CONST_VECTOR_ELT (op, 0);
-- 
2.1.4



Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread Yuri Rumyantsev
Hi,

Here is a simple test case to reproduce the 176.gcc failure (I ran it on
a Haswell machine).
Using the 20160819 compiler build we get:
gcc -O3 -m32 -mavx2 test.c -o test.ref.exe
/users/ysrumyan/isse_6866$ ./test.ref.exe
Aborted (core dumped)

If I apply the patch proposed by Patrick, the test runs properly.
Instead of running the test, we can check the number of jump threads.

2016-08-19 12:25 GMT+03:00 Richard Biener :
> On Fri, Aug 19, 2016 at 1:06 AM, Patrick Palka  wrote:
>> On Thu, 18 Aug 2016, Richard Biener wrote:
>>
>>> On August 18, 2016 8:25:18 PM GMT+02:00, Patrick Palka 
>>>  wrote:
>>> >In comment #5 Yuri reports that r235653 introduces a runtime failure
>>> >for
>>> >176.gcc which I guess is caused by the combining step in
>>> >simplify_control_stmt_condition_1() not behaving properly on operands
>>> >of
>>> >type VECTOR_TYPE.  I'm a bit stumped as to why it mishandles
>>> >VECTOR_TYPEs because the logic should be generic enough to support them
>>> >as well.  But it was confirmed that restricting the combining step to
>>> >operands of scalar type fixes the runtime failure so here is a patch
>>> >that does this.  Does this look OK to commit after bootstrap +
>>> >regtesting on x86_64-pc-linux-gnu?
>>>
>>> Hum, I'd rather understand what is going wrong.  Can you at least isolate a 
>>> testcase?
>>>
>>> Richard.
>>
>> I don't have access to the SPEC benchmarks unfortunately.  Maybe Yuri
>> can isolate a test case?
>>
>> But I think I found a theoretical bug which may or may not coincide with
>> the bug that Yuri is observing.  The part of the combining step that may
>> provide wrong results for VECTOR_TYPEs is the one that simplifies the
>> conditional (A & B) != 0 to true when given that A != 0 and B != 0 and
>> given that their TYPE_PRECISION is 1.
>>
>> The TYPE_PRECISION test was intended to succeed only on scalars, but
>> IIUC it accidentally succeeds on one-dimensional vectors too.  So we may
>> be wrongly simplifying X & Y != <0> to true given that e.g.  X == <8>
>> and Y == <2>.  So this simplification should probably be restricted to
>> integral types like so:
>>
>> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>> index 170e456..b8c8b70 100644
>> --- a/gcc/tree-ssa-threadedge.c
>> +++ b/gcc/tree-ssa-threadedge.c
>> @@ -648,14 +648,17 @@ simplify_control_stmt_condition_1 (edge e,
>>   if (res1 != NULL_TREE && res2 != NULL_TREE)
>> {
>>   if (rhs_code == BIT_AND_EXPR
>> + && INTEGRAL_TYPE_P (TREE_TYPE (op0))
>>   && TYPE_PRECISION (TREE_TYPE (op0)) == 1
>
> you can use element_precision (op0) == 1 instead.
>
> Richard.
>
>>   && integer_nonzerop (res1)
>>   && integer_nonzerop (res2))
>> --
>> 2.9.3.650.g20ba99f
>>
>> Hope this makes sense.
>>
>>>
>>> >gcc/ChangeLog:
>>> >
>>> > PR tree-optimization/71077
>>> > * tree-ssa-threadedge.c (simplify_control_stmt_condition_1):
>>> > Perform the combining step only if the operands have an integral
>>> > or a pointer type.
>>> >---
>>> > gcc/tree-ssa-threadedge.c | 3 +++
>>> > 1 file changed, 3 insertions(+)
>>> >
>>> >diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
>>> >index 170e456..a97c00c 100644
>>> >--- a/gcc/tree-ssa-threadedge.c
>>> >+++ b/gcc/tree-ssa-threadedge.c
>>> >@@ -577,6 +577,9 @@ simplify_control_stmt_condition_1 (edge e,
>>> >   if (handle_dominating_asserts
>>> >   && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
>>> >   && TREE_CODE (op0) == SSA_NAME
>>> >+  /* ??? Vector types are mishandled here.  */
>>> >+  && (INTEGRAL_TYPE_P (TREE_TYPE (op0))
>>> >+  || POINTER_TYPE_P (TREE_TYPE (op0)))
>>> >   && integer_zerop (op1))
>>> > {
>>> >   gimple *def_stmt = SSA_NAME_DEF_STMT (op0);
>>>
>>>
>>>
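
The theoretical problem raised in the quoted discussion, that A != 0 and
B != 0 only imply (A & B) != 0 when the operands have a precision of one
bit, can be checked with scalar values.  This is a minimal sketch with
hypothetical helper names; it says nothing about how the threader
represents vectors, it only shows the arithmetic behind the 8 and 2
example.

```c
#include <assert.h>
#include <stdbool.h>

static bool
both_nonzero (unsigned int a, unsigned int b)
{
  return a != 0 && b != 0;
}

static bool
and_nonzero (unsigned int a, unsigned int b)
{
  return (a & b) != 0;
}

/* Return true if, over all 1-bit values, a != 0 && b != 0 implies
   (a & b) != 0.  */
static bool
implication_holds_for_one_bit (void)
{
  for (unsigned int a = 0; a <= 1; a++)
    for (unsigned int b = 0; b <= 1; b++)
      if (both_nonzero (a, b) && !and_nonzero (a, b))
        return false;
  return true;
}

static void
check_and_simplification (void)
{
  /* With wider values the implication breaks: 8 & 2 == 0.  */
  assert (both_nonzero (8, 2));
  assert (!and_nonzero (8, 2));
  /* With 1-bit values it always holds.  */
  assert (implication_holds_for_one_bit ());
}
```
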
typedef unsigned int ui;
ui x[32*32];
ui y[32];
ui z[32];
void __attribute__ ((noinline, noclone)) foo (ui n, ui z)
{
  ui i, b;
  ui v;
  for (i = 0; i < n; i++)
    {
      v = y[i];
      if (v)
        {
          for (b = 0; b < 32; b++)
            if ((v >> b) & 1)
              x[i*32 + b] = z;
          y[i] = 0;
        }
    }
}

int main ()
{
  int i;
  unsigned int val;
  for (i = 0; i < 32; i++)
    {
      val = 1 << i;
      y[i] = (i & 1) ? 0 : val;
      z[i] = i;
    }
  foo (32, 10);
  for (i = 0; i < 1024; i += 66)
    if (x[i] != 10)
      __builtin_abort ();
  return 0;
}


Re: [Patch] Disable text mode translation in ada for Cygwin

2016-08-19 Thread JonY
On 5/26/2016 20:36, JonY wrote:
> Text mode translation should not be done for Cygwin, especially since it does 
> not
> support unicode setmode calls. This also fixes ada builds for Cygwin.
> 
> OK for trunk?

Ping?






Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Marek Polacek
On Thu, Aug 18, 2016 at 04:01:42PM +0200, Jakub Jelinek wrote:
> On Thu, Aug 18, 2016 at 03:50:07PM +0200, Marek Polacek wrote:
> > +case GIMPLE_BIND:
> > +  {
> > +   gbind *bind = as_a <gbind *> (stmt);
> > +   return last_stmt_in_scope (
> > +gimple_seq_last_stmt (gimple_bind_body (bind)));
> > +  }
> > +
> > +case GIMPLE_TRY:
> > +  {
> > +   gtry *try_stmt = as_a <gtry *> (stmt);
> > +   return last_stmt_in_scope (
> > +gimple_seq_last_stmt (gimple_try_eval (try_stmt)));
> 
> Just a minor formatting detail.
>   stmt = gimple_seq_last_stmt (gimple_try_eval (try_stmt));
>   return last_stmt_in_scope (stmt);
> and similarly above might be nicer.  Or do the tail recursion by hand?
> No need to repost for that.

Fixed as suggested, thanks.  Yes -- trailing ( are super ugly.

Marek


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Marek Polacek
On Fri, Aug 19, 2016 at 01:21:14PM +0200, Marek Polacek wrote:
> On Thu, Aug 18, 2016 at 07:31:12PM +0100, Manuel López-Ibáñez wrote:
> > Does this warning make sense if !(lang_GNU_C() || lang_GNU_CXX()) ?
> 
> I don't think so, it's meant for C/C++ only.  I added a better check.

Well, maybe the warning could also work for ObjC and ObjC++, but since I
haven't included any testcases for these languages so far, maybe better to
restrict it to C and C++ only.

Go switch statements look very similar to C/C++ switches, but they don't
fall through, so the warning would be pointless.  No idea about Fortran and
Ada.

Marek


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Jakub Jelinek
On Fri, Aug 19, 2016 at 02:01:29PM +0200, Marek Polacek wrote:
> On Fri, Aug 19, 2016 at 01:21:14PM +0200, Marek Polacek wrote:
> > On Thu, Aug 18, 2016 at 07:31:12PM +0100, Manuel López-Ibáñez wrote:
> > > Does this warning make sense if !(lang_GNU_C() || lang_GNU_CXX()) ?
> > 
> > I don't think so, it's meant for C/C++ only.  I added a better check.
> 
> Well, maybe the warning could also work for ObjC and ObjC++, but since I
> haven't included any testcases for these languages so far, maybe better to
> restrict it for C and C++ only.

IMHO it should be also on for ObjC and ObjC++, even if there is no test
coverage (though, it would be good to add some eventually).

> Go switch statements look very similar to C/C++ switches, but they don't
> fall through, so the warning would be pointless.  No idea about Fortran and
> Ada.

Go switches don't fall through by default, but there is a fallthrough
keyword for falling through.  And Fortran SELECT CASE doesn't fall through.

Jakub


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Marek Polacek
On Thu, Aug 18, 2016 at 01:07:26PM -0400, David Malcolm wrote:
> On Thu, 2016-08-18 at 15:50 +0200, Marek Polacek wrote:
> > Now that all various switch fallthrough bugfixes and adjustments were
> > committed, and this patch has shrunk considerably, I'm presenting the
> > latest
> > version.  The changes from the last version are not huge; we don't
> > warn for a
> > fall through to a user-defined label anymore, and I made some tiny
> > changes
> > regarding parsing attributes in the C FE, as requested by Joseph.
> > 
> > This patch is accompanied by another patch that merely adds various
> > gcc_fallthroughs, in places where a FALLTHRU comment wouldn't work.
> > 
> > It's been tested on powerpc64le-unknown-linux-gnu, aarch64-linux-gnu,
> > and x86_64-redhat-linux.
> > 
> > 2016-08-18  Marek Polacek  
> > Jakub Jelinek  
> > 
> > PR c/7652
> 
> [...]
>  
> > diff --git gcc/gcc/gimplify.c gcc/gcc/gimplify.c
> > index 1e43dbb..1925263 100644
> > --- gcc/gcc/gimplify.c
> > +++ gcc/gcc/gimplify.c
> 
> [...]
> 
> > +   if (warned_p)
> > + {
> > +   rich_location richloc (line_table, gimple_location (next));
> > +   richloc.add_fixit_insert (gimple_location (next),
> > + "insert '__attribute__ "
> > + "((fallthrough));' to silence "
> > + "this warning");
> > +   richloc.add_fixit_insert (gimple_location (next),
> > + "insert 'break;' to avoid "
> > + "fall-through");
> > +   inform_at_rich_loc (&richloc, "here");
> > + }
> 
> This isn't quite the way that fix-its are meant to be used, in my mind,
> at least: the insertion text is supposed to be something that could be
> literally inserted into the code (e.g. by an IDE) i.e. it should be a
> code fragment, rather than a message to the user.
> 
> Here's an idea of how the above could look:
> 
>   if (warned_p)
>     {
>       /* Suggestion one: add "__attribute__ ((fallthrough));".  */
>       rich_location richloc_attr (line_table, gimple_location (next));
>       richloc_attr.add_fixit_insert (gimple_location (next),
>                                      "__attribute__ ((fallthrough));");
>       inform_at_rich_loc (&richloc_attr, "insert %qs to silence this warning",
>                           "__attribute__ ((fallthrough));");
>
>       /* Suggestion two: add "break;".  */
>       rich_location richloc_break (line_table, gimple_location (next));
>       richloc_break.add_fixit_insert (gimple_location (next),
>                                       "break;");
>       inform_at_rich_loc (&richloc_break, "insert %qs to avoid fall-through",
>                           "break;");
>     }
> 
> 
> (and using %qs for quoting the language elements in the messages
> themselves).
 
Thanks for the clarification, I made it so and the output looks *much* better.

> There doesn't seem to be any test coverage in the patch for the fix-it
> hints.
> 
> The easiest way to do this is to create a test case with:
> 
> /* { dg-options "-fdiagnostics-show-caret" } */
> 
> and then add:
> 
>  /* { dg-begin-multiline-output "" }
> QUOTE OF EXPECTED CARET+FIXIT OUTPUT, omitting any trailing deja-gnu
> directives
>  { dg-end-multiline-output "" } */
> 
> so that we can directly verify that the results look sane.
 
Ok, I added a testcase testing multiline comments.

> BTW, is there some way to break up warn_implicit_fallthrough_r, maybe
> moving the GIMPLE_LABEL handling to a subroutine? (and perhaps the
> suggestion-handling above could live in its own subroutine, etc).

Let me try to split it up a bit.

Thanks,

Marek
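
For readers following the thread, here is a minimal C sketch of the construct
the warning is aimed at.  The names weight and FALLTHROUGH are made up for
illustration and are not part of the patch; the attribute spelling is the one
quoted in the fix-it text, and the __has_attribute guard is only an assumption
to keep the snippet compiling on compilers that lack the attribute.

```c
/* FALLTHROUGH is a hypothetical helper macro, not something the patch adds.  */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define FALLTHROUGH __attribute__ ((fallthrough))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH ((void) 0)
#endif

/* Hypothetical example: level 2 deliberately falls through into level 1.  */
int
weight (int level)
{
  int w = 0;

  switch (level)
    {
    case 2:
      w += 10;
      FALLTHROUGH;	/* Marks the fall through as intentional.  */
    case 1:
      w += 1;
      break;
    default:
      w = -1;
      break;
    }
  return w;
}
```

Without the FALLTHROUGH statement (or a FALLTHRU comment), the path from
case 2 into case 1 is the kind of code the warning is meant to flag; adding
"break;" instead would change the result for level 2.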


Re: [Patch] Disable text mode translation in ada for Cygwin

2016-08-19 Thread Arnaud Charlet
> > Text mode translation should not be done for Cygwin, especially since it
> > does not support unicode setmode calls. This also fixes ada builds for
> > Cygwin.
> > 
> > OK for trunk?
> 
> Ping?

Can you send the link to your original submission for easy retrieval?

Arno


Re: Implement -Wimplicit-fallthrough (take 3)

2016-08-19 Thread Arnaud Charlet
> > > > Does this warning make sense if !(lang_GNU_C() || lang_GNU_CXX()) ?
> > > 
> > > I don't think so, it's meant for C/C++ only.  I added a better check.
> > 
> > Well, maybe the warning could also work for ObjC and ObjC++, but since I
> > haven't included any testcases for these languages so far, maybe better
> > to restrict it for C and C++ only.
> 
> IMHO it should be also on for ObjC and ObjC++, even if there is no test
> coverage (though, it would be good to add some eventually).
> 
> > Go switch statements look very similar to C/C++ switches, but they don't
> > fall through, so the warning would be pointless.  No idea about Fortran and
> > Ada.
> 
> Go doesn't fall through by default, but there is a fallthrough keyword for
> falling through.  And Fortran SELECT CASE doesn't fall through.

Ada never falls through.

Arno


Re: [libcpp] append "evaluates to 0" for Wundef diagnostic

2016-08-19 Thread David Malcolm
On Fri, 2016-08-19 at 14:15 +0530, Prathamesh Kulkarni wrote:
> Hi David,
> This trivial patch appends "evaluates to 0", in Wundef diagnostic,
> similar to clang, which prints the following diagnostic for undefined
> macro:
> undef.c:1:5: warning: 'FOO' is not defined, evaluates to 0 [-Wundef]
> #if FOO
> ^
> Bootstrapped+tested on x86_64-unknown-linux-gnu.
> OK to commit ?

Nice tweak; LGTM.

Thanks
Dave
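
A small C sketch of what "evaluates to 0" means in practice, assuming FOO is
not defined anywhere; the helper name foo_branch_taken is made up for the
example.  Building with -Wundef is what makes the compiler diagnose the #if
line.

```c
/* FOO is deliberately left undefined, so "#if FOO" behaves like "#if 0".  */
#if FOO
# define FOO_BRANCH_TAKEN 1
#else
# define FOO_BRANCH_TAKEN 0
#endif

/* Hypothetical helper reporting which branch the preprocessor kept.  */
int
foo_branch_taken (void)
{
  return FOO_BRANCH_TAKEN;
}
```

The #else branch is the one that survives preprocessing, which is exactly the
silent behaviour the extended diagnostic text spells out.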


Re: [libcpp] append "evaluates to 0" for Wundef diagnostic

2016-08-19 Thread Prathamesh Kulkarni
On 19 August 2016 at 18:29, David Malcolm  wrote:
> On Fri, 2016-08-19 at 14:15 +0530, Prathamesh Kulkarni wrote:
>> Hi David,
>> This trivial patch appends "evaluates to 0", in Wundef diagnostic,
>> similar to clang, which prints the following diagnostic for undefined
>> macro:
>> undef.c:1:5: warning: 'FOO' is not defined, evaluates to 0 [-Wundef]
>> #if FOO
>> ^
>> Bootstrapped+tested on x86_64-unknown-linux-gnu.
>> OK to commit ?
>
> Nice tweak; LGTM.
Thanks, committed as r239609.

Regards,
Prathamesh
>
> Thanks
> Dave


Re: [ARM][PR target/77281] Fix an invalid check for vectors of the same floating-point constants.

2016-08-19 Thread Richard Earnshaw (lists)
On 19/08/16 12:48, Matthew Wahab wrote:
> Hello,
> 
> Test gcc.c-torture/execute/ieee/pr72824-2.c fails for arm targets
> because the code generated to move a vector of signed and unsigned zeros
> treats it as a vector of unsigned zeros.
> 
> That is, an assignment x = { 0.f, -0.f, 0.f, -0.f } is treated as the
> assignment x = { 0.f, 0.f, 0.f, 0.f }.
> 
> This is due to config/arm/arm.c/neon_valid_immediate using real_equal to
> compare the vector elements. This patch replaces the check, using
> const_vec_duplicate_p instead. It doesn't add a new test because
> pr72824-2.c is enough to check the behaviour.
> 
> Tested for arm-none-linux-gnueabihf with native bootstrap and make check
> and for arm-none-eabi with cross-compiled check-gcc.
> 
> 2016-08-19  Matthew Wahab  
> 
> PR target/77281
> * config/arm/arm.c (neon_valid_immediate): Delete declaration.
> Use const_vec_duplicate to check for duplicated elements.
> 
> Ok for trunk?

OK.

Thanks.

R.

> Matthew
> 
> 0001-ARM-Fix-a-wrong-test-for-vectors-of-the-same-constan.patch
> 
> 
> From 90c1c86b7a3d8bc6ac07363aea5fba8f29ef3e96 Mon Sep 17 00:00:00 2001
> From: Matthew Wahab 
> Date: Wed, 17 Aug 2016 14:43:48 +0100
> Subject: [PATCH] [ARM] Fix a wrong test for vectors of the same constants.
> 
> ---
>  gcc/config/arm/arm.c | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index a6afdcc..c1d010c 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -12471,7 +12471,6 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
>if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT)
>  {
>rtx el0 = CONST_VECTOR_ELT (op, 0);
> -  const REAL_VALUE_TYPE *r0;
>  
>if (!vfp3_const_double_rtx (el0) && el0 != CONST0_RTX (GET_MODE (el0)))
>  return -1;
> @@ -12480,14 +12479,10 @@ neon_valid_immediate (rtx op, machine_mode mode, int inverse,
>if (GET_MODE_INNER (mode) == HFmode)
>   return -1;
>  
> -  r0 = CONST_DOUBLE_REAL_VALUE (el0);
> -
> -  for (i = 1; i < n_elts; i++)
> -{
> -  rtx elt = CONST_VECTOR_ELT (op, i);
> -  if (!real_equal (r0, CONST_DOUBLE_REAL_VALUE (elt)))
> -return -1;
> -}
> +  /* All elements in the vector must be the same.  Note that 0.0 and -0.0
> +  are distinct in this context.  */
> +  if (!const_vec_duplicate_p (op))
> + return -1;
>  
>if (modconst)
>  *modconst = CONST_VECTOR_ELT (op, 0);
> 
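
The distinction the patch relies on can be shown in plain C.  This is only an
analogy, not the GCC code: it assumes float is a 32-bit IEEE 754 type, and the
helper names same_value and same_bits are invented for the example.  A value
comparison treats 0.0f and -0.0f as equal, which is consistent with the
behaviour described for the old element-by-element check, while a comparison
of the representations tells them apart.

```c
#include <stdint.h>
#include <string.h>

/* Value comparison: under IEEE 754, 0.0f == -0.0f is true.  */
int
same_value (float a, float b)
{
  return a == b;
}

/* Representation comparison: the sign bit makes 0.0f and -0.0f differ.
   Assumes float occupies 32 bits.  */
int
same_bits (float a, float b)
{
  uint32_t x, y;

  memcpy (&x, &a, sizeof x);
  memcpy (&y, &b, sizeof y);
  return x == y;
}
```

For the vector { 0.f, -0.f, 0.f, -0.f } from the report, same_value says every
element matches the first one, whereas same_bits does not.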



Re: [ARM][PR target/77281] Fix an invalid check for vectors of the same floating-point constants.

2016-08-19 Thread Matthew Wahab

On 19/08/16 14:30, Richard Earnshaw (lists) wrote:
> On 19/08/16 12:48, Matthew Wahab wrote:
>> 2016-08-19  Matthew Wahab  
>>
>>  PR target/77281
>>  * config/arm/arm.c (neon_valid_immediate): Delete declaration.
>>  Use const_vec_duplicate to check for duplicated elements.
>>
>> Ok for trunk?
>
> OK.
>
> Thanks.
>
> R.

Is this ok to backport to gcc-6?
Matthew


Re: Repeated use of the OpenACC routine directive

2016-08-19 Thread Cesar Philippidis
On 08/16/2016 06:05 PM, Thomas Schwinge wrote:
> On Mon, 01 Aug 2016 17:51:24 +0200, I wrote:
>> > We found that it's not correct that we currently unconditionally diagnose
>> > an error for repeated use of the OpenACC routine directive on one
>> > function/declaration.  (For reference, it is also permissible for an
>> > "ordinary" function to have several declarations plus a definition, as
>> > long as these are compatible.)  This is, the following shall be valid:
>> > 
>> > #pragma acc routine worker
>> > void f(void)
>> > {
>> > }
>> > #pragma acc routine (f) worker
>> > #pragma acc routine worker
>> > extern void f(void);
>> > 
>> > [...]

Because of the different scoping rules, I don't think it makes sense to
allow users to repeat acc routine directives in Fortran. Consequently, I
couldn't create a 1:1 translation of your C tests to Fortran. However, after
inspecting your patch I did find some areas where Fortran had
insufficient test coverage, specifically with routines inside interface
blocks and the warning for unutilized parallelism.

I've applied this patch, which addresses the acc routine test coverage
deficiencies in Fortran, to gomp-4_0-branch.

Cesar
2016-08-19  Cesar Philippidis  

	gcc/testsuite/
	* gfortran.dg/goacc/routine-8.f90: New test.
	* gfortran.dg/goacc/routine-level-of-parallelism-1.f90: New test.


diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-8.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-8.f90
new file mode 100644
index 000..c903915
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-8.f90
@@ -0,0 +1,32 @@
+! Test ACC ROUTINE inside an interface block.
+
+program main
+  interface
+ function s_1 (a)
+   integer a
+   !$acc routine
+ end function s_1
+  end interface
+
+  interface
+ function s_2 (a)
+   integer a
+   !$acc routine seq
+ end function s_2
+  end interface
+
+  interface
+ function s_3 (a)
+   integer a
+   !$acc routine (s_3) ! { dg-error "Only the ..ACC ROUTINE form without list is allowed in interface block" }
+ end function s_3
+  end interface
+
+  interface
+ function s_4 (a)
+   integer a
+ !$acc routine (s_4) seq ! { dg-error "Only the ..ACC ROUTINE form without list is allowed in interface block" }
+ end function s_4
+  end interface
+end program main
+
diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90
new file mode 100644
index 000..364a058
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-level-of-parallelism-1.f90
@@ -0,0 +1,74 @@
+! Test various aspects of clauses specifying compatible levels of
+! parallelism with the OpenACC routine directive.  The Fortran counterpart is
+! c-c++-common/goacc/routine-level-of-parallelism-2.c
+
+subroutine g_1 ! { dg-warning "region is gang partitioned but does not contain gang partitioned code" }
+  !$acc routine gang
+! { dg-bogus "region is worker partitioned but does not contain worker partitioned code" "worker partitioned" { xfail *-*-* } 5 }
+! { dg-bogus "region is vector partitioned but does not contain vector partitioned code" "worker partitioned" { xfail *-*-* } 5 }
+end subroutine g_1
+
+subroutine s_1_2a
+  !$acc routine
+end subroutine s_1_2a
+
+subroutine s_1_2b
+  !$acc routine seq
+end subroutine s_1_2b
+
+subroutine s_1_2c
+  !$acc routine (s_1_2c)
+end subroutine s_1_2c
+
+subroutine s_1_2d
+  !$acc routine (s_1_2d) seq
+end subroutine s_1_2d
+
+module s_2
+contains
+  subroutine s_2_1a
+!$acc routine
+  end subroutine s_2_1a
+
+  subroutine s_2_1b
+!$acc routine seq
+  end subroutine s_2_1b
+
+  subroutine s_2_1c
+!$acc routine (s_2_1c)
+  end subroutine s_2_1c
+
+  subroutine s_2_1d
+!$acc routine (s_2_1d) seq
+  end subroutine s_2_1d
+end module s_2
+
+subroutine test
+  external g_1, w_1, v_1
+  external s_1_1, s_1_2
+
+  interface
+ function s_3_1a (a)
+   integer a
+   !$acc routine
+ end function s_3_1a
+  end interface
+
+  interface
+ function s_3_1b (a)
+   integer a
+   !$acc routine seq
+ end function s_3_1b
+  end interface
+
+  !$acc routine(g_1) gang
+
+  !$acc routine(w_1) worker
+
+  !$acc routine(v_1) worker
+
+  ! Also test the implicit seq clause.
+
+  !$acc routine (s_1_1) seq
+
+end subroutine test


Re: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

2016-08-19 Thread H.J. Lu
On Fri, Aug 19, 2016 at 2:21 AM, Richard Biener
 wrote:
> On Thu, Aug 18, 2016 at 5:16 PM, H.J. Lu  wrote:
>> On Thu, Aug 18, 2016 at 1:18 AM, Richard Biener
>>  wrote:
>>> On Wed, Aug 17, 2016 at 10:11 PM, H.J. Lu  wrote:
>>>> builtin_memset_gen_str returns a register used for memset, which only
>>>> supports integer registers.  But a target may use vector registers in
>>>> memset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
>>>> QImode value to mode derived from STORE_MAX_PIECES, which can be used
>>>> with vector instructions.  The default hook is the same as the original
>>>> builtin_memset_gen_str.  A target can override it to support vector
>>>> instructions for STORE_MAX_PIECES.
>>>>
>>>> Tested on x86-64 and i686.  Any comments?
>>>>
>>>> H.J.
>>>> ---
>>>> gcc/
>>>>
>>>> * builtins.c (builtin_memset_gen_str): Call
>>>> targetm.gen_memset_value.
>>>> (default_gen_memset_value): New function.
>>>> * target.def (gen_memset_value): New hook.
>>>> * targhooks.c: Include "expmed.h" and "builtins.h".
>>>> (default_gen_memset_value): New function.
>>>
>>> I see default_gen_memset_value in builtins.c but it belongs here.
>>>
>>>> * targhooks.h (default_gen_memset_value): New prototype.
>>>> * config/i386/i386.c (ix86_gen_memset_value): New function.
>>>> (TARGET_GEN_MEMSET_VALUE): New.
>>>> * config/i386/i386.h (STORE_MAX_PIECES): Likewise.
>>>> * doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
>>>> * doc/tm.texi: Updated.

>>
>> Like this?
>
> Aww, ok - I see builtins.c is a better place - sorry for the extra work.
>
> Richard.

Here is the original patch with the updated ChangeLog.  OK for trunk?


-- 
H.J.
From 21720a63c20bdfae89e08d2a4b1ea73a87c84caf Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 6 Mar 2016 06:38:21 -0800
Subject: [PATCH] Add a TARGET_GEN_MEMSET_VALUE hook

builtin_memset_gen_str returns a register used for memset, which only
supports integer registers.  But a target may use vector registers in
memset.  This patch adds a TARGET_GEN_MEMSET_VALUE hook to duplicate
QImode value to mode derived from STORE_MAX_PIECES, which can be used
with vector instructions.  The default hook is the same as the original
builtin_memset_gen_str.  A target can override it to support vector
instructions for STORE_MAX_PIECES.
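
As an illustration of what "duplicate a QImode value to a wider mode" means,
here is a minimal C sketch for the 64-bit integer case.  It is not the hook's
implementation: splat_byte is a hypothetical helper, and the multiply trick is
just one common way to replicate a byte across a word.

```c
#include <stdint.h>

/* Hypothetical helper: replicate one byte into every byte of a 64-bit
   word.  Multiplying by 0x0101010101010101 copies C into each byte lane.  */
uint64_t
splat_byte (unsigned char c)
{
  return (uint64_t) c * UINT64_C (0x0101010101010101);
}
```

A memset of 0xAB can then store the replicated word 0xABABABABABABABAB eight
bytes at a time; a vector variant would broadcast the same byte across a
wider register instead.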

gcc/

	* builtins.c (builtin_memset_gen_str): Call targetm.gen_memset_value.
	(default_gen_memset_value): New function.
	* target.def (gen_memset_value): New hook.
	* targhooks.h (default_gen_memset_value): New prototype.
	* config/i386/i386.c (ix86_gen_memset_value): New function.
	(TARGET_GEN_MEMSET_VALUE): New.
	* config/i386/i386.h (STORE_MAX_PIECES): Likewise.
	* doc/tm.texi.in: Add TARGET_GEN_MEMSET_VALUE hook.
	* doc/tm.texi: Updated.

gcc/testsuite/

	* gcc.target/i386/pieces-memset-1.c: New test.
	* gcc.target/i386/pieces-memset-2.c: Likewise.
	* gcc.target/i386/pieces-memset-3.c: Likewise.
	* gcc.target/i386/pieces-memset-4.c: Likewise.
	* gcc.target/i386/pieces-memset-5.c: Likewise.
	* gcc.target/i386/pieces-memset-6.c: Likewise.
	* gcc.target/i386/pieces-memset-7.c: Likewise.
	* gcc.target/i386/pieces-memset-8.c: Likewise.
	* gcc.target/i386/pieces-memset-9.c: Likewise.
	* gcc.target/i386/pieces-memset-10.c: Likewise.
	* gcc.target/i386/pieces-memset-11.c: Likewise.
	* gcc.target/i386/pieces-memset-12.c: Likewise.
	* gcc.target/i386/pieces-memset-13.c: Likewise.
	* gcc.target/i386/pieces-memset-14.c: Likewise.
	* gcc.target/i386/pieces-memset-15.c: Likewise.
	* gcc.target/i386/pieces-memset-16.c: Likewise.
	* gcc.target/i386/pieces-memset-17.c: Likewise.
	* gcc.target/i386/pieces-memset-18.c: Likewise.
	* gcc.target/i386/pieces-memset-19.c: Likewise.
	* gcc.target/i386/pieces-memset-20.c: Likewise.
	* gcc.target/i386/pieces-memset-21.c: Likewise.
	* gcc.target/i386/pieces-memset-22.c: Likewise.
	* gcc.target/i386/pieces-memset-23.c: Likewise.
	* gcc.target/i386/pieces-memset-24.c: Likewise.
	* gcc.target/i386/pieces-memset-25.c: Likewise.
	* gcc.target/i386/pieces-memset-26.c: Likewise.
	* gcc.target/i386/pieces-memset-27.c: Likewise.
	* gcc.target/i386/pieces-memset-28.c: Likewise.
	* gcc.target/i386/pieces-memset-29.c: Likewise.
	* gcc.target/i386/pieces-memset-30.c: Likewise.
	* gcc.target/i386/pieces-memset-31.c: Likewise.
	* gcc.target/i386/pieces-memset-32.c: Likewise.
	* gcc.target/i386/pieces-memset-33.c: Likewise.
	* gcc.target/i386/pieces-memset-34.c: Likewise.
	* gcc.target/i386/pieces-memset-35.c: Likewise.
	* gcc.target/i386/pieces-memset-36.c: Likewise.
	* gcc.target/i386/pieces-memset-37.c: Likewise.
	* gcc.target/i386/pieces-memset-38.c: Likewise.
	* gcc.target/i386/pieces-memset-39.c: Likewise.
	* gcc.target/i386/pieces-memset-40.c: Likewise.
	* gcc.target/i386/pieces-memset-41.c: Likewise.
	* gcc.target/i386/pieces-memset-42.c: Likewise.
	* gcc.target/i386/pieces-memset-43.c: Likewise.
	* gcc.target/i386/pieces-memset-44.c: Likewise.
---
 g

Re: [ARM][PR target/77281] Fix an invalid check for vectors of the same floating-point constants.

2016-08-19 Thread Richard Earnshaw (lists)
On 19/08/16 15:06, Matthew Wahab wrote:
> On 19/08/16 14:30, Richard Earnshaw (lists) wrote:
>> On 19/08/16 12:48, Matthew Wahab wrote:
>>> 2016-08-19  Matthew Wahab  
>>>
>>>  PR target/77281
>>>  * config/arm/arm.c (neon_valid_immediate): Delete declaration.
>>>  Use const_vec_duplicate to check for duplicated elements.
>>>
>>> Ok for trunk?
>>
>> OK.
>>
>> Thanks.
>>
>> R.
> 
> Is this ok to backport to gcc-6?
> Matthew
> 

I believe we're in a release process, so backporting needs RM approval.

R.


Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Joseph Myers
On Fri, 19 Aug 2016, Richard Biener wrote:

> >> Can you quickly verify if LTO works with the new types?  I don't see
> >> anything that would prevent it but having new global trees and backends
> >> initializing them might come up with surprises (see
> >> tree-streamer.c:preload_common_nodes)
> >
> > Well, the execution tests are in gcc.dg/torture, which is run with various
> > options including -flto (and I've checked the testsuite logs to confirm
> > these tests are indeed run with such options).  Is there something else
> > you think should be tested?
> 
> No, I think that's enough.

Then I'll commit the patch later today in the absence of comments from 
other libcpp maintainers (and then go on to update and retest the built-in 
functions patch).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PING: new pass to warn on questionable uses of alloca() and VLAs

2016-08-19 Thread Aldy Hernandez

On 08/19/2016 05:35 AM, Aldy Hernandez wrote:

On 08/04/2016 12:37 PM, Jeff Law wrote:

On 07/27/2016 03:01 AM, Aldy Hernandez wrote:

Just in case this got lost in the noise, since I know there was a lot of
back and forth between Martin Sebor and me.

This is the last iteration.

Tested on x86-64 Linux.

OK for trunk?

curr


gcc/

* Makefile.in (OBJS): Add gimple-ssa-warn-walloca.o.
* passes.def: Add two instances of pass_walloca.
* tree-pass.h (make_pass_walloca): New.
* gimple-ssa-warn-walloca.c: New file.
* doc/invoke.texi: Document -Walloca, -Walloca-larger-than=, and
-Wvla-larger-than= options.

gcc/c-family/

* c.opt (Walloca): New.
(Walloca-larger-than=): New.
(Wvla-larger-than=): New.

As someone already noted, it's gimple-ssa-warn-alloca, not
gimple-ssa-warn-walloca for the ChangeLog entry.


Fixed.



On the nittish side, you're mixing C and C++ comment styles.  Choosing
one and sticking with it seems better :-)


Fixed.  Settled for C++ comments, except the copyright headers and the
testcases.






+@item -Walloca
+@opindex Wno-alloca
+@opindex Walloca
+This option warns on all uses of @code{alloca} in the source.
+
+@item -Walloca-larger-than=@var{n}
+This option warns on calls to @code{alloca} that are not bounded by a
+controlling predicate limiting its size to @var{n} bytes, or calls to
+@code{alloca} where the bound is unknown.

So for each of these little examples, I'd stuff the code into a trivial
function definition and make "n" a parameter.  That way it's obvious the
value of "n" comes from a context where we don't initially know its
range, but we may be able to narrow the range due to statements in the
function.


Done.



;

+
+class pass_walloca : public gimple_opt_pass
+{
+public:
+  pass_walloca (gcc::context *ctxt)
+: gimple_opt_pass(pass_data_walloca, ctxt), first_time_p (false)
+  {}
+  opt_pass *clone () { return new pass_walloca (m_ctxt); }
+  void set_pass_param (unsigned int n, bool param)
+{
+  gcc_assert (n == 0);
+  first_time_p = param;
+}

ISTM that you're using "first_time_p" here, but in passes.def you refer
to this parameter as "strict_mode_p" in comments.

ie:

+  NEXT_PASS (pass_walloca, /*strict_mode_p=*/false);

I'd just drop the /*strict_mode_p*/ comment in both places it appears in
your patch's change to passes.def.  I think we've generally frowned on
those embedded comments, even though some have snuck in.


I've seen a lot of embedded comments throughout GCC, especially in
optional type arguments.  ISTM it makes things clearer for these
parameters.  But hey, I don't care that much.  Fixed.




+
+// We have a few heuristics up our sleeve to determine if a call to
+// alloca() is within bounds.  Try them out and return the type of
+// alloca call this is based on its argument.
+//
+// Given a known argument (ARG) to alloca() and an EDGE (E)
+// calculating said argument, verify that the last statement in the BB
+// in E->SRC is a gate comparing ARG to an acceptable bound for
+// alloca().  See examples below.
+//
+// MAX_SIZE is WARN_ALLOCA= adjusted for VLAs.  It is the maximum size
+// in bytes we allow for arg.
+//
+// If the alloca bound is determined to be too large, ASSUMED_LIMIT is
+// set to the bound used to determine this.  ASSUMED_LIMIT is only set
+// for ALLOCA_BOUND_MAYBE_LARGE and ALLOCA_BOUND_DEFINITELY_LARGE.
+//
+// Returns the alloca type.
+
+static enum alloca_type
+alloca_call_type_by_arg (tree arg, edge e, unsigned max_size,
+ wide_int *assumed_limit)

So I wonder if you ought to have a structure here for the return value
which contains the alloca type and assumed limit.  I know in the past we
avoided aggregate returns, but these days that doesn't seem necessary.
Seems cleaner than having a return value and output parameters.


Done, C++ style with a simple constructor :).
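
A rough sketch of the shape being suggested, written in C for brevity (the
patch itself is C++).  Every name here (alloca_kind, alloca_result,
classify_alloca) and the trivial classification rule are hypothetical; the
point is only to show one aggregate return value replacing a return value
plus an output parameter.

```c
/* All names below are hypothetical; they mirror the shape of the
   suggestion, not the code in the patch.  */
enum alloca_kind
{
  ALLOCA_KIND_OK,
  ALLOCA_KIND_TOO_LARGE
};

struct alloca_result
{
  enum alloca_kind kind;
  unsigned long limit;	/* Only meaningful for ALLOCA_KIND_TOO_LARGE.  */
};

/* Classify a request of REQUESTED bytes against MAX_SIZE and return both
   the classification and the offending size in a single value.  */
struct alloca_result
classify_alloca (unsigned long requested, unsigned long max_size)
{
  struct alloca_result r;

  if (requested > max_size)
    {
      r.kind = ALLOCA_KIND_TOO_LARGE;
      r.limit = requested;
    }
  else
    {
      r.kind = ALLOCA_KIND_OK;
      r.limit = 0;
    }
  return r;
}
```

Callers read r.kind and r.limit from the returned value, so there is no
pointer argument that may or may not have been written to.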




+{
+  // All the tests below depend on the jump being on the TRUE path.
+  if (!(e->flags & EDGE_TRUE_VALUE))
+return ALLOCA_UNBOUNDED;

Seems like a fairly arbitrary and undesirable limitation.  Couldn't the
developer just as easily have written

if (arg > N)
  x = malloc (...)
else
  x = alloca (...)

It also seems like you'd want to handle the set of LT/LE/GT/GE rather
than just LE.  Or is it the case that we always canonicalize LT into LE
by adjusting the constant (I vaguely remember running into that in RTL,
so it's entirely possible and there'd likely be a canonicalization of
GT/GE as well).


Most of it gets canonicalized, but your testcase is definitely possible,
so I fixed this.
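
To make the two shapes concrete, here is a self-contained C sketch with the
alloca call first on the TRUE edge and then on the FALSE edge of the gating
comparison.  LIMIT and all function names are invented for the example, and
<alloca.h> is assumed to be available (as it is with glibc).

```c
#include <alloca.h>
#include <stdlib.h>
#include <string.h>

#define LIMIT 256	/* Hypothetical bound, chosen only for the example.  */

/* Fill N bytes with 1 and add them up.  */
static unsigned
fill_and_sum (unsigned char *buf, size_t n)
{
  unsigned sum = 0;

  memset (buf, 1, n);
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}

/* Heap fallback shared by both variants.  */
static unsigned
sum_on_heap (size_t n)
{
  unsigned char *buf = malloc (n);
  unsigned sum;

  if (buf == NULL)
    return 0;
  sum = fill_and_sum (buf, n);
  free (buf);
  return sum;
}

/* alloca sits on the TRUE edge of the comparison.  */
unsigned
sum_true_edge (size_t n)
{
  if (n == 0)
    return 0;
  if (n <= LIMIT)
    return fill_and_sum (alloca (n), n);
  return sum_on_heap (n);
}

/* Same bound, but alloca sits on the FALSE (else) edge.  */
unsigned
sum_false_edge (size_t n)
{
  if (n == 0)
    return 0;
  if (n > LIMIT)
    return sum_on_heap (n);
  else
    return fill_and_sum (alloca (n), n);
}
```

In both functions the alloca argument is bounded by LIMIT; only the edge on
which the call sits differs, which is why handling the TRUE path alone would
miss the second form.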



It also seems that once Andrew's infrastructure is in place this becomes
dead code as we can just ask for the range at a point in the program,
including for each incoming edge.  You might want a comment to that
effect.


Done.







+
+  /* Check for:
+ if (arg .cond. LIMIT) -or- if (LIMIT .cond. arg)
+   alloca(arg);
+
+ Where LIMIT has a bound of unknown range.  */
+  tree limit = NULL;
+ 

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Szabolcs Nagy
On 17/08/16 21:17, Joseph Myers wrote:
> Although there is HFmode support for ARM and AArch64, use of that for
> _Float16 is not enabled.  Supporting _Float16 would require additional
> work on the excess precision aspects of TS 18661-3: there are new
> values of FLT_EVAL_METHOD, which are not currently supported in GCC,
> and FLT_EVAL_METHOD == 0 now means that operations and constants on
> types narrower than float are evaluated to the range and precision of
> float.  Implementing that, so that _Float16 gets evaluated with excess
> range and precision, would involve changes to the excess precision
> infrastructure so that the _Float16 case is enabled by default, unlike
> the x87 case which is only enabled for -fexcess-precision=standard.
> Other differences between _Float16 and __fp16 would also need to be
> disentangled.

i wonder how gcc can support _Float16 without excess
precision.

using FLT_EVAL_METHOD==16 can break conforming c99/c11
code which only expects 0,1,2 values to appear (and does
not use _Float16 at all), but it seems to be the better
fit for hardware with half precision instructions.
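
A small C sketch of the compatibility concern.  The helper
describe_eval_method is hypothetical and simply stands in for existing code
that was written against the values C99/C11 define (-1, 0, 1 and 2); a value
such as 16 would land in its default branch.

```c
#include <float.h>

/* Hypothetical helper: how code written against C99/C11 might interpret
   a FLT_EVAL_METHOD value.  */
const char *
describe_eval_method (int method)
{
  switch (method)
    {
    case -1:
      return "indeterminable";
    case 0:
      return "evaluate to the range and precision of the type";
    case 1:
      return "evaluate float and double as double";
    case 2:
      return "evaluate everything as long double";
    default:
      return "not a value C99/C11 code expects";
    }
}

/* The value the current compiler and flags actually use.  */
int
current_eval_method (void)
{
  return FLT_EVAL_METHOD;
}
```

Code that does not have a default branch, or that uses the value in
preprocessor arithmetic, is where an unexpected value could cause trouble.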



Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-08-19 Thread Martin Sebor

My biggest concern with this iteration is the tight integration between
the optimization and warning.  We generally avoid that kind of tight
integration such that enabling the warning does not affect the
optimization and vice-versa.

So ISTM you have to do the analysis if the optimization or warning has
been requested.  Then you conditionalize whether or not the warnings are
emitted by their flag and the optimization based on its flag.


As we discussed in IRC yesterday, the warning and the optimization
are independent of one another, and each controlled by its own option
(-Wformat-length and -fprintf-return-value).  In light of that we've
agreed that submitting both as part of the same patch is sufficient.



I understand you're going to have some further work to do because of
conflicts with David's patches.  With that in mind, I'd suggest a bit of
carving things up so things can start moving forward.


Patch #1.  All the fixes to static buffer sizes that were inspired by
your warning.  These are all approved and can go in immediately.


Attached is this patch.
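
For context, a minimal C sketch of how the size needed by a formatted string
can be measured, which is the kind of reasoning behind enlarging a fixed
buffer by a byte.  The helper needed_size and the "%s%d" format are
assumptions made for the example; they are not taken from the files touched
by the patch.

```c
#include <stdio.h>

/* Hypothetical helper: bytes needed (including the terminating NUL) to
   format PREFIX followed by VALUE, or -1 on a formatting error.  */
int
needed_size (const char *prefix, int value)
{
  /* With a null buffer and size 0, snprintf only reports the length.  */
  int n = snprintf (NULL, 0, "%s%d", prefix, value);

  return n < 0 ? -1 : n + 1;
}
```

A buffer is safe for every possible argument only if it is at least as large
as the result for the longest prefix and the most negative int.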



Patch #2. Improvement to __builtin_object_size to handle
POINTER_PLUS_EXPR on arrays.  This is something that stands on its own
and ought to be reviewable quickly and doesn't really belong in the
bowels of the warning/optimization patch you're developing.


Sure.  I'll submit this patch next.



Patch #3. Core infrastructure and possibly the warning.  The reason I
say possibly the warning is they may be intertwined enough that
separating them makes more work than it saves.  I think the warning bits
are largely ready to go and may just need twiddling due to conflicts
with David's work.

Patch #4. The optimizations you've got now which I'll want to take
another look at.  Other than the overly tight integration with the
warning, I don't see anything inherently wrong, but I would like to take
another look at those once #1-#3 are done and dusted.


As we agreed, these will be submitted as one patch (probably
next week).



Patch #5 and beyond: Further optimization work.


As one of the next steps I'd like to make this feature available
to user-defined sprintf-like functions decorated with attribute
format.  To do that, I'm thinking of adding either a fourth
(optional) argument to attribute format printf indicating which
of the function arguments is the destination buffer (to compute
its size), or perhaps a new attribute under its own name.  I'm
actually leaning toward the latter since I think it could be used
in other contexts as well.  I welcome comments and suggestions
on this idea.

Thanks also for the rest of the detailed comments (snipped). I'll
also take care of those requests before I submit the next patch.

Martin
gcc/c-family/ChangeLog:
2016-08-18  Martin Sebor  

	* c-ada-spec.c (dump_ada_function_declaration): Increase buffer
	size to guarantee it fits the output of the formatted function
	regardless of its arguments.

gcc/cp/ChangeLog:
2016-08-18  Martin Sebor  

	* mangle.c: Increase buffer size to guarantee it fits the output
	of the formatted function regardless of its arguments.

gcc/go/ChangeLog:
2016-08-18  Martin Sebor  

	* gofrontend/expressions.cc: Increase buffer size to guarantee
	it fits the output of the formatted function regardless of its
	arguments.

gcc/java/ChangeLog:
2016-08-18  Martin Sebor  

	* decl.c (give_name_to_locals): Increase buffer size to guarantee
	it fits the output of the formatted function regardless of its
	arguments.
	* mangle_name.c (append_unicode_mangled_name): Same.

gcc/ChangeLog:
2016-08-18  Martin Sebor  

	* genmatch.c (parser::parse_expr): Increase buffer size to guarantee
	it fits the output of the formatted function regardless of its
	arguments.
	* gcc/genmodes.c (parser::parse_expr): Same.
	* gimplify.c (gimplify_asm_expr): Same.
	* passes.c (pass_manager::register_one_dump_file): Same.
	* print-tree.c (print_node): Same.

diff --git a/gcc/c-family/c-ada-spec.c b/gcc/c-family/c-ada-spec.c
index a4e0c38..6a8e04b 100644
--- a/gcc/c-family/c-ada-spec.c
+++ b/gcc/c-family/c-ada-spec.c
@@ -1603,7 +1603,7 @@ dump_ada_function_declaration (pretty_printer *buffer, tree func,
 {
   tree arg;
   const tree node = TREE_TYPE (func);
-  char buf[16];
+  char buf[17];
   int num = 0, num_args = 0, have_args = true, have_ellipsis = false;
 
   /* Compute number of arguments.  */
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index d8b5c45..5859d62 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1740,7 +1740,9 @@ static void
 write_real_cst (const tree value)
 {
   long target_real[4];  /* largest supported float */
-  char buffer[9];   /* eight hex digits in a 32-bit number */
+  /* Buffer for eight hex digits in a 32-bit number but big enough
+ even for 64-bit long to avoid warnings.  */
+  char buffer[17];
   int i, limit, dir;
 
   tree type = TREE_TYPE (value);
diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index 02e945a..6195a3b 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -4051,7 +4

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-08-19 Thread Jeff Law

On 08/19/2016 09:29 AM, Martin Sebor wrote:

My biggest concern with this iteration is the tight integration between
the optimization and warning.  We generally avoid that kind of tight
integration such that enabling the warning does not affect the
optimization and vice-versa.

So ISTM you have to do the analysis if the optimization or warning has
been requested.  Then you conditionalize whether or not the warnings are
emitted by their flag and the optimization based on its flag.


As we discussed in IRC yesterday, the warning and the optimization
are independent of one another, and each controlled by its own option
(-Wformat-length and -fprintf-return-value).  In light of that we've
agreed that submitting both as part of the same patch is sufficient.
Right.  I must have mis-read something.  I'll look for the tidbit that 
made me think they were more intertwined than they really are.  It may 
be the case that we just want to tweak a comment.






I understand you're going to have some further work to do because of
conflicts with David's patches.  With that in mind, I'd suggest a bit of
carving things up so things can start moving forward.


Patch #1.  All the fixes to static buffer sizes that were inspired by
your warning.  These are all approved and can go in immediately.


Attached is this patch.

Approved.  Please install.






Patch #2. Improvement to __builtin_object_size to handle
POINTER_PLUS_EXPR on arrays.  This is something that stands on its own
and ought to be reviewable quickly and doesn't really belong in the
bowels of the warning/optimization patch you're developing.


Sure.  I'll submit this patch next.

Excellent.



Patch #5 and beyond: Further optimization work.


As one of the next steps I'd like to make this feature available
to user-defined sprintf-like functions decorated with attribute
format.  To do that, I'm thinking of adding either a fourth
(optional) argument to attribute format printf indicating which
of the function arguments is the destination buffer (to compute
its size), or perhaps a new attribute under its own name.  I'm
actually leaning toward the latter since I think it could be used
in other contexts as well.  I welcome comments and suggestions
on this idea.
Whichever we think will be easier to use and thus encourage folks to 
annotate their code properly :-)


jeff



[committed] Fix OpenMP ICE (PR fortran/69281)

2016-08-19 Thread Jakub Jelinek
Hi!

On the following testcase we ICE because the BLOCK that is initially in the
BIND_EXPR on the OMP_PARALLEL/TASK/TARGET body contains a VAR_DECL for
-fstack-arrays added artificial VLA.  Later on for move_sese_* reasons this
is the BLOCK used for the moving, which means that a copy of the VAR_DECL
with DECL_VALUE_EXPR is also in the original function where if we version
that function, we might tweak its type and that then turns the *omp_fn*
into invalid IL.  This problem doesn't exist for C/C++, because there we
actually wrap the body into 2 BLOCKs, one user supplied and the other that
just holds the artificials added during gimplification/omp lowering that
aren't problematic.  This patch makes the Fortran FE match the C/C++ more
closely by wrapping the body into another BIND_EXPR with its own BLOCK.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
Backports queued.

2016-08-19  Jakub Jelinek  

PR fortran/69281
* trans-openmp.c (gfc_trans_omp_parallel, gfc_trans_omp_task,
gfc_trans_omp_target): Wrap gfc_trans_omp_code result in an extra
BIND_EXPR with its own forced BLOCK.

* gfortran.dg/gomp/pr69281.f90: New test.

--- gcc/fortran/trans-openmp.c.jj   2016-07-01 17:11:37.0 +0200
+++ gcc/fortran/trans-openmp.c  2016-08-19 14:19:46.246922616 +0200
@@ -3554,7 +3554,9 @@ gfc_trans_omp_parallel (gfc_code *code)
   gfc_start_block (&block);
   omp_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
   code->loc);
+  pushlevel ();
   stmt = gfc_trans_omp_code (code->block->next, true);
+  stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
   stmt = build2_loc (input_location, OMP_PARALLEL, void_type_node, stmt,
 omp_clauses);
   gfc_add_expr_to_block (&block, stmt);
@@ -4062,7 +4064,9 @@ gfc_trans_omp_task (gfc_code *code)
   gfc_start_block (&block);
   omp_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses,
   code->loc);
+  pushlevel ();
   stmt = gfc_trans_omp_code (code->block->next, true);
+  stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
   stmt = build2_loc (input_location, OMP_TASK, void_type_node, stmt,
 omp_clauses);
   gfc_add_expr_to_block (&block, stmt);
@@ -4215,7 +4219,11 @@ gfc_trans_omp_target (gfc_code *code)
   = gfc_trans_omp_clauses (&block, &clausesa[GFC_OMP_SPLIT_TARGET],
   code->loc);
   if (code->op == EXEC_OMP_TARGET)
-stmt = gfc_trans_omp_code (code->block->next, true);
+{
+  pushlevel ();
+  stmt = gfc_trans_omp_code (code->block->next, true);
+  stmt = build3_v (BIND_EXPR, NULL, stmt, poplevel (1, 0));
+}
   else
 {
   pushlevel ();
--- gcc/testsuite/gfortran.dg/gomp/pr69281.f90.jj   2016-08-19 13:54:11.411692953 +0200
+++ gcc/testsuite/gfortran.dg/gomp/pr69281.f90  2016-08-19 14:23:11.0 +0200
@@ -0,0 +1,63 @@
+! PR fortran/69281
+! { dg-do compile }
+! { dg-additional-options "-fstack-arrays -O2" }
+
+program pr69281
+  implicit none
+  call foo1((/ 1, 3, 3, 7 /))
+  call foo2((/ 1, 3, 3, 7 /))
+  call foo3((/ 1, 3, 3, 7 /))
+  call foo4((/ 1, 3, 3, 7 /))
+  call foo5((/ 1, 3, 3, 7 /))
+  call foo6((/ 1, 3, 3, 7 /))
+contains
+  subroutine foo1(x)
+integer, intent(in) :: x(:)
+!$omp parallel
+  call baz(bar(x))
+!$omp end parallel
+  end subroutine
+  subroutine foo2(x)
+integer, intent(in) :: x(:)
+!$omp task
+  call baz(bar(x))
+!$omp end task
+  end subroutine
+  subroutine foo3(x)
+integer, intent(in) :: x(:)
+!$omp target
+  call baz(bar(x))
+!$omp end target
+  end subroutine
+  subroutine foo4(x)
+integer, intent(in) :: x(:)
+!$omp target teams
+  call baz(bar(x))
+!$omp end target teams
+  end subroutine
+  subroutine foo5(x)
+integer, intent(in) :: x(:)
+integer :: i
+!$omp parallel do
+  do i = 1, 1
+call baz(bar(x))
+  end do
+  end subroutine
+  subroutine foo6(x)
+integer, intent(in) :: x(:)
+integer :: i
+!$omp target teams distribute parallel do
+  do i = 1, 1
+call baz(bar(x))
+  end do
+  end subroutine
+  function bar(x) result(a)
+integer, dimension(:), intent(in) :: x
+integer, dimension(2,size(x)) :: a
+a(1,:) = 1
+a(2,:) = x
+  end function
+  subroutine baz(a)
+integer, dimension(:,:), intent(in) :: a
+  end subroutine
+end program

Jakub


[committed] Add testcase (PR fortran/72744)

2016-08-19 Thread Jakub Jelinek
Hi!

This also got fixed by the PR69281 fix I've just committed; I've added the
testcase too.

2016-08-19  Jakub Jelinek  

PR fortran/72744
* gfortran.dg/gomp/pr72744.f90: New test.

--- gcc/testsuite/gfortran.dg/gomp/pr72744.f90.jj   2016-08-19 14:49:18.590105777 +0200
+++ gcc/testsuite/gfortran.dg/gomp/pr72744.f90  2016-08-19 14:48:58.0 +0200
@@ -0,0 +1,18 @@
+! PR fortran/72744
+! { dg-do compile }
+! { dg-additional-options "-Ofast" }
+
+program pr72744
+  integer, parameter :: n = 20
+  integer :: i, z(n), h(n)
+  z = [(i, i=1,n)]
+  h = [(i, i=n,1,-1)]
+  call sub (n, h)
+  if ( any(h/=z) ) call abort
+end
+subroutine sub (n, x)
+  integer :: n, x(n)
+!$omp parallel
+  x(:) = x(n:1:-1)
+!$omp end parallel
+end

Jakub


[PATCH] Fix ambiguities in C++17 mode

2016-08-19 Thread Jonathan Wakely

This fixes some more test FAILs when using -std=gnu++17, due to
names in std::experimental now also being in std.

* include/experimental/tuple (apply): Qualify call to __apply_impl.
* include/std/tuple (apply): Likewise.
* testsuite/experimental/system_error/value.cc: Fix ambiguities in
C++17 mode.
* testsuite/experimental/tuple/tuple_size.cc: Likewise.
* testsuite/experimental/type_traits/value.cc: Likewise.

Tested x86_64-linux, committed to trunk.

commit d1eda6aaf0f7105277fa06368c5f9970df126154
Author: Jonathan Wakely 
Date:   Fri Aug 19 09:35:11 2016 +0100

Fix ambiguities in C++17 mode

	* include/experimental/tuple (apply): Qualify call to __apply_impl.
	* include/std/tuple (apply): Likewise.
	* testsuite/experimental/system_error/value.cc: Fix ambiguities in
	C++17 mode.
	* testsuite/experimental/tuple/tuple_size.cc: Likewise.
	* testsuite/experimental/type_traits/value.cc: Likewise.

diff --git a/libstdc++-v3/include/experimental/tuple b/libstdc++-v3/include/experimental/tuple
index bfa1ed1..b653ea7 100644
--- a/libstdc++-v3/include/experimental/tuple
+++ b/libstdc++-v3/include/experimental/tuple
@@ -66,8 +66,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   using _Indices =
 	std::make_index_sequence>>;
-  return __apply_impl(std::forward<_Fn>(__f), std::forward<_Tuple>(__t),
-			  _Indices{});
+  return experimental::__apply_impl(std::forward<_Fn>(__f),
+	std::forward<_Tuple>(__t),
+	_Indices{});
 }
 
 _GLIBCXX_END_NAMESPACE_VERSION
diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 29db833..c06a040 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -1652,8 +1652,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 apply(_Fn&& __f, _Tuple&& __t)
 {
   using _Indices = make_index_sequence>>;
-  return __apply_impl(std::forward<_Fn>(__f), std::forward<_Tuple>(__t),
-			  _Indices{});
+  return std::__apply_impl(std::forward<_Fn>(__f),
+			   std::forward<_Tuple>(__t),
+			   _Indices{});
 }
 
 #define __cpp_lib_make_from_tuple  201606
diff --git a/libstdc++-v3/testsuite/experimental/system_error/value.cc b/libstdc++-v3/testsuite/experimental/system_error/value.cc
index ee792cb..3b346ce 100644
--- a/libstdc++-v3/testsuite/experimental/system_error/value.cc
+++ b/libstdc++-v3/testsuite/experimental/system_error/value.cc
@@ -20,21 +20,23 @@
 
 #include 
 #include 
-using namespace std;
-using namespace std::experimental;
+using std::is_error_code_enum;
+using std::is_error_condition_enum;
+using std::experimental::is_error_code_enum_v;
+using std::experimental::is_error_condition_enum_v;
 
 // These tests are rather simple, the front-end tests already test
 // variable templates, and the library tests for the underlying
 // traits are more elaborate. These are just simple sanity tests.
 
-static_assert(is_error_code_enum_v
-	  && is_error_code_enum::value, "");
+static_assert(is_error_code_enum_v
+	  && is_error_code_enum::value, "");
 
 static_assert(!is_error_code_enum_v
 	  && !is_error_code_enum::value, "");
 
-static_assert(is_error_condition_enum_v
-	  && is_error_condition_enum::value, "");
+static_assert(is_error_condition_enum_v
+	  && is_error_condition_enum::value, "");
 
 static_assert(!is_error_condition_enum_v
 	  && !is_error_condition_enum::value, "");
diff --git a/libstdc++-v3/testsuite/experimental/tuple/tuple_size.cc b/libstdc++-v3/testsuite/experimental/tuple/tuple_size.cc
index 953d3b2..e9a49ea 100644
--- a/libstdc++-v3/testsuite/experimental/tuple/tuple_size.cc
+++ b/libstdc++-v3/testsuite/experimental/tuple/tuple_size.cc
@@ -20,8 +20,9 @@
 
 #include 
 
-using namespace std;
-using namespace std::experimental;
+using std::tuple;
+using std::tuple_size;
+using std::experimental::tuple_size_v;
 
 // These tests are rather simple, the front-end tests already test
 // variable templates, and the library tests for the underlying
diff --git a/libstdc++-v3/testsuite/experimental/type_traits/value.cc b/libstdc++-v3/testsuite/experimental/type_traits/value.cc
index 16b63cb..6a7a95d 100644
--- a/libstdc++-v3/testsuite/experimental/type_traits/value.cc
+++ b/libstdc++-v3/testsuite/experimental/type_traits/value.cc
@@ -20,8 +20,74 @@
 
 #include 
 
-using namespace std;
-using namespace experimental;
+using std::true_type;
+using std::false_type;
+using std::nullptr_t;
+using std::is_void;
+using std::is_null_pointer;
+using std::is_integral;
+using std::is_floating_point;
+using std::is_array;
+using std::is_pointer;
+using std::is_lvalue_reference;
+using std::is_rvalue_reference;
+using std::is_member_object_pointer;
+using std::is_member_function_pointer;
+using std::is_enum;
+using std::is_union;
+using std::is_class;
+using std::is_function;
+using std::is_reference;
+using std::is_arithmetic;
+using std::is_fundamental;
+using std::is_object;
+using std::is_scalar;
+using std

[committed] Fix handling of ASSOCIATE in OpenMP resolving (PR fortran/71014)

2016-08-19 Thread Jakub Jelinek
Hi!

The saving+clearing and restoring of OpenMP state during gfc_resolve is
meant for when during resolving of one subroutine/function etc. we resolve
another one, but apparently ASSOCIATE blocks (and BLOCK, though that isn't
supported yet - OpenMP 4.5 only supports Fortran 2003, not 2008) add another
namespace and we definitely want to e.g. notice vars used as DO loop
iterators even inside of ASSOCIATE.

Fixed thusly, tested on x86_64-linux and i686-linux, committed to trunk,
backports queued.

2016-08-19  Jakub Jelinek  

PR fortran/71014
* resolve.c (gfc_resolve): For ns->construct_entities don't save, clear
and restore omp state around the resolving.

* testsuite/libgomp.fortran/pr71014.f90: New test.

--- gcc/fortran/resolve.c.jj   2016-08-16 09:01:20.0 +0200
+++ gcc/fortran/resolve.c   2016-08-19 15:57:13.173596549 +0200
@@ -15587,7 +15587,8 @@ gfc_resolve (gfc_namespace *ns)
   /* As gfc_resolve can be called during resolution of an OpenMP construct
  body, we should clear any state associated to it, so that say NS's
  DO loops are not interpreted as OpenMP loops.  */
-  gfc_omp_save_and_clear_state (&old_omp_state);
+  if (!ns->construct_entities)
+gfc_omp_save_and_clear_state (&old_omp_state);
 
   resolve_types (ns);
   component_assignment_level = 0;
@@ -15599,5 +15600,6 @@ gfc_resolve (gfc_namespace *ns)
 
   gfc_run_passes (ns);
 
-  gfc_omp_restore_state (&old_omp_state);
+  if (!ns->construct_entities)
+gfc_omp_restore_state (&old_omp_state);
 }
--- libgomp/testsuite/libgomp.fortran/pr71014.f90.jj   2016-08-19 16:12:40.272914118 +0200
+++ libgomp/testsuite/libgomp.fortran/pr71014.f90   2016-08-19 16:10:33.0 +0200
@@ -0,0 +1,20 @@
+! PR fortran/71014
+! { dg-do run }
+! { dg-additional-options "-O0" }
+
+program pr71014
+  implicit none
+  integer :: i, j
+  integer, parameter :: t = 100*101/2
+  integer :: s(16)
+  s(:) = 0
+!$omp parallel do
+  do j = 1, 16
+associate (k => j)
+  do i = 1, 100
+s(j) = s(j) + i
+  end do
+end associate
+  end do
+  if (any(s /= t)) call abort
+end program pr71014

Jakub


[PATCH] Define std::not_fn for C++17

2016-08-19 Thread Jonathan Wakely

This updates std::experimental::not_fn to match the C++17 semantics
(which are a superset of the Library Fundamentals v2 semantics) and
then copies it to std::not_fn as well.

* doc/xml/manual/status_cxx2017.xml: Update status of not_fn.
* doc/html/*: Regenerate.
* include/experimental/functional (_Not_fn, not_fn): Match C++17
semantics.
* include/std/functional (_Not_fn, not_fn): Define for C++17.
* testsuite/20_util/not_fn/1.cc: New.
* testsuite/experimental/functional/not_fn.cc: Test abstract class.
Remove test for volatile-qualified wrapper.

Tested x86_64-linux, committed to trunk.


commit 8014ab8c2415e84d4b9b9f9de0718633dc8ca7b8
Author: Jonathan Wakely 
Date:   Fri Aug 19 12:33:13 2016 +0100

Define std::not_fn for C++17

* doc/xml/manual/status_cxx2017.xml: Update status of not_fn.
* doc/html/*: Regenerate.
* include/experimental/functional (_Not_fn, not_fn): Match C++17
semantics.
* include/std/functional (_Not_fn, not_fn): Define for C++17.
* testsuite/20_util/not_fn/1.cc: New.
* testsuite/experimental/functional/not_fn.cc: Test abstract class.
Remove test for volatile-qualified wrapper.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 331420e..ff96627 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -321,14 +321,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
Adopt not_fn from Library Fundamentals 2 for C++17 

   
http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0005r4.html";>
P0005R4

   
-   No 
+   7 
   __cpp_lib_not_fn >= 201603
 
 
diff --git a/libstdc++-v3/include/experimental/functional b/libstdc++-v3/include/experimental/functional
index ed41f5a..eddbcf1 100644
--- a/libstdc++-v3/include/experimental/functional
+++ b/libstdc++-v3/include/experimental/functional
@@ -386,41 +386,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   template
explicit
-   _Not_fn(_Fn2&& __fn) : _M_fn(std::forward<_Fn2>(__fn)) { }
+   _Not_fn(_Fn2&& __fn)
+   : _M_fn(std::forward<_Fn2>(__fn)) { }
 
   _Not_fn(const _Not_fn& __fn) = default;
   _Not_fn(_Not_fn&& __fn) = default;
-  _Not_fn& operator=(const _Not_fn& __fn) = default;
-  _Not_fn& operator=(_Not_fn&& __fn) = default;
   ~_Not_fn() = default;
 
   template
auto
-   operator()(_Args&&... __args)
-   noexcept(noexcept(!_M_fn(std::forward<_Args>(__args)...)))
-   -> decltype(!_M_fn(std::forward<_Args>(__args)...))
-   { return !_M_fn(std::forward<_Args>(__args)...); }
+   operator()(_Args&&... __args) &
+   noexcept(__is_nothrow_callable<_Fn&(_Args&&...)>::value)
+   -> decltype(!std::declval>())
+   { return !std::__invoke(_M_fn, std::forward<_Args>(__args)...); }
 
   template
auto
-   operator()(_Args&&... __args) const
-   noexcept(noexcept(!_M_fn(std::forward<_Args>(__args)...)))
-   -> decltype(!_M_fn(std::forward<_Args>(__args)...))
-   { return !_M_fn(std::forward<_Args>(__args)...); }
+   operator()(_Args&&... __args) const &
+   noexcept(__is_nothrow_callable::value)
+   -> decltype(!std::declval>())
+   { return !std::__invoke(_M_fn, std::forward<_Args>(__args)...); }
 
   template
auto
-   operator()(_Args&&... __args) volatile
-   noexcept(noexcept(!_M_fn(std::forward<_Args>(__args)...)))
-   -> decltype(!_M_fn(std::forward<_Args>(__args)...))
-   { return !_M_fn(std::forward<_Args>(__args)...); }
+   operator()(_Args&&... __args) &&
+   noexcept(__is_nothrow_callable<_Fn&&(_Args&&...)>::value)
+   -> decltype(!std::declval>())
+   {
+ return !std::__invoke(std::move(_M_fn),
+   std::forward<_Args>(__args)...);
+   }
 
   template
auto
-   operator()(_Args&&... __args) const volatile
-   noexcept(noexcept(!_M_fn(std::forward<_Args>(__args)...)))
-   -> decltype(!_M_fn(std::forward<_Args>(__args)...))
-   { return !_M_fn(std::forward<_Args>(__args)...); }
+   operator()(_Args&&... __args) const &&
+   noexcept(__is_nothrow_callable::value)
+   -> decltype(!std::declval>())
+   {
+ return !std::__invoke(std::move(_M_fn),
+   std::forward<_Args>(__args)...);
+   }
 };
 
   /// [func.not_fn] Function template not_fn
@@ -429,8 +434,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 not_fn(_Fn&& __fn)
 noexcept(std::is_nothrow_constructible, _Fn&&>::value)
 {
-  using __maybe_type = _Maybe_wrap_member_pointer>;
-  return _Not_fn{std::forward<_Fn>(__fn)};
+  return _Not_fn>{std::forward<_Fn>(__fn)};
 }
 
 _GLIBCXX_END_NAMESPACE_VERSIO

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread David Malcolm
On Fri, 2016-08-19 at 14:40 +, Joseph Myers wrote:
> On Fri, 19 Aug 2016, Richard Biener wrote:
> 
> > > > Can you quickly verify if LTO works with the new types?  I
> > > > don't see anything
> > > > that would prevent it but having new global trees and backends
> > > > initializing them
> > > > might come up with surprises (see tree
> > > > -streamer.c:preload_common_nodes)
> > > 
> > > Well, the execution tests are in gcc.dg/torture, which is run
> > > with various
> > > options including -flto (and I've checked the testsuite logs to
> > > confirm
> > > these tests are indeed run with such options).  Is there
> > > something else
> > > you think should be tested?
> > 
> > No, I think that's enough.
> 
> Then I'll commit the patch later today in the absence of comments
> from 
> other libcpp maintainers (and then go on to update and retest the
> built-in 
> functions patch).

[For reference, the latest version of the patch seems to be here:
  https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01290.html ]

Although I'm listed as a libcpp maintainer, I believe today is the
first time I've looked at libcpp/expr.c:interpret_float_suffix.  Also,
my attempts to search for the standard you're referring to are failing,
which isn't helping, so my apologies in advance.

I attempted to review your changes to interpret_float_suffix, and I'm
having a hard time understanding what the supported suffixes now are.

Please could you take this opportunity to add some examples to the
header comment for that function, both for the common cases e.g. "f",
and for the new suffixes; nothing in the patch body appears to document
them.  (ideally, referencing the standard).

Also, it would be good to add some more comments to the function body. 
 For example, in this hunk:

-  if (f + d + l + w + q > 1 || i > 1)
+  if (f + d + l + w + q + fn + fnx > 1 || i > 1)

should it read something like :
   /* Reject duplicate suffixes, contradictory suffixes [...] */

where "[...]" is something relating to fn + fnx, which I can't figure
out in the absence of the standard you're referring to.

Sorry about my ignorance, hope this is constructive feedback
Dave


Re: [PATCH] Define std::not_fn for C++17

2016-08-19 Thread Jonathan Wakely

On 19/08/16 16:46 +0100, Jonathan Wakely wrote:

This updates std::experimental::not_fn to match the C++17 semantics
(which are a superset of the Library Fundamentals v2 semantics) and
then copies it to std::not_fn as well.

* doc/xml/manual/status_cxx2017.xml: Update status of not_fn.


Oops, the "fixes for not_fn" row in the table should be shown as done
too.


* doc/html/*: Regenerate.
* include/experimental/functional (_Not_fn, not_fn): Match C++17
semantics.
* include/std/functional (_Not_fn, not_fn): Define for C++17.
* testsuite/20_util/not_fn/1.cc: New.
* testsuite/experimental/functional/not_fn.cc: Test abstract class.
Remove test for volatile-qualified wrapper.

Tested x86_64-linux, committed to trunk.


I wonder if we want to move the experimental::_Not_fn class into
 and have std::not_fn and std::experimental::not_fn return
exactly the same type.




[PATCH] Define std::atomic::is_always_lock_free for C++17

2016-08-19 Thread Jonathan Wakely

Another new C++17 feature.

* include/std/atomic (atomic::is_always_lock_free): Define.
* testsuite/29_atomics/atomic/60695.cc: Adjust dg-error lineno.
* testsuite/29_atomics/atomic/is_always_lock_free.cc: New.
* testsuite/29_atomics/atomic_integral/is_always_lock_free.cc: New.
* doc/xml/manual/status_cxx2017.xml: Update status.
* doc/html/*: Regenerate.

Tested x86_64-linux and powerpc64le-linux, committed to trunk.


commit a79363ae83c16b1daf082d32effe727b887c9d95
Author: Jonathan Wakely 
Date:   Fri Aug 19 10:54:26 2016 +0100

Define std::atomic::is_always_lock_free for C++17

* include/std/atomic (atomic::is_always_lock_free): Define.
* testsuite/29_atomics/atomic/60695.cc: Adjust dg-error lineno.
* testsuite/29_atomics/atomic/is_always_lock_free.cc: New.
* testsuite/29_atomics/atomic_integral/is_always_lock_free.cc: New.
* doc/xml/manual/status_cxx2017.xml: Update status.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index 35d6b6b..331420e 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -699,14 +699,13 @@ Feature-testing recommendations for C++.
 
 
 
-  
constexpr atomic::is_always_lock_free  
 
   
http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0152r1.html";>
P0152R1

   
-   No 
+   7 
__cpp_lib_atomic_is_always_lock_free >= 201603 

 
 
diff --git a/libstdc++-v3/include/std/atomic b/libstdc++-v3/include/std/atomic
index f8894bf..ad7a4f6 100644
--- a/libstdc++-v3/include/std/atomic
+++ b/libstdc++-v3/include/std/atomic
@@ -50,6 +50,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
+#if __cplusplus > 201402L
+# define __cpp_lib_atomic_is_always_lock_free 201603
+#endif
+
   template
 struct atomic;
 
@@ -90,6 +94,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 bool
 is_lock_free() const volatile noexcept { return _M_base.is_lock_free(); }
 
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_BOOL_LOCK_FREE == 2;
+#endif
+
 void
 store(bool __i, memory_order __m = memory_order_seq_cst) noexcept
 { _M_base.store(__i, __m); }
@@ -221,6 +229,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
reinterpret_cast(-__alignof(_M_i)));
   }
 
+#if __cplusplus > 201402L
+  static constexpr bool is_always_lock_free
+   = __atomic_always_lock_free(sizeof(_M_i), 0);
+#endif
+
   void
   store(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept
   { __atomic_store(std::__addressof(_M_i), std::__addressof(__i), __m); }
@@ -416,6 +429,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   is_lock_free() const volatile noexcept
   { return _M_b.is_lock_free(); }
 
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_POINTER_LOCK_FREE == 2;
+#endif
+
   void
   store(__pointer_type __p,
memory_order __m = memory_order_seq_cst) noexcept
@@ -537,6 +554,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_CHAR_LOCK_FREE == 2;
+#endif
 };
 
   /// Explicit specialization for signed char.
@@ -556,6 +577,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_CHAR_LOCK_FREE == 2;
+#endif
 };
 
   /// Explicit specialization for unsigned char.
@@ -575,6 +600,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_CHAR_LOCK_FREE == 2;
+#endif
 };
 
   /// Explicit specialization for short.
@@ -594,6 +623,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_SHORT_LOCK_FREE == 2;
+#endif
 };
 
   /// Explicit specialization for unsigned short.
@@ -613,6 +646,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = ATOMIC_SHORT_LOCK_FREE == 2;
+#endif
 };
 
   /// Explicit specialization for int.
@@ -632,6 +669,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   using __base_type::operator __integral_type;
   using __base_type::operator=;
+
+#if __cplusplus > 201402L
+static constexpr bool is_always_lock_free = A

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Joseph Myers
On Fri, 19 Aug 2016, Szabolcs Nagy wrote:

> On 17/08/16 21:17, Joseph Myers wrote:
> > Although there is HFmode support for ARM and AArch64, use of that for
> > _Float16 is not enabled.  Supporting _Float16 would require additional
> > work on the excess precision aspects of TS 18661-3: there are new
> > values of FLT_EVAL_METHOD, which are not currently supported in GCC,
> > and FLT_EVAL_METHOD == 0 now means that operations and constants on
> > types narrower than float are evaluated to the range and precision of
> > float.  Implementing that, so that _Float16 gets evaluated with excess
> > range and precision, would involve changes to the excess precision
> > infrastructure so that the _Float16 case is enabled by default, unlike
> > the x87 case which is only enabled for -fexcess-precision=standard.
> > Other differences between _Float16 and __fp16 would also need to be
> > disentangled.
> 
> i wonder how gcc can support _Float16 without excess
> precision.
> 
> using FLT_EVAL_METHOD==16 can break conforming c99/c11
> code which only expects 0,1,2 values to appear (and does
> not use _Float16 at all), but it seems to be the better
> fit for hardware with half precision instructions.

Maybe this indicates that -fexcess-precision=standard, whether explicit or 
implied by a -std option, must cause FLT_EVAL_METHOD=0 for such hardware, 
and some new -fexcess-precision= option is needed to select 
FLT_EVAL_METHOD=16 (say -fexcess-precision=16, with no expectation that 
most numeric -fexcess-precision= arguments are supported except where a 
target hook says they are or where they are the default FLT_EVAL_METHOD 
value).  Then -std=c2x, if C2X integrates TS 18661-3, might not imply the 
value 0 for such hardware, because the value 16 would also be OK as a 
standard value in that case.  This can be part of the design issues to 
address alongside those I mentioned in 
.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Simplify dg-options for tests using pthreads

2016-08-19 Thread Jonathan Wakely

On 18/08/16 09:06 +0100, Jonathan Wakely wrote:

Solaris has supported -pthread as well as -pthreads for several years
now, so there's no need to have separate dg-options directives for
Solaris.

This patch removes all the lines:

// { dg-options "... -pthreads" { target *-*-solaris* } }

And adds *-*-solaris* to the list of targets in:

// { dg-options "... -pthread" { target ... } }

Rainer, any objection to this change?


Committed to trunk.



Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread Joseph Myers
On Fri, 19 Aug 2016, David Malcolm wrote:

> Please could you take this opportunity to add some examples to the
> header comment for that function, both for the common cases e.g. "f",
> and for the new suffixes; nothing in the patch body appears to document
> them.  (ideally, referencing the standard).
> 
> Also, it would be good to add some more comments to the function body. 
>  For example, in this hunk:
> 
> -  if (f + d + l + w + q > 1 || i > 1)
> +  if (f + d + l + w + q + fn + fnx > 1 || i > 1)
> 
> should it read something like :
>/* Reject duplicate suffixes, contradictory suffixes [...] */
> 
> where "[...]" is something relating to fn + fnx, which I can't figure
> out in the absence of the standard you're referring to.

How does this seem?  I think 
 was the last 
public draft of TS 18661-3 before publication.

Index: libcpp/expr.c
===
--- libcpp/expr.c   (revision 239623)
+++ libcpp/expr.c   (working copy)
@@ -86,16 +86,53 @@ static cpp_num parse_has_include (cpp_reader *, en
 
 /* Subroutine of cpp_classify_number.  S points to a float suffix of
length LEN, possibly zero.  Returns 0 for an invalid suffix, or a
-   flag vector describing the suffix.  */
+   flag vector (of CPP_N_* bits) describing the suffix.  */
 static unsigned int
 interpret_float_suffix (cpp_reader *pfile, const uchar *s, size_t len)
 {
   size_t flags;
-  size_t f, d, l, w, q, i;
+  size_t f, d, l, w, q, i, fn, fnx, fn_bits;
 
   flags = 0;
-  f = d = l = w = q = i = 0;
+  f = d = l = w = q = i = fn = fnx = fn_bits = 0;
 
+  /* The following decimal float suffixes, from TR 24732:2009 and TS
+ 18661-2:2015, are supported:
+
+ df, DF - _Decimal32.
+ dd, DD - _Decimal64.
+ dl, DL - _Decimal128.
+
+ The dN and DN suffixes for _DecimalN, and dNx and DNx for
+ _DecimalNx, defined in TS 18661-3:2015, are not supported.
+
+ Fixed-point suffixes, from TR 18037:2008, are supported.  They
+ consist of three parts, in order:
+
+ (i) An optional u or U, for unsigned types.
+
+ (ii) An optional h or H, for short types, or l or L, for long
+ types, or ll or LL, for long long types.  Use of ll or LL is a
+ GNU extension.
+
+ (iii) r or R, for _Fract types, or k or K, for _Accum types.
+
+ Otherwise the suffix is for a binary or standard floating-point
+ type.  Such a suffix, or the absence of a suffix, may be preceded
+ or followed by i, I, j or J, to indicate an imaginary number with
+ the corresponding complex type.  The following suffixes for
+ binary or standard floating-point types are supported:
+
+ f, F - float (ISO C and C++).
+ l, L - long double (ISO C and C++).
+ d, D - double, even with the FLOAT_CONST_DECIMAL64 pragma in
+   operation (from TR 24732:2009; the pragma and the suffix
+   are not included in TS 18661-2:2015).
+ w, W - machine-specific type such as __float80 (GNU extension).
+ q, Q - machine-specific type such as __float128 (GNU extension).
+ fN, FN - _FloatN (TS 18661-3:2015).
+ fNx, FNx - _FloatNx (TS 18661-3:2015).  */
+
   /* Process decimal float suffixes, which are two letters starting
  with d or D.  Order and case are significant.  */
   if (len == 2 && (*s == 'd' || *s == 'D'))
@@ -172,21 +209,65 @@ interpret_float_suffix (cpp_reader *pfile, const u
 
   /* In any remaining valid suffix, the case and order don't matter.  */
   while (len--)
-switch (s[len])
-  {
-  case 'f': case 'F': f++; break;
-  case 'd': case 'D': d++; break;
-  case 'l': case 'L': l++; break;
-  case 'w': case 'W': w++; break;
-  case 'q': case 'Q': q++; break;
-  case 'i': case 'I':
-  case 'j': case 'J': i++; break;
-  default:
-   return 0;
-  }
+{
+  switch (s[0])
+   {
+   case 'f': case 'F':
+ f++;
+ if (len > 0
+ && !CPP_OPTION (pfile, cplusplus)
+ && s[1] >= '1'
+ && s[1] <= '9'
+ && fn_bits == 0)
+   {
+ f--;
+ while (len > 0
+&& s[1] >= '0'
+&& s[1] <= '9'
+&& fn_bits < CPP_FLOATN_MAX)
+   {
+ fn_bits = fn_bits * 10 + (s[1] - '0');
+ len--;
+ s++;
+   }
+ if (len > 0 && s[1] == 'x')
+   {
+ fnx++;
+ len--;
+ s++;
+   }
+ else
+   fn++;
+   }
+ break;
+   case 'd': case 'D': d++; break;
+   case 'l': case 'L': l++; break;
+   case 'w': case 'W': w++; break;
+   case 'q': case 'Q': q++; break;
+   case 'i': case 'I':
+   case 'j': case 'J': i++; break;
+   default:
+ return 0;
+   }
+  s++;
+}
 
-  if (f + d + l + w

Re: Implement C _FloatN, _FloatNx types [version 6]

2016-08-19 Thread David Malcolm
On Fri, 2016-08-19 at 16:51 +, Joseph Myers wrote:
> On Fri, 19 Aug 2016, David Malcolm wrote:
> 
> > Please could you take this opportunity to add some examples to the
> > header comment for that function, both for the common cases e.g.
> > "f",
> > and for the new suffixes; nothing in the patch body appears to
> > document
> > them.  (ideally, referencing the standard).
> > 
> > Also, it would be good to add some more comments to the function
> > body. 
> >  For example, in this hunk:
> > 
> > -  if (f + d + l + w + q > 1 || i > 1)
> > +  if (f + d + l + w + q + fn + fnx > 1 || i > 1)
> > 
> > should it read something like :
> >/* Reject duplicate suffixes, contradictory suffixes [...] */
> > 
> > where "[...]" is something relating to fn + fnx, which I can't
> > figure
> > out in the absence of the standard you're referring to.
> 
> How does this seem?  I think 
>  was the
> last 
> public draft of TS 18661-3 before publication.

Thanks - the comments make things much clearer.

[...snip...]
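
For readers without the standard to hand, the fN / fNx part of the 
suffix grammar can be sketched on its own as below.  This is a 
simplified illustration, not the libcpp code: the enum and function 
names are hypothetical, it ignores imaginary, decimal, fixed-point, d, 
w and q suffixes, and it does not model the C-only restriction in the 
patch.

```cpp
#include <cstddef>
#include <cstring>

enum suffix_kind
{
  SUFFIX_INVALID,
  SUFFIX_NONE,         // no suffix: double
  SUFFIX_FLOAT,        // f, F
  SUFFIX_LONG_DOUBLE,  // l, L
  SUFFIX_FLOATN,       // fN, FN
  SUFFIX_FLOATNX       // fNx, FNx
};

// Classify the suffix S.  For fN and fNx, store N in *BITS.
suffix_kind classify_suffix(const char *s, unsigned *bits)
{
  std::size_t len = std::strlen(s);
  *bits = 0;

  if (len == 0)
    return SUFFIX_NONE;
  if (len == 1 && (s[0] == 'f' || s[0] == 'F'))
    return SUFFIX_FLOAT;
  if (len == 1 && (s[0] == 'l' || s[0] == 'L'))
    return SUFFIX_LONG_DOUBLE;

  // fN / fNx: 'f' or 'F', then a decimal number with no leading zero,
  // then an optional lowercase 'x'.
  if ((s[0] == 'f' || s[0] == 'F') && s[1] >= '1' && s[1] <= '9')
    {
      std::size_t i = 1;
      unsigned n = 0;
      while (i < len && s[i] >= '0' && s[i] <= '9')
        {
          if (n > 1000)  // arbitrary cap so N cannot overflow
            return SUFFIX_INVALID;
          n = n * 10 + (s[i] - '0');
          i++;
        }
      if (i == len)
        {
          *bits = n;
          return SUFFIX_FLOATN;
        }
      if (i + 1 == len && s[i] == 'x')
        {
          *bits = n;
          return SUFFIX_FLOATNX;
        }
    }
  return SUFFIX_INVALID;
}
```

So "f128" classifies as fN with N = 128 and "F64x" as fNx with N = 64, 
while "f0" and "f32X" are rejected.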



[PATCH, i386]: Fix PR77270, Flag -mprftchw is shared with 3dnow for -march=k8

2016-08-19 Thread Uros Bizjak
Hello!

There is a problem with -march=native, when -m3dnow and -mno-prfchw
are added to the compilation flags. The prefetchw is enabled by
-m3dnow, but TARGET_PRFCHW is never enabled, since explicit
-mno-prfchw is passed by the driver.

Attached patch fixes this by divorcing TARGET_3DNOW from TARGET_PRFCHW.
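
For context, the user-visible construct involved is a prefetch for 
write.  A minimal sketch, assuming GCC or Clang (__builtin_prefetch is 
a compiler built-in; the function name and the prefetch distance of 16 
elements are made up for illustration):

```cpp
#include <cstddef>

// Increment every element and return the sum of the new values,
// hinting that elements ahead of the cursor will be written soon.
long bump_and_sum(int *data, std::size_t n)
{
  long total = 0;
  for (std::size_t i = 0; i < n; i++)
    {
      if (i + 16 < n)
        // Second argument 1 = prefetch for write, third = locality hint.
        __builtin_prefetch(&data[i + 16], 1, 3);
      data[i] += 1;
      total += data[i];
    }
  return total;
}
```

Whether the write hint becomes a prefetchw instruction depends on the 
ISA flags in effect, which is what this patch untangles; the built-in 
itself never changes program results.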

2016-08-19  Uros Bizjak  

PR target/77270
* config/i386/i386.c (ix86_option_override_internal): Remove
PTA_PRFCHW from entries that also have PTA_3DNOW flag.
Enable SSE prefetch also for TARGET_PREFETCHWT1.
Do not try to enable TARGET_PRFCHW ISA flag here.
* config/i386/i386.md (prefetch): Enable also for TARGET_3DNOW.
Rewrite expander function body.
(*prefetch_3dnow): Enable for TARGET_3DNOW and TARGET_PREFETCHWT1.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline, I plan to backport the patch to gcc-6 branch,
once it opens.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 239609)
+++ config/i386/i386.c  (working copy)
@@ -4843,9 +4843,9 @@
   {"lakemont", PROCESSOR_LAKEMONT, CPU_PENTIUM, PTA_NO_80387},
   {"pentium-mmx", PROCESSOR_PENTIUM, CPU_PENTIUM, PTA_MMX},
   {"winchip-c6", PROCESSOR_I486, CPU_NONE, PTA_MMX},
-  {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW | PTA_PRFCHW},
-  {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW | PTA_PRFCHW},
-  {"samuel-2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW | PTA_PRFCHW},
+  {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
+  {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
+  {"samuel-2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW},
   {"c3-2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
PTA_MMX | PTA_SSE | PTA_FXSR},
   {"nehemiah", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO,
@@ -4896,20 +4896,20 @@
   {"knl", PROCESSOR_KNL, CPU_SLM, PTA_KNL},
   {"intel", PROCESSOR_INTEL, CPU_SLM, PTA_NEHALEM},
   {"geode", PROCESSOR_GEODE, CPU_GEODE,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE},
   {"k6", PROCESSOR_K6, CPU_K6, PTA_MMX},
-  {"k6-2", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW | PTA_PRFCHW},
-  {"k6-3", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW | PTA_PRFCHW},
+  {"k6-2", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW},
+  {"k6-3", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW},
   {"athlon", PROCESSOR_ATHLON, CPU_ATHLON,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE},
   {"athlon-tbird", PROCESSOR_ATHLON, CPU_ATHLON,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE},
   {"athlon-4", PROCESSOR_ATHLON, CPU_ATHLON,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_PRFCHW | PTA_FXSR},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR},
   {"athlon-xp", PROCESSOR_ATHLON, CPU_ATHLON,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_PRFCHW | PTA_FXSR},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR},
   {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON,
-   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_PRFCHW | PTA_FXSR},
+   PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_FXSR},
   {"x86-64", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR},
   {"eden-x2", PROCESSOR_K8, CPU_K8,
@@ -4937,31 +4937,31 @@
 | PTA_SSSE3 | PTA_SSE4_1 | PTA_FXSR},
   {"k8", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR},
   {"k8-sse3", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_FXSR},
   {"opteron", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR},
   {"opteron-sse3", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_FXSR},
   {"athlon64", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR},
   {"athlon64-sse3", PROCESSOR_K8, CPU_K8,
PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE
-   | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR},
+   | PTA

[PR59319] output friends in debug info

2016-08-19 Thread Alexandre Oliva
This is not a finished patch.  There are two issues I'd like feedback
on before a final submission.  See them below.  First, a general
description.

Handling non-template friends is kind of easy, but it required a bit
of infrastructure in dwarf2out to avoid (i) forcing debug info for
unused types or functions: DW_TAG_friend DIEs are only emitted if
their DW_AT_friend DIE is emitted, and (ii) creating DIEs for such
types or functions just to have them discarded at the end.  To this
end, I introduced a list (vec, actually) of types with friends,
processed at the end of the translation unit, and a list of
DW_TAG_friend DIEs that, when we're pruning unused types, reference
DIEs that are still not known to be used, revisited after we finish
deciding all other DIEs, so that we prune DIEs that would have
referenced pruned types or functions.

Handling template friends turned out to be trickier: there's no
representation in DWARF for templates.  I decided to give debuggers as
much information as possible, enumerating all specializations of
friend templates and outputting DW_TAG_friend DIEs referencing them as
well, but marking them as DW_AT_artificial to indicate they're not
explicitly stated in the source code.  This attribute is not valid for
DW_TAG_friend, so it's only emitted in non-strict mode.  The greatest
challenge was to enumerate all specializations of a template.  It
looked trivial at first, given DECL_TEMPLATE_INSTANTIATIONS, but in
some of the testcases it wouldn't list any specializations, and
in others it would list only some of them.  I couldn't figure out the
logic behind that, and it seemed clear from the documentation of this
macro that at least in some cases it wouldn't hold the list, so I
ended up writing code to look for specializations in the hashtables of
decl or type specializations.  That worked fine, but it's not exactly
an efficient way to obtain the desired information, at least in some
cases.
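
To make the two kinds of friends concrete, here is a small sketch in
the spirit of the testcases (it is not one of them; all names are
made up).  peek_balance is a non-template friend; auditor is a friend
template, and the specializations actually used, auditor<int> and
auditor<long>, are the sort of thing the enumeration described above
would have to find.

```cpp
class account;
int peek_balance(const account &a);      // non-template friend function

template <typename T> struct auditor;    // friend class template

class account
{
  int balance_;

  friend int peek_balance(const account &);
  template <typename T> friend struct auditor;

public:
  explicit account(int b) : balance_(b) {}
};

int peek_balance(const account &a)
{
  return a.balance_;                     // allowed: friend function
}

template <typename T>
struct auditor
{
  static T doubled(const account &a)
  {
    return static_cast<T>(a.balance_) * 2;  // allowed: friend template
  }
};
```

Nothing here shows which DIEs get emitted; it only illustrates the
source-level constructs under discussion.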



- should we output specializations of friend templates as friends even
  in strict mode?  Currently we output them with DW_AT_artificial in
  non-strict mode, and without the artificial mark in strict mode.

- is there any way we can use DECL_TEMPLATE_INSTANTIATIONS reliably to
  enumerate the specializations of a friend template, or at least tell
  when it can be used?

- I haven't used local_specializations, should I?  I was a bit
  confused about the apparently unused local_specialization_stack,
  too.

I haven't covered partial and explicit specializations in the
testcases yet.


for gcc/ChangeLog

PR debug/59319
* dwarf2out.c (class_types_with_friends): New.
(gen_friend_tags_for_type, gen_friend_tags): New.
(gen_member_die): Record class types with friends.
(deferred_marks): New.
(prune_unused_types_defer_undecided_mark_p): New.
(prune_unused_types_defer_mark): New.
(prune_unused_types_deferred_walk): New.
(prune_unused_types_walk): Defer DW_TAG_friend.
(prune_unused_types): Check deferred marks is empty on entry,
empty it after processing.
(dwarf2out_finish): Generate friend tags.
* langhooks-def.h (LANG_HOOKS_GET_FRIENDS): New.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Add it.
* langhooks.h (lang_hooks_for_types): Add get_friends.

for gcc/cp/ChangeLog

PR debug/59319
* cp-objcp-common.c (cp_get_friends): New.
* cp-objcp-common.h (cp_get_friends): Declare.
(LANG_HOOKS_GET_FRIENDS): Override.
* cp-tree.h (enumerate_friend_specializations): Declare.
* pt.c (enumerate_friend_specializations): New.

for gcc/testsuite/ChangeLog

PR debug/59319
* g++.dg/debug/dwarf2/friend-1.C: New.
* g++.dg/debug/dwarf2/friend-2.C: New.
* g++.dg/debug/dwarf2/friend-3.C: New.
* g++.dg/debug/dwarf2/friend-4.C: New.
* g++.dg/debug/dwarf2/friend-5.C: New.
* g++.dg/debug/dwarf2/friend-6.C: New.
* g++.dg/debug/dwarf2/friend-7.C: New.
* g++.dg/debug/dwarf2/friend-8.C: New.
* g++.dg/debug/dwarf2/friend-9.C: New.
* g++.dg/debug/dwarf2/friend-10.C: New.
* g++.dg/debug/dwarf2/friend-11.C: New.
* g++.dg/debug/dwarf2/friend-12.C: New.
* g++.dg/debug/dwarf2/friend-13.C: New.
---
 gcc/cp/cp-objcp-common.c  |  103 
 gcc/cp/cp-objcp-common.h  |3 
 gcc/cp/cp-tree.h  |1 
 gcc/cp/pt.c   |   73 +++
 gcc/dwarf2out.c   |  161 +
 gcc/langhooks-def.h   |4 -
 gcc/langhooks.h   |   19 +++
 gcc/testsuite/g++.dg/debug/dwarf2/friend-1.C  |   10 ++
 gcc/testsuite/g++.dg/debug/dwarf2/friend-10.C |   13 ++
 gcc/testsuite/g++.dg/debug/dwarf2/friend-11.C |   13 ++
 gcc/testsuite/g++.dg/debug/dwarf2/friend-12.C |   15 ++
 gcc/testsui

[PATCH] newlib-stdint.h: Remove 32 bit longs

2016-08-19 Thread Andy Ross
We ran into this issue in the Zephyr project with our toolchain (gcc
built with --enable-newlib).  Basically GCC appears to be honoring a
legacy requirement to give newlib a "long" instead of "int" for
__INT32_TYPE__, which then leaks out through the current newlib
headers as a long-valued int32_t, which produces gcc warnings when a
uint32_t is passed to an unqualified printf format specifier like
"%d".

But the newlib headers, if __INT32_TYPE__ is *not* defined by the
compiler, have code to choose an "int" over "long" immediately
thereafter.  It seems whatever requirement this was honoring isn't
valid anymore.
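
As an aside, code that has to build against both definitions can avoid
the warning by using the <cinttypes> format macros instead of a bare
"%d" or "%u".  A minimal sketch (the helper names are hypothetical):

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a uint32_t correctly whether it is "unsigned int" or
// "unsigned long" underneath.
std::string format_u32(std::uint32_t value)
{
  char buf[16];
  std::snprintf(buf, sizeof buf, "%" PRIu32, value);
  return std::string(buf);
}

// Same for int32_t.
std::string format_i32(std::int32_t value)
{
  char buf[16];
  std::snprintf(buf, sizeof buf, "%" PRId32, value);
  return std::string(buf);
}
```

That works around the symptom in user code; the patch below is about
not needing the workaround in the first place.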

From 784fb1760a930d0309f878bbae7bfd38137f5689 Mon Sep 17 00:00:00 2001
From: Andy Ross 
Date: Fri, 19 Aug 2016 09:40:42 -0700
Subject: [PATCH] newlib-stdint.h: Remove 32 bit longs

This would make __INT32_TYPE__ a "long" instead of an "int", which
would then percolate down in newlib's own headers into a typedef for
int32_t.  Which is wrong.  Newlib's headers, if __INT32_TYPE__ were
not defined, actually would choose an int for this type.  The comment
that newlib uses a 32 bit long appears to be a lie, perhaps
historical.

Signed-off-by: Andy Ross 
---
 gcc/config/newlib-stdint.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/config/newlib-stdint.h b/gcc/config/newlib-stdint.h
index eb99556..0275948 100644
--- a/gcc/config/newlib-stdint.h
+++ b/gcc/config/newlib-stdint.h
@@ -22,10 +22,9 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */

-/* newlib uses 32-bit long in certain cases for all non-SPU
-   targets.  */
+/* newlib used to use a 32-bit long, no longer */
 #ifndef STDINT_LONG32
-#define STDINT_LONG32 (LONG_TYPE_SIZE == 32)
+#define STDINT_LONG32 0
 #endif

 #define SIG_ATOMIC_TYPE "int"
-- 
2.7.4



Add minimal _FloatN, _FloatNx built-in functions [version 2]

2016-08-19 Thread Joseph Myers
[Version 2 of this patch updates the testcases to use dg-add-options as in 
the final version of the _FloatN patch that went in; there are no changes 
of substance outside the testsuite.  Version 1 was 
.]


This patch adds a minimal set of built-in functions for the new
_FloatN and _FloatNx types.

The functions added are __builtin_fabs*, __builtin_copysign*,
__builtin_huge_val*, __builtin_inf*, __builtin_nan* and
__builtin_nans* (where * = fN or fNx).  That is, 42 new entries are
added to the enum of built-in functions and the associated array of
decls, where not all of them are actually supported on any one target.

These functions are believed to be sufficient for libgcc (complex
multiplication and division use __builtin_huge_val*,
__builtin_copysign* and __builtin_fabs*) and for glibc (which also
depends on complex multiplication from libgcc, as well as using such
functions itself).  The basic target-independent support for folding /
expanding calls to these built-in functions is wired up, so those for
constants can be used in static initializers, and the fabs and
copysign built-ins can always be expanded to bit-manipulation inline
(for any format setting signbit_ro and signbit_rw, which covers all
formats supported for _FloatN and _FloatNx), although insn patterns
for fabs (abs<mode>2) and copysign (copysign<mode>3) will be used when
available and may result in more optimal code.
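
As an illustration of what "bit-manipulation inline" means here, the
following sketch does the same thing by hand for double.  It assumes
an IEEE binary64 double whose sign is the top bit of a 64-bit word;
the function names are made up and this is not the expander code.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

const std::uint64_t SIGN_MASK = 0x8000000000000000ULL;

// Reinterpret a double as its bit pattern and back.
std::uint64_t to_bits(double x)
{
  std::uint64_t b;
  std::memcpy(&b, &x, sizeof b);
  return b;
}

double from_bits(std::uint64_t b)
{
  double x;
  std::memcpy(&x, &b, sizeof x);
  return x;
}

// fabs: clear the sign bit.
double fabs_bits(double x)
{
  return from_bits(to_bits(x) & ~SIGN_MASK);
}

// copysign: magnitude bits from MAG, sign bit from SGN.
double copysign_bits(double mag, double sgn)
{
  return from_bits((to_bits(mag) & ~SIGN_MASK) | (to_bits(sgn) & SIGN_MASK));
}
```

The same masking works for any format with a single sign bit at a
known position, which is why no per-type insn pattern is required.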

The complex multiplication and division functions in libgcc rely on
predefined macros (defined with -fbuilding-libgcc) to say what the
built-in function suffixes to use with a particular mode are.  This
patch updates that code accordingly, where previously it involved a
hack supposing that machine-specific suffixes for constants were also
suffixes for built-in functions.

As with the main _FloatN / _FloatNx patch, this patch does not update
code dealing only with optimizations that currently has cases only
covering float, double and long double, though some such cases are
straightforward and may be covered in a followup patch.

The functions are defined with DEF_GCC_BUILTIN, so calls to the TS
18661-3 functions such as fabsf128 and copysignf128, without the
__builtin_, will not be optimized.  As noted in the original _FloatN /
_FloatNx patch submission, in principle the bulk of the libm functions
that have built-in versions should have those versions extended to
cover the new types, but that would require more consideration of the
effects of increasing the size of the enum and initializing many more
functions at startup.

I don't know whether target-specific built-in functions can readily be
made into aliases for target-independent functions, but if they can,
it would make sense to do so for the x86, ia64 and rs6000 *q functions
corresponding to these, so that they can benefit from the
architecture-independent folding logic and from any optimizations
enabled for these functions in future, and so that less
target-specific code is needed to support them.

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  OK to
commit (the non-C-front-end parts)?

gcc:
2016-08-19  Joseph Myers  

* tree.h (CASE_FLT_FN_FLOATN_NX, float16_type_node)
(float32_type_node, float64_type_node, float32x_type_node)
(float128x_type_node): New macros.
* builtin-types.def (BT_FLOAT16, BT_FLOAT32, BT_FLOAT64)
(BT_FLOAT128, BT_FLOAT32X, BT_FLOAT64X, BT_FLOAT128X)
(BT_FN_FLOAT16, BT_FN_FLOAT32, BT_FN_FLOAT64, BT_FN_FLOAT128)
(BT_FN_FLOAT32X, BT_FN_FLOAT64X, BT_FN_FLOAT128X)
(BT_FN_FLOAT16_FLOAT16, BT_FN_FLOAT32_FLOAT32)
(BT_FN_FLOAT64_FLOAT64, BT_FN_FLOAT128_FLOAT128)
(BT_FN_FLOAT32X_FLOAT32X, BT_FN_FLOAT64X_FLOAT64X)
(BT_FN_FLOAT128X_FLOAT128X, BT_FN_FLOAT16_CONST_STRING)
(BT_FN_FLOAT32_CONST_STRING, BT_FN_FLOAT64_CONST_STRING)
(BT_FN_FLOAT128_CONST_STRING, BT_FN_FLOAT32X_CONST_STRING)
(BT_FN_FLOAT64X_CONST_STRING, BT_FN_FLOAT128X_CONST_STRING)
(BT_FN_FLOAT16_FLOAT16_FLOAT16, BT_FN_FLOAT32_FLOAT32_FLOAT32)
(BT_FN_FLOAT64_FLOAT64_FLOAT64, BT_FN_FLOAT128_FLOAT128_FLOAT128)
(BT_FN_FLOAT32X_FLOAT32X_FLOAT32X)
(BT_FN_FLOAT64X_FLOAT64X_FLOAT64X)
(BT_FN_FLOAT128X_FLOAT128X_FLOAT128X): New type definitions.
* builtins.def (DEF_GCC_FLOATN_NX_BUILTINS): New macro.
(copysign, fabs, huge_val, inf, nan, nans): Use it.
* builtins.c (expand_builtin): Use CASE_FLT_FN_FLOATN_NX for fabs
and copysign.
(fold_builtin_0): Use CASE_FLT_FN_FLOATN_NX for inf and huge_val.
(fold_builtin_1): Use CASE_FLT_FN_FLOATN_NX for fabs.
* doc/extend.texi (Other Builtins): Document these built-in
functions.
* fold-const-call.c (fold_const_call): Use CASE_FLT_FN_FLOATN_NX
for nan and nans.

gcc/c-family:
2016-08-19  Joseph Myers  

* c-family/c-cppbuiltin.c (c_cpp_builtins): Check _FloatN and
_Float

[committed] Reimplement removal fix-it hints in terms of replace

2016-08-19 Thread David Malcolm
This patch eliminates class fixit_remove, reimplementing
rich_location::add_fixit_remove in terms of replacement with the
empty string.  Deleting the removal subclass simplifies
fixit-handling code, as we only have two concrete fixit_hint
subclasses to deal with, rather than three.
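
The idea can be sketched outside of libcpp as follows.  The struct and
function names are hypothetical and columns are 1-based and inclusive;
the real classes work on source locations rather than strings.

```cpp
#include <string>

// A replacement of columns START..FINISH (1-based, inclusive) by TEXT.
struct fixit_replace_sketch
{
  int start;
  int finish;
  std::string text;
};

// A removal is just a replacement with the empty string.
fixit_replace_sketch make_remove(int start, int finish)
{
  return fixit_replace_sketch{start, finish, ""};
}

// Apply HINT to LINE.  Requires 1 <= start and finish <= line.size ().
std::string apply_fixit(const std::string &line,
                        const fixit_replace_sketch &hint)
{
  return line.substr(0, hint.start - 1) + hint.text
         + line.substr(hint.finish);
}
```

With this shape, any code that prints or applies hints only has to
handle insertions and replacements.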

The patch also fixes some problems in diagnostic-show-locus.c for
situations where a replacement fix-it has a different range to the
range of the diagnostic, by unifying the drawing of the two kinds of
fixits.  For example, this:

  foo = bar.field;
  ^
m_field

becomes:

  foo = bar.field;
  ^
-
m_field

showing the range to be replaced.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r239632.

gcc/ChangeLog:
* diagnostic-show-locus.c
(layout::annotation_line_showed_range_p): New method.
(layout::print_any_fixits): Remove case fixit_hint::REMOVE.
Reimplement case fixit_hint::REPLACE to cover removals, and
replacements where the range of the replacement isn't one
of the ranges in the rich_location.
(test_one_liner_fixit_replace): Likewise.
(selftest::test_one_liner_fixit_replace_non_equal_range): New
function.
(selftest::test_one_liner_fixit_replace_equal_secondary_range):
New function.
(selftest::test_diagnostic_show_locus_one_liner): Call the new
functions.
* diagnostic.c (print_parseable_fixits): Remove case
fixit_hint::REMOVE.

libcpp/ChangeLog:
* include/line-map.h (fixit_hint::kind): Delete REMOVE.
(class fixit_remove): Delete.
* line-map.c (rich_location::add_fixit_remove): Reimplement
by calling add_fixit_replace with an empty string.
(fixit_remove::fixit_remove): Delete.
(fixit_remove::affects_line_p): Delete.
---
 gcc/diagnostic-show-locus.c | 124 +---
 gcc/diagnostic.c|   4 --
 libcpp/include/line-map.h   |  23 +---
 libcpp/line-map.c   |  18 +--
 4 files changed, 106 insertions(+), 63 deletions(-)

diff --git a/gcc/diagnostic-show-locus.c b/gcc/diagnostic-show-locus.c
index 4498f7c..32b1078 100644
--- a/gcc/diagnostic-show-locus.c
+++ b/gcc/diagnostic-show-locus.c
@@ -199,6 +199,8 @@ class layout
 
   bool print_source_line (int row, line_bounds *lbounds_out);
   void print_annotation_line (int row, const line_bounds lbounds);
+  bool annotation_line_showed_range_p (int line, int start_column,
+  int finish_column) const;
   void print_any_fixits (int row, const rich_location *richloc);
 
   void show_ruler (int max_column) const;
@@ -1053,6 +1055,26 @@ layout::print_annotation_line (int row, const line_bounds lbounds)
   print_newline ();
 }
 
+/* Subroutine of layout::print_any_fixits.
+
+   Determine if the annotation line printed for LINE contained
+   the exact range from START_COLUMN to FINISH_COLUMN.  */
+
+bool
+layout::annotation_line_showed_range_p (int line, int start_column,
+   int finish_column) const
+{
+  layout_range *range;
+  int i;
+  FOR_EACH_VEC_ELT (m_layout_ranges, i, range)
+if (range->m_start.m_line == line
+   && range->m_start.m_column == start_column
+   && range->m_finish.m_line == line
+   && range->m_finish.m_column == finish_column)
+  return true;
+  return false;
+}
+
 /* If there are any fixit hints on source line ROW within RICHLOC, print them.
They are printed in order, attempting to combine them onto lines, but
starting new lines if necessary.  */
@@ -1083,33 +1105,39 @@ layout::print_any_fixits (int row, const rich_location *richloc)
  }
  break;
 
-   case fixit_hint::REMOVE:
+   case fixit_hint::REPLACE:
  {
-   fixit_remove *remove = static_cast <fixit_remove *> (hint);
-   /* This assumes the removal just affects one line.  */
-   source_range src_range = remove->get_range ();
+   fixit_replace *replace = static_cast <fixit_replace *> (hint);
+   source_range src_range = replace->get_range ();
+   int line = LOCATION_LINE (src_range.m_start);
int start_column = LOCATION_COLUMN (src_range.m_start);
int finish_column = LOCATION_COLUMN (src_range.m_finish);
-   move_to_column (&column, start_column);
-   for (int column = start_column; column <= finish_column; column++)
+
+   /* If the range of the replacement wasn't printed in the
+  annotation line, then print an extra underline to
+  indicate exactly what is being replaced.
+  Always show it for removals.  */
+   if (!annotation_line_showed_range_p (line, start_column,
+finish_column)
+   || replace->get_length () == 0)
   

[PATCH], Patch #5, Improve vector int initialization on PowerPC

2016-08-19 Thread Michael Meissner
This is a rewrite of patch #3 to improve vector int initialization on the
PowerPC 64-bit systems with direct move (power8, and forthcoming power9).

This patch adds full support for doing vector int initialization in the GPR and
vector registers, rather than creating a stack temporary, doing 4 stores, and
then a vector load (including having an interlock due to having different sizes
of stores vs. loads being done).
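
As a rough model of the GPR side of this, the sketch below packs four
32-bit elements into two 64-bit values, which is the shape of data that
can then be moved to a vector register directly.  It is only an
illustration: the names are made up, and which element lands in which
half is an assumption here (it depends on endianness in the real code).

```cpp
#include <cstdint>

// Combine two 32-bit values into one 64-bit value, HIGH in the upper half.
std::uint64_t pack_pair(std::uint32_t high, std::uint32_t low)
{
  return (static_cast<std::uint64_t>(high) << 32) | low;
}

// The two 64-bit halves of a 4 x 32-bit vector image.
struct v4si_image
{
  std::uint64_t first;
  std::uint64_t second;
};

// Build the image from four elements without going through memory.
v4si_image build_v4si_image(std::uint32_t e0, std::uint32_t e1,
                            std::uint32_t e2, std::uint32_t e3)
{
  v4si_image img;
  img.first = pack_pair(e0, e1);
  img.second = pack_pair(e2, e3);
  return img;
}
```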

In addition to the vector int initialization changes, I separated vector int
from vector float initialization insns.  In looking at vector float, I noticed
that there were places that used the old preferred register class mechanism
that was never used, and I eliminated the preferred register class
alternatives.  I also noticed that the scalar alternatives for float were not
modified to allow float scalar variables to be in Altivec registers.

Finally, in editing the code, I noticed that we were using an explicit XOR to
initialize a register to all 0's.  I changed this to set the vector to
CONST0_RTX (), which mirrors similar changes I made on May 15th on the
normal vector moves.

I ran all of the Spec 2006 benchmark suite that I normally run and there were
no significant timing differences between using this patch and the base
compiler.  Originally there was a regression in tonto, but it was fixed when
Alan's patch on August 18th was applied to the trunk.

I wrote a program that did a lot of vector initializations and some simple
vector adds, and it is 5.7% faster for vector initialization of 4 independent
variables, and 7.7% faster if all of the elements are the same.

I have built bootstrap compilers and have run make check on these patches on a
big endian Power8 system and a little endian Power8 system with no regressions.
Previous versions of the patch did bootstrap and had no regressions on a big
endian Power7 system.  Are these patches ok to install on the trunk?

[gcc]
2016-08-19  Michael Meissner  

* config/rs6000/rs6000-protos.h (rs6000_split_v4si_init): Add
declaration.
* config/rs6000/rs6000.c (rs6000_expand_vector_init): Set
initialization of all 0's to the 0 constant, instead of directly
generating XOR.  Add support for V4SImode vector initialization on
64-bit systems with direct move, and rework the ISA 3.0 V4SImode
initialization.  Change variables used in V4SFmode vector
initialization.  For V4SFmode vector splat on ISA 3.0, make sure
any memory addresses are in index form.
(regno_or_subregno): New helper function to return a register
number for either REG or SUBREG.
(rs6000_adjust_vec_address): Do not generate ADDI ,R0,.
Use regno_or_subregno where possible.
(rs6000_split_v4si_init_di_reg): New helper function to build up a
DImode value from two SImode values in order to generate V4SImode
vector initialization on 64-bit systems with direct move.
(rs6000_split_v4si_init): Split up the insns for a V4SImode vector
initialization.
(rtx_is_swappable_p): V4SImode vector initialization insn is not
swappable.
* config/rs6000/vsx.md (UNSPEC_VSX_VEC_INIT): New unspec.
(vsx_concat_v2sf): Eliminate using 'preferred' register classes.
Allow SFmode values to come from Altivec registers.
(vsx_init_v4si): New insn/split for V4SImode vector initialization
on 64-bit systems with direct move.
(vsx_splat_<mode>, VSX_W iterator): Rework V4SImode and V4SFmode
vector initializations, to allow V4SImode vector initializations
on 64-bit systems with direct move.
(vsx_splat_v4si): Likewise.
(vsx_splat_v4si_di): Likewise.
(vsx_splat_v4sf): Likewise.
(vsx_splat_v4sf_internal): Likewise.
(vsx_xxspltw_<mode>, VSX_W iterator): Eliminate using 'preferred'
register classes.
(vsx_xxspltw_<mode>_direct, VSX_W iterator): Likewise.
* config/rs6000/rs6000.h (TARGET_DIRECT_MOVE_64BIT): Disallow
optimization if -maltivec=be.

[gcc/testsuite]
2016-08-19  Michael Meissner  

* gcc.target/powerpc/vec-init-1.c: Add tests where the vector is
being created from pointers to memory locations.
* gcc.target/powerpc/vec-init-2.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   
(.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
(revision 239554)
+++ gcc/config/rs6000/rs6000-protos.h   (.../gcc/config/rs6000) (working copy)
@@ -65,6 +65,7 @@ extern void rs6000_expand_vector_set (rt
 extern void rs6000_expand_vector_extract (rtx, rtx, rtx);
 extern void rs6000_split_vec_extract_var (rtx, rtx, rtx, rtx, rtx);
 extern rtx rs6000_adjust_vec_address (rtx, rt

[PATCH] tree-optimization/71831 - __builtin_object_size poor results with no optimization

2016-08-19 Thread Martin Sebor

As requested in the review of the following patch

  https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01363.html

attached is the small enhancement to compute_builtin_object_size to
make the function usable even without optimization without the full
overhead of the tree-object-size pass.  The enhancement (disabled
when optimization is enabled so as not to change the results there)
is relied on by the -Wformat-length patch.

The patch looks bigger than it actually is because:

1) It modifies the return type of the function to bool rather than
   unsigned HOST_WIDE_INT representing the object size (this was
   necessary to avoid having its callers misinterpret zero as
   unknown when it means zero bytes).

2) As a result of a small change to the conditional that controls
   the main algorithm of the compute_builtin_object_size function
   it changes the depth of its indentation (without actually
   changing any of the code there).
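
To illustrate point 1, here is a small stand-alone sketch of the
ambiguity and of the new calling convention.  The types and function
names are hypothetical stand-ins, not the tree-object-size.c
interfaces; the sentinel rule mirrors the (object_size_type < 2 ? -1 : 0)
check that the callers used to make.

```cpp
#include <cstddef>

// Hypothetical stand-in for an object whose size may or may not be known.
struct object_info
{
  bool size_known;
  std::size_t size;
};

// Old convention: return the size, or a sentinel when it is unknown.
// For object size types 2 and 3 the sentinel is 0, which is also a
// valid size.
std::size_t size_or_sentinel(const object_info &obj, int object_size_type)
{
  if (!obj.size_known)
    return object_size_type < 2 ? static_cast<std::size_t>(-1) : 0;
  return obj.size;
}

// New convention: report success separately and store the size in *PSIZE.
bool compute_size(const object_info &obj, std::size_t *psize)
{
  if (!obj.size_known)
    return false;
  *psize = obj.size;
  return true;
}
```

With the first function a known zero-byte object and an unknown one
are indistinguishable for types 2 and 3; with the second they are not.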

Martin
PR tree-optimization/71831 - __builtin_object_size poor results with no
	optimization

gcc/testsuite/ChangeLog:
2016-08-19  Martin Sebor  

	PR tree-optimization/71831
	* gcc.dg/builtin-object-size-16.c: New test.

gcc/ChangeLog:
2016-08-19  Martin Sebor  

	PR tree-optimization/71831
	* tree-object-size.h: Return bool instead of the size and add
	argument for the size.
	* tree-object-size.c (compute_object_offset): Update signature.
	(addr_object_size): Same.
	(compute_builtin_object_size): Return bool instead of the size
	and add argument for the size.  Handle POINTER_PLUS_EXPR when
	optimization is disabled.
	(expr_object_size): Adjust.
	(plus_stmt_object_size): Adjust.
	(pass_object_sizes::execute): Adjust.
	* builtins.c (fold_builtin_object_size): Adjust.
	* doc/extend.texi (Object Size Checking): Update.
	* ubsan.c (instrument_object_size): Adjust.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 03a0dc8..5d0c1af 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -9610,7 +9610,7 @@ fold_builtin_object_size (tree ptr, tree ost)
 
   if (TREE_CODE (ptr) == ADDR_EXPR)
 {
-  bytes = compute_builtin_object_size (ptr, object_size_type);
+  compute_builtin_object_size (ptr, object_size_type, &bytes);
   if (wi::fits_to_tree_p (bytes, size_type_node))
 	return build_int_cstu (size_type_node, bytes);
 }
@@ -9619,9 +9619,8 @@ fold_builtin_object_size (tree ptr, tree ost)
   /* If object size is not known yet, delay folding until
later.  Maybe subsequent passes will help determining
it.  */
-  bytes = compute_builtin_object_size (ptr, object_size_type);
-  if (bytes != (unsigned HOST_WIDE_INT) (object_size_type < 2 ? -1 : 0)
-  && wi::fits_to_tree_p (bytes, size_type_node))
+  if (compute_builtin_object_size (ptr, object_size_type, &bytes)
+	  && wi::fits_to_tree_p (bytes, size_type_node))
 	return build_int_cstu (size_type_node, bytes);
 }
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5285e00..6cf64f5 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10009,8 +10009,15 @@ __atomic_store_n(&lockvar, 0, __ATOMIC_RELEASE|__ATOMIC_HLE_RELEASE);
 @findex __builtin___fprintf_chk
 @findex __builtin___vfprintf_chk
 
-GCC implements a limited buffer overflow protection mechanism
-that can prevent some buffer overflow attacks.
+GCC implements a limited buffer overflow protection mechanism that can
+prevent some buffer overflow attacks by determining the sizes of objects
+into which data is about to be written and preventing the writes when
+the size isn't sufficient.  The built-in functions described below yield
+the best results when used together and when optimization is enabled.
+For example, to detect object sizes across function boundaries or to
+follow pointer assignments through non-trivial control flow they rely
+on various optimization passes enabled with @option{-O2}.  However, to
+a limited extent, they can be used without optimization as well.
 
 @deftypefn {Built-in Function} {size_t} __builtin_object_size (void * @var{ptr}, int @var{type})
 is a built-in construct that returns a constant number of bytes from
diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-16.c b/gcc/testsuite/gcc.dg/builtin-object-size-16.c
new file mode 100644
index 000..c2dbe76
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-16.c
@@ -0,0 +1,195 @@
+/* PR 71831 - __builtin_object_size poor results with no optimization
+   Verify that even without optimization __builtin_object_size returns
+   a meaningful result for a subset of simple expressions.  In cases
+   where the result could not easily be made to match the one obtained
+   with optimization the built-in was made to fail instead.  */
+/* { dg-do run } */
+/* { dg-options "-O0" } */
+
+static int nfails;
+
+#define TEST_FAILURE(line, obj, type, expect, result)		\
+  __builtin_printf ("FAIL: line %i: __builtin_object_size("	\
+		#obj ", %i) == %zu, got %zu\n",		\
+		line, type, expect, result), ++nfails
+
+#define bos(obj, type

Re: [Patch] Disable text mode translation in ada for Cygwin

2016-08-19 Thread JonY
On 8/19/2016 20:49, Arnaud Charlet wrote:
>>> Text mode translation should not be done for Cygwin, especially since it
>>> does not
>>> support unicode setmode calls. This also fixes ada builds for Cygwin.
>>>
>>> OK for trunk?
>>
>> Ping?
> 
> Can you send the link to your original submission for easy retrieval?
> 
> Arno
> 

Bottom of the page:
https://patchwork.ozlabs.org/patch/626650/






Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread Patrick Palka
On Fri, 19 Aug 2016, Yuri Rumyantsev wrote:

> Hi,
> 
> Here is a simple test-case to reproduce 176.gcc failure (I run it on
> Haswell machine).
> Using 20160819 compiler build we get:
> gcc -O3 -m32 -mavx2 test.c -o test.ref.exe
> /users/ysrumyan/isse_6866$ ./test.ref.exe
> Aborted (core dumped)
> 
> If I apply patch proposed by Patrick test runs properly
> Instead of running we can check number of .jump thread.

Thanks!  With this test case I was able to identify the cause of the
wrong code generation.

Turns out that the problem lies in fold-const.c.  It does not correctly
fold VECTOR_CST comparisons that have a scalar boolean result type.  In
particular fold_binary folds the comparison {1,1,1} != {0,0,0} to false
which causes the threader to register an incorrect jump thread.  In
general VEC1 != VEC2 gets folded as if it were VEC1 == VEC2.

This patch fixes the problematic folding logic.
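
As a rough illustration of the failure, the following standalone sketch models the old logic on plain integer vectors.  It is not the fold-const.c code: cmp_code, fold_old and fold_expected are hypothetical names, and fold_expected is only my reading of the semantics one would want (two vectors are equal iff all elements are equal, and != is the negation of that).

```cpp
#include <cstddef>
#include <vector>

enum cmp_code { EQ, NE };

// Model of the faulty logic: AND together the per-element results of
// CODE, then invert the accumulated result for NE.
bool
fold_old (cmp_code code, const std::vector<int> &a,
          const std::vector<int> &b)
{
  bool result = true;
  for (std::size_t i = 0; i < a.size (); i++)
    {
      bool elem = (code == EQ) ? (a[i] == b[i]) : (a[i] != b[i]);
      result = result && elem;
    }
  if (code == NE)
    result = !result;
  return result;
}

// Expected semantics (an assumption of this sketch): compare elements
// for equality regardless of CODE, and negate only at the end for NE.
bool
fold_expected (cmp_code code, const std::vector<int> &a,
               const std::vector<int> &b)
{
  bool all_equal = true;
  for (std::size_t i = 0; i < a.size (); i++)
    if (a[i] != b[i])
      {
        all_equal = false;
        break;
      }
  return code == EQ ? all_equal : !all_equal;
}
```

In this model fold_old (NE, {1,1,1}, {0,0,0}) yields false, which matches the wrong result described above, while fold_expected yields true.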

Does this look OK to commit after bootstrap + regtesting?  The faulty
logic was introduced by the fix for PR68542 (r232518) so I think it is
present on the 6 branch as well.

gcc/ChangeLog:

PR tree-optimization/71044
PR tree-optimization/68542
* fold-const.c (fold_relational_const): Fix folding of
VECTOR_CST comparisons that have a scalar boolean result type.
(test_vector_folding): New static function.
(fold_const_c_tests): Call it.
---
 gcc/fold-const.c | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 30c1e0d..0271ac3 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -13897,7 +13897,6 @@ fold_relational_const (enum tree_code code, tree type, tree op0, tree op1)
   if (!VECTOR_TYPE_P (type))
{
  /* Have vector comparison with scalar boolean result.  */
- bool result = true;
  gcc_assert ((code == EQ_EXPR || code == NE_EXPR)
  && VECTOR_CST_NELTS (op0) == VECTOR_CST_NELTS (op1));
  for (unsigned i = 0; i < VECTOR_CST_NELTS (op0); i++)
@@ -13905,11 +13904,12 @@ fold_relational_const (enum tree_code code, tree type, tree op0, tree op1)
  tree elem0 = VECTOR_CST_ELT (op0, i);
  tree elem1 = VECTOR_CST_ELT (op1, i);
  tree tmp = fold_relational_const (code, type, elem0, elem1);
- result &= integer_onep (tmp);
+ if (tmp == NULL_TREE)
+   return NULL_TREE;
+ if (integer_zerop (tmp))
+   return constant_boolean_node (false, type);
}
- if (code == NE_EXPR)
-   result = !result;
- return constant_boolean_node (result, type);
+ return constant_boolean_node (true, type);
}
   unsigned count = VECTOR_CST_NELTS (op0);
   tree *elts =  XALLOCAVEC (tree, count);
@@ -14517,12 +14517,32 @@ test_arithmetic_folding ()
   x);
 }
 
+/* Verify that various binary operations on vectors are folded
+   correctly.  */
+
+static void
+test_vector_folding ()
+{
+  tree inner_type = integer_type_node;
+  tree type = build_vector_type (inner_type, 4);
+  tree zero = build_zero_cst (type);
+  tree one = build_one_cst (type);
+
+  /* Verify equality tests that return a scalar boolean result.  */
+  tree res_type = boolean_type_node;
+  ASSERT_FALSE (integer_nonzerop (fold_build2 (EQ_EXPR, res_type, zero, one)));
+  ASSERT_TRUE (integer_nonzerop (fold_build2 (EQ_EXPR, res_type, zero, zero)));
+  ASSERT_TRUE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, zero, one)));
+  ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type, one, one)));
+}
+
 /* Run all of the selftests within this file.  */
 
 void
 fold_const_c_tests ()
 {
   test_arithmetic_folding ();
+  test_vector_folding ();
 }
 
 } // namespace selftest
-- 
2.9.3.650.g20ba99f



[PATCH] Improve readability of debug_tree() dumps for SSA_NAME and VECTOR_CST

2016-08-19 Thread Patrick Palka
* For SSA_NAME: Print the ssa name's def stmt on its own line.

Before:
 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7688a888 
precision 32 min  max  context 
pointer_to_this >
unsigned V8SI
size 
unit size 
align 256 symtab 0 alias set 1 canonical type 0x76a09bd0 nunits 8
pointer_to_this >
visited var def_stmt vect__2.15_101 = 
vect__1.14_99 & { 1, 1, 1, 1, 1, 1, 1, 1 };

version 101>

After:
(gdb) print debug_tree (op0)
 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7688a888 
precision 32 min  max  context 
pointer_to_this >
unsigned V8SI
size 
unit size 
align 256 symtab 0 alias set 1 canonical type 0x76a09bd0 nunits 8
pointer_to_this >
visited var 
def_stmt vect__2.15_101 = vect__1.14_99 & { 1, 1, 1, 1, 1, 1, 1, 1 };

version 101>

* For VECTOR_CST: Coalesce the output of identical consecutive elt values.

Before:
 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7688a888 
precision 32 min  max  context 
pointer_to_this >
unsigned V8SI
size 
unit size 
align 256 symtab 0 alias set 1 canonical type 0x76a09bd0 nunits 8
pointer_to_this >
constant
elt0:   
constant 0> elt1:   elt2:   elt3:   elt4:   elt5:   elt6:   elt7:  >

After:
(gdb) print debug_tree (op1)
 
unit size 
align 32 symtab 0 alias set -1 canonical type 0x7688a888 
precision 32 min  max  context 
pointer_to_this >
unsigned V8SI
size 
unit size 
align 256 symtab 0 alias set 1 canonical type 0x76a09bd0 nunits 8
pointer_to_this >
constant
elt0...elt7:   constant 0>>

(I also tested the change on non-uniform VECTOR_CSTs.)
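
For reference, the coalescing can be sketched outside of GCC as follows.  This is a simplified model, not the print-tree.c code: coalesced_labels is a hypothetical helper, and it compares element values, whereas the patch compares the element trees themselves.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Return one label per run of identical consecutive elements:
// "eltI" for a run of length one, "eltI...eltJ" for a longer run.
std::vector<std::string>
coalesced_labels (const std::vector<int> &elts)
{
  std::vector<std::string> labels;
  for (std::size_t i = 0; i < elts.size (); ++i)
    {
      // Find the first index past the run that starts at I.
      std::size_t j = i + 1;
      while (j < elts.size () && elts[j] == elts[i])
        ++j;
      --j;  // J is now the last index of the run.
      if (i == j)
        labels.push_back ("elt" + std::to_string (i));
      else
        labels.push_back ("elt" + std::to_string (i)
                          + "...elt" + std::to_string (j));
      i = j;  // Skip the rest of the run.
    }
  return labels;
}
```

A uniform eight-element vector gives the single label "elt0...elt7"; a non-uniform vector gives one label per run.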

Does this look OK to commit after bootstrap + regtesting?

gcc/ChangeLog:

* print-tree.c (print_node) [VECTOR_CST]: Coalesce the output of
identical consecutive elt values.
[SSA_NAME]: Print the name's def stmt on its own line.
---
 gcc/print-tree.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index 468f1ff..c87f901 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -770,8 +770,17 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
 
for (i = 0; i < VECTOR_CST_NELTS (node); ++i)
  {
-   sprintf (buf, "elt%u: ", i);
+   unsigned j;
+   for (j = i + 1; j < VECTOR_CST_NELTS (node); j++)
+ if (VECTOR_CST_ELT (node, j) != VECTOR_CST_ELT (node, i))
+   break;
+   j--;
+   if (i == j)
+ sprintf (buf, "elt%u: ", i);
+   else
+ sprintf (buf, "elt%u...elt%u: ", i, j);
print_node (file, buf, VECTOR_CST_ELT (node, i), indent + 4);
+   i = j;
  }
  }
  break;
@@ -869,6 +878,7 @@ print_node (FILE *file, const char *prefix, tree node, int indent)
 
case SSA_NAME:
  print_node_brief (file, "var", SSA_NAME_VAR (node), indent + 4);
+ indent_to (file, indent + 4);
  fprintf (file, "def_stmt ");
  print_gimple_stmt (file, SSA_NAME_DEF_STMT (node), indent + 4, 0);
 
-- 
2.9.3.650.g20ba99f



[PATCH] Handle VECTOR_CST in integer_nonzerop()

2016-08-19 Thread Patrick Palka
integer_nonzerop() currently unconditionally returns false for a
VECTOR_CST argument.  This is confusing because one would expect that
integer_onep(x) => integer_nonzerop(x) for all x but that is currently
not the case.  For a VECTOR_CST of all ones i.e. {1,1,1,1},
integer_onep() returns true but integer_nonzerop() returns false.

This patch makes integer_nonzerop() handle VECTOR_CSTs in the obvious
way and also adds some self tests (the last of which fails without the
change).  Does this look OK to commit after bootstrap + regtesting on
x86_64-pc-linux-gnu?

gcc/ChangeLog:

* tree.c (integer_nonzerop): Rewrite to use a switch.  Handle
VECTOR_CSTs.
(test_vector_constants): New static function.
(tree_c_tests): Call it.
---
 gcc/tree.c | 43 ++-
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/gcc/tree.c b/gcc/tree.c
index 33e6f97..48795c7 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -2439,11 +2439,24 @@ integer_pow2p (const_tree expr)
 int
 integer_nonzerop (const_tree expr)
 {
-  return ((TREE_CODE (expr) == INTEGER_CST
-  && !wi::eq_p (expr, 0))
- || (TREE_CODE (expr) == COMPLEX_CST
- && (integer_nonzerop (TREE_REALPART (expr))
- || integer_nonzerop (TREE_IMAGPART (expr)))));
+  switch (TREE_CODE (expr))
+{
+case INTEGER_CST:
+  return !wi::eq_p (expr, 0);
+case COMPLEX_CST:
+  return (integer_nonzerop (TREE_REALPART (expr))
+ || integer_nonzerop (TREE_IMAGPART (expr)));
+case VECTOR_CST:
+  {
+   unsigned i;
+   for (i = 0; i < VECTOR_CST_NELTS (expr); ++i)
+ if (integer_nonzerop (VECTOR_CST_ELT (expr, i)))
+   return true;
+   return false;
+  }
+default:
+  return false;
+}
 }
 
 /* Return 1 if EXPR is the integer constant one.  For vector,
@@ -14230,6 +14243,25 @@ test_integer_constants ()
   ASSERT_EQ (type, TREE_TYPE (zero));
 }
 
+/* Verify various predicates and operations on vector constants.  */
+
+static void
+test_vector_constants ()
+{
+  tree inner_type = integer_type_node;
+  tree type = build_vector_type (inner_type, 8);
+  tree zero = build_zero_cst (type);
+  tree one = build_one_cst (type);
+
+  ASSERT_TRUE (integer_zerop (zero));
+  ASSERT_FALSE (integer_onep (zero));
+  ASSERT_FALSE (integer_nonzerop (zero));
+
+  ASSERT_FALSE (integer_zerop (one));
+  ASSERT_TRUE (integer_onep (one));
+  ASSERT_TRUE (integer_nonzerop (one));
+}
+
 /* Verify identifiers.  */
 
 static void
@@ -14258,6 +14290,7 @@ void
 tree_c_tests ()
 {
   test_integer_constants ();
+  test_vector_constants ();
   test_identifiers ();
   test_labels ();
 }
-- 
2.9.3.650.g20ba99f



Re: [PATCH] newlib-stdint.h: Remove 32 bit longs

2016-08-19 Thread Andrew Pinski
On Fri, Aug 19, 2016 at 12:16 PM, Andy Ross  wrote:
> We ran into this issue in the Zephyr project with our toolchain (gcc
> built with --enable-newlib).  Basically GCC appears to be honoring a
> legacy requirement to give newlib a "long" instead of "int" for
> __INT32_TYPE__, which then leaks out through the current newlib
> headers as a long-valued int32_t, which produces gcc warnings when a
> uint32_t is passed to an unqualified printf format specifier like
> "%d".
>
> But the newlib headers, if __INT32_TYPE__ is *not* defined by the
> compiler, have code to chose a "int" over "long" immediately
> thereafter.  It seems whatever requirement this was honoring isn't
> valid anymore.

A couple of things are missing here.  The first is a ChangeLog entry.
The second is that it seems the Zephyr project should be using the
PRIdN macros (e.g. PRId32) instead of plain %d for int32_t, to be portable.
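
For what it's worth, a minimal sketch of the portable form, using the standard format macros from <cinttypes> (format_i32 and format_u32 are just illustrative helper names):

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format V correctly whether int32_t is a typedef for int or for long.
std::string
format_i32 (std::int32_t v)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "%" PRId32, v);
  return std::string (buf);
}

// Same for the unsigned type.
std::string
format_u32 (std::uint32_t v)
{
  char buf[32];
  std::snprintf (buf, sizeof buf, "%" PRIu32, v);
  return std::string (buf);
}
```

The macros expand to whatever conversion specifier matches the underlying type, so the format string stays warning-free on either kind of target.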

Thanks,
Andrew


>
> From 784fb1760a930d0309f878bbae7bfd38137f5689 Mon Sep 17 00:00:00 2001
> From: Andy Ross 
> Date: Fri, 19 Aug 2016 09:40:42 -0700
> Subject: [PATCH] newlib-stdint.h: Remove 32 bit longs
>
> This would make __INT32_TYPE__ a "long" instead of an "int", which
> would then percolate down in newlib's own headers into a typedef for
> int32_t.  Which is wrong.  Newlib's headers, if __INT32_TYPE__ were
> not defined, actually would chose an int for this type.  The comment
> that newlib uses a 32 bit long appears to be a lie, perhaps
> historical.
>
> Signed-off-by: Andy Ross 
> ---
>  gcc/config/newlib-stdint.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/newlib-stdint.h b/gcc/config/newlib-stdint.h
> index eb99556..0275948 100644
> --- a/gcc/config/newlib-stdint.h
> +++ b/gcc/config/newlib-stdint.h
> @@ -22,10 +22,9 @@ a copy of the GCC Runtime Library Exception along with 
> this program;
>  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  .  */
>
> -/* newlib uses 32-bit long in certain cases for all non-SPU
> -   targets.  */
> +/* newlib used to use a 32-bit long, no longer */
>  #ifndef STDINT_LONG32
> -#define STDINT_LONG32 (LONG_TYPE_SIZE == 32)
> +#define STDINT_LONG32 0
>  #endif
>
>  #define SIG_ATOMIC_TYPE "int"
> --
> 2.7.4
>


Re: [PATCH] Handle VECTOR_CST in integer_nonzerop()

2016-08-19 Thread Patrick Palka
On Fri, Aug 19, 2016 at 7:30 PM, Patrick Palka  wrote:
> integer_nonzerop() currently unconditionally returns false for a
> VECTOR_CST argument.  This is confusing because one would expect that
> integer_onep(x) => integer_nonzerop(x) for all x but that is currently
> not the case.  For a VECTOR_CST of all ones i.e. {1,1,1,1},
> integer_onep() returns true but integer_nonzerop() returns false.
>
> This patch makes integer_nonzerop() handle VECTOR_CSTs in the obvious
> way and also adds some self tests (the last of which fails without the
> change).  Does this look OK to commit after bootstrap + regtesting on
> x86_64-pc-linux-gnu?

Actually I guess there is some ambiguity as to whether
integer_nonzerop() should return true for a VECTOR_CST only if it has
at least one non-zero element or only if all of its elements are
non-zero...
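
To make the two readings concrete, here is a small sketch on plain integer vectors.  any_nonzero and all_nonzero are hypothetical names, not GCC functions; the posted patch corresponds to the first one.

```cpp
#include <cstddef>
#include <vector>

// Reading 1: true if at least one element is non-zero.
bool
any_nonzero (const std::vector<int> &elts)
{
  for (std::size_t i = 0; i < elts.size (); ++i)
    if (elts[i] != 0)
      return true;
  return false;
}

// Reading 2: true only if every element is non-zero.
// Note that this returns true for an empty vector.
bool
all_nonzero (const std::vector<int> &elts)
{
  for (std::size_t i = 0; i < elts.size (); ++i)
    if (elts[i] == 0)
      return false;
  return true;
}
```

The two agree on all-zeros and all-ones vectors, which is all the new self test checks; they differ on something like {1,0,0,0}.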

>
> gcc/ChangeLog:
>
> * tree.c (integer_nonzerop): Rewrite to use a switch.  Handle
> VECTOR_CSTs.
> (test_vector_constants): New static function.
> (tree_c_tests): Call it.
> ---
>  gcc/tree.c | 43 ++-
>  1 file changed, 38 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 33e6f97..48795c7 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -2439,11 +2439,24 @@ integer_pow2p (const_tree expr)
>  int
>  integer_nonzerop (const_tree expr)
>  {
> -  return ((TREE_CODE (expr) == INTEGER_CST
> -  && !wi::eq_p (expr, 0))
> - || (TREE_CODE (expr) == COMPLEX_CST
> - && (integer_nonzerop (TREE_REALPART (expr))
> - || integer_nonzerop (TREE_IMAGPART (expr)))));
> +  switch (TREE_CODE (expr))
> +{
> +case INTEGER_CST:
> +  return !wi::eq_p (expr, 0);
> +case COMPLEX_CST:
> +  return (integer_nonzerop (TREE_REALPART (expr))
> + || integer_nonzerop (TREE_IMAGPART (expr)));
> +case VECTOR_CST:
> +  {
> +   unsigned i;
> +   for (i = 0; i < VECTOR_CST_NELTS (expr); ++i)
> + if (integer_nonzerop (VECTOR_CST_ELT (expr, i)))
> +   return true;
> +   return false;
> +  }
> +default:
> +  return false;
> +}
>  }
>
>  /* Return 1 if EXPR is the integer constant one.  For vector,
> @@ -14230,6 +14243,25 @@ test_integer_constants ()
>ASSERT_EQ (type, TREE_TYPE (zero));
>  }
>
> +/* Verify various predicates and operations on vector constants.  */
> +
> +static void
> +test_vector_constants ()
> +{
> +  tree inner_type = integer_type_node;
> +  tree type = build_vector_type (inner_type, 8);
> +  tree zero = build_zero_cst (type);
> +  tree one = build_one_cst (type);
> +
> +  ASSERT_TRUE (integer_zerop (zero));
> +  ASSERT_FALSE (integer_onep (zero));
> +  ASSERT_FALSE (integer_nonzerop (zero));
> +
> +  ASSERT_FALSE (integer_zerop (one));
> +  ASSERT_TRUE (integer_onep (one));
> +  ASSERT_TRUE (integer_nonzerop (one));
> +}
> +
>  /* Verify identifiers.  */
>
>  static void
> @@ -14258,6 +14290,7 @@ void
>  tree_c_tests ()
>  {
>test_integer_constants ();
> +  test_vector_constants ();
>test_identifiers ();
>test_labels ();
>  }
> --
> 2.9.3.650.g20ba99f
>


[PATCH], Patch #6, Improve vector short/char splat initialization on PowerPC

2016-08-19 Thread Michael Meissner
This patch is a follow up to patch #5.  It adds the support to use the Altivec
VSPLTB/VSPLTH instructions if you are creating a vector char or vector short
where each element is the same (but not constant) on 64-bit systems with direct
move.
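
As a rough example of the kind of source this targets, the following uses the GCC vector extension (the vector_size attribute) to build a vector whose elements are all the same non-constant value.  The function names are made up, and whether a particular compiler emits VSPLTH/VSPLTB for it depends on the target and options; the sketch only shows the source-level pattern.

```cpp
// 16-byte vectors of 8 shorts and 16 chars (GCC vector extension).
typedef short v8hi __attribute__ ((vector_size (16)));
typedef signed char v16qi __attribute__ ((vector_size (16)));

// Every element is the same run-time value X: a "splat".
v8hi
splat_short (short x)
{
  v8hi v = { x, x, x, x, x, x, x, x };
  return v;
}

v16qi
splat_char (signed char x)
{
  v16qi v = { x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x };
  return v;
}
```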

The patch has been part of the larger set of patches for vector initialization
that I've been testing for a while.  Most of those patches were submitted in
patch #5, and in this patch (#6).

There are a few patches remaining that cause a 4% performance degradation in
the zeusmp benchmark (everything else with the larger set of patches is about
the same performance).  I built and ran zeusmp, and these particular patches do
not cause the degradation.  I will submit a full run over the weekend just to
be sure.

I tested these patches on a big endian Power8 system and a little endian Power8
system, and previous versions have run on a big endian Power7 system.  There
were no regressions caused by these patches.  Can I install these patches in
the GCC 7 trunk after the patches in patch #5 are installed?

[gcc]
2016-08-19  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_expand_vector_init): Add support
for using VSPLTH/VSPLTB to initialize vector short and vector char
vectors with all of the same element.

* config/rs6000/vsx.md (VSX_SPLAT_I): New mode iterators and
attributes to initialize V8HImode and V16QImode vectors with the
same element.
(VSX_SPLAT_COUNT): Likewise.
(VSX_SPLAT_SUFFIX): Likewise.
(vsx_vsplt_di): New insns to support
initializing V8HImode and V16QImode vectors with the same
element.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 239627)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6827,6 +6827,32 @@ rs6000_expand_vector_init (rtx target, r
   return;
 }
 
+  /* Special case initializing vector short/char that are splats if we are on
+ 64-bit systems with direct move.  */
+  if (all_same && TARGET_DIRECT_MOVE_64BIT
+  && (mode == V16QImode || mode == V8HImode))
+{
+  rtx op0 = XVECEXP (vals, 0, 0);
+  rtx di_tmp = gen_reg_rtx (DImode);
+
+  if (!REG_P (op0))
+   op0 = force_reg (GET_MODE_INNER (mode), op0);
+
+  if (mode == V16QImode)
+   {
+ emit_insn (gen_zero_extendqidi2 (di_tmp, op0));
+ emit_insn (gen_vsx_vspltb_di (target, di_tmp));
+ return;
+   }
+
+  if (mode == V8HImode)
+   {
+ emit_insn (gen_zero_extendhidi2 (di_tmp, op0));
+ emit_insn (gen_vsx_vsplth_di (target, di_tmp));
+ return;
+   }
+}
+
   /* Store value to stack temp.  Load vector element.  Splat.  However, splat
  of 64-bit items is not supported on Altivec.  */
   if (all_same && GET_MODE_SIZE (inner_mode) <= 4)
Index: gcc/config/rs6000/vsx.md
===
--- gcc/config/rs6000/vsx.md(revision 239588)
+++ gcc/config/rs6000/vsx.md(working copy)
@@ -281,6 +281,16 @@ (define_mode_attr VSX_EX [(V16QI "v")
  (V8HI  "v")
  (V4SI  "wa")])
 
+;; Iterator for the 2 short vector types to do a splat from an integer
+(define_mode_iterator VSX_SPLAT_I [V16QI V8HI])
+
+;; Mode attribute to give the count for the splat instruction to splat
+;; the value in the 64-bit integer slot
+(define_mode_attr VSX_SPLAT_COUNT [(V16QI "7") (V8HI "3")])
+
+;; Mode attribute to give the suffix for the splat instruction
+(define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -2766,6 +2776,16 @@ (define_insn "vsx_xxspltw__direct"
   "xxspltw %x0,%x1,%2"
   [(set_attr "type" "vecperm")])
 
+;; V16QI/V8HI splat support on ISA 2.07
+(define_insn "vsx_vsplt_di"
+  [(set (match_operand:VSX_SPLAT_I 0 "altivec_register_operand" "=v")
+   (vec_duplicate:VSX_SPLAT_I
+(truncate:
+ (match_operand:DI 1 "altivec_register_operand" "v"]
+  "VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
+  "vsplt %0,%1,"
+  [(set_attr "type" "vecperm")])
+
 ;; V2DF/V2DI splat for use by vec_splat builtin
 (define_insn "vsx_xxspltd_"
   [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa")


[PATCH] selftest.h: add ASSERT_STR_CONTAINS

2016-08-19 Thread David Malcolm
More enabling work for some new selftests I'm working on.

Successfully bootstrapped&regrtested on x86_64-pc-linux-gnu.

OK for trunk?
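
The check itself boils down to a strstr call.  A standalone sketch of the idea, outside the selftest framework (str_contains is a hypothetical helper; here a NULL argument simply counts as "not found", whereas the patch reports it as a failure with its own message):

```cpp
#include <cstring>

// Return true if NEEDLE occurs somewhere within HAYSTACK.
// A null HAYSTACK or NEEDLE is treated as "not found".
bool
str_contains (const char *haystack, const char *needle)
{
  if (haystack == nullptr || needle == nullptr)
    return false;
  return std::strstr (haystack, needle) != nullptr;
}
```

Note that, as with strstr, an empty needle is found in any non-null haystack.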

gcc/ChangeLog:
* selftest.c (selftest::assert_str_contains): New function.
(selftest::test_assertions): Verify ASSERT_STR_CONTAINS.
* selftest.h (selftest::assert_str_contains): New decl.
(ASSERT_STR_CONTAINS): New macro.
---
 gcc/selftest.c | 34 ++
 gcc/selftest.h | 19 +++
 2 files changed, 53 insertions(+)

diff --git a/gcc/selftest.c b/gcc/selftest.c
index d25f5c0..629db98 100644
--- a/gcc/selftest.c
+++ b/gcc/selftest.c
@@ -87,6 +87,39 @@ selftest::assert_streq (const location &loc,
 desc_expected, desc_actual, val_expected, val_actual);
 }
 
+/* Implementation detail of ASSERT_STR_CONTAINS.
+   Use strstr to determine if val_needle is within val_haystack.
+   ::selftest::pass if it is found.
+   ::selftest::fail if it is not found.  */
+
+void
+selftest::assert_str_contains (const location &loc,
+  const char *desc_haystack,
+  const char *desc_needle,
+  const char *val_haystack,
+  const char *val_needle)
+{
+  /* If val_haystack is NULL, fail with a custom error message.  */
+  if (val_haystack == NULL)
+::selftest::fail_formatted
+   (loc, "ASSERT_STR_CONTAINS (%s, %s) haystack=NULL",
+desc_haystack, desc_needle);
+
+  /* If val_needle is NULL, fail with a custom error message.  */
+  if (val_needle == NULL)
+::selftest::fail_formatted
+   (loc, "ASSERT_STR_CONTAINS (%s, %s) haystack=\"%s\" needle=NULL",
+desc_haystack, desc_needle, val_haystack);
+
+  const char *test = strstr (val_haystack, val_needle);
+  if (test)
+::selftest::pass (loc, "ASSERT_STR_CONTAINS");
+  else
+::selftest::fail_formatted
+   (loc, "ASSERT_STR_CONTAINS (%s, %s) haystack=\"%s\" needle=\"%s\"",
+desc_haystack, desc_needle, val_haystack, val_needle);
+}
+
 /* Constructor.  Create a tempfile using SUFFIX, and write CONTENT to
it.  Abort if anything goes wrong, using LOC as the effective
location in the problem report.  */
@@ -131,6 +164,7 @@ test_assertions ()
   ASSERT_NE (1, 2);
   ASSERT_STREQ ("test", "test");
   ASSERT_STREQ_AT (SELFTEST_LOCATION, "test", "test");
+  ASSERT_STR_CONTAINS ("foo bar baz", "bar");
 }
 
 /* Run all of the selftests within this file.  */
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 58a40f6..b073ed6 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -69,6 +69,14 @@ extern void assert_streq (const location &loc,
  const char *desc_expected, const char *desc_actual,
  const char *val_expected, const char *val_actual);
 
+/* Implementation detail of ASSERT_STR_CONTAINS.  */
+
+extern void assert_str_contains (const location &loc,
+const char *desc_haystack,
+const char *desc_needle,
+const char *val_haystack,
+const char *val_needle);
+
 /* A class for writing out a temporary sourcefile for use in selftests
of input handling.  */
 
@@ -249,6 +257,17 @@ extern int num_passes;
(EXPECTED), (ACTUAL));  \
   SELFTEST_END_STMT
 
+/* Evaluate HAYSTACK and NEEDLE and use strstr to determine if NEEDLE
+   is within HAYSTACK.
+   ::selftest::pass if NEEDLE is found.
+   ::selftest::fail if it is not found.  */
+
+#define ASSERT_STR_CONTAINS(HAYSTACK, NEEDLE)  \
+  SELFTEST_BEGIN_STMT  \
+  ::selftest::assert_str_contains (SELFTEST_LOCATION, #HAYSTACK, #NEEDLE, \
+  (HAYSTACK), (NEEDLE));   \
+  SELFTEST_END_STMT
+
 /* Evaluate PRED1 (VAL1), calling ::selftest::pass if it is true,
::selftest::fail if it is false.  */
 
-- 
1.8.5.3



Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread David Malcolm
On Fri, 2016-08-19 at 19:25 -0400, Patrick Palka wrote:
> On Fri, 19 Aug 2016, Yuri Rumyantsev wrote:
> 
> > Hi,
> > 
> > Here is a simple test-case to reproduce 176.gcc failure (I run it
> > on
> > Haswell machine).
> > Using 20160819 compiler build we get:
> > gcc -O3 -m32 -mavx2 test.c -o test.ref.exe
> > /users/ysrumyan/isse_6866$ ./test.ref.exe
> > Aborted (core dumped)
> > 
> > If I apply patch proposed by Patrick test runs properly
> > Instead of running we can check number of .jump thread.
> 
> Thanks!  With this test case I was able to identify the cause of the
> wrong code generation.
> 
> Turns out that the problem lies in fold-const.c.  It does not
> correctly
> fold VECTOR_CST comparisons that have a scalar boolean result type. 
>  In
> particular fold_binary folds the comparison {1,1,1} != {0,0,0} to
> false
> which causes the threader to register an incorrect jump thread.  In
> general VEC1 != VEC2 gets folded as if it were VEC1 == VEC2.
> 
> This patch fixes the problematic folding logic.
> 
> Does this look OK to commit after bootstrap + regtesting?  The faulty
> logic was introduced by the fix for PR68542 (r232518) so I think it
> is
> present on the 6 branch as well.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/71044
>   PR tree-optimization/68542
>   * fold-const.c (fold_relational_const): Fix folding of
>   VECTOR_CST comparisons that have a scalar boolean result type.
>   (test_vector_folding): New static function.
>   (fold_const_c_tests): Call it.

FWIW I don't know if we have any policy about this, but I've been
spelling out namespaces such as "selftest" in ChangeLog entries, in the
hope that it makes it easier to distinguish the "real" code from the
selftests.  So the above might read:

gcc/ChangeLog:

PR tree-optimization/71044
PR tree-optimization/68542
* fold-const.c (fold_relational_const): Fix folding of
VECTOR_CST comparisons that have a scalar boolean result type.
(selftest::test_vector_folding): New static function.
(selftest::fold_const_c_tests): Call it.

Also, if there's already a source file that reproduces the issue, is it
worth turning it into a DejaGnu test to complement the selftests?  (for
both end-to-end testing *and* unit-testing, a "belt and braces"
approach).

Hope this is constructive
Dave

> ---
>  gcc/fold-const.c | 30 +-
>  1 file changed, 25 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 30c1e0d..0271ac3 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -13897,7 +13897,6 @@ fold_relational_const (enum tree_code code,
> tree type, tree op0, tree op1)
>if (!VECTOR_TYPE_P (type))
>   {
> /* Have vector comparison with scalar boolean result.  */
> -   bool result = true;
> gcc_assert ((code == EQ_EXPR || code == NE_EXPR)
> && VECTOR_CST_NELTS (op0) == VECTOR_CST_NELTS
> (op1));
> for (unsigned i = 0; i < VECTOR_CST_NELTS (op0); i++)
> @@ -13905,11 +13904,12 @@ fold_relational_const (enum tree_code code,
> tree type, tree op0, tree op1)
> tree elem0 = VECTOR_CST_ELT (op0, i);
> tree elem1 = VECTOR_CST_ELT (op1, i);
> tree tmp = fold_relational_const (code, type, elem0,
> elem1);
> -   result &= integer_onep (tmp);
> +   if (tmp == NULL_TREE)
> + return NULL_TREE;
> +   if (integer_zerop (tmp))
> + return constant_boolean_node (false, type);
>   }
> -   if (code == NE_EXPR)
> - result = !result;
> -   return constant_boolean_node (result, type);
> +   return constant_boolean_node (true, type);
>   }
>unsigned count = VECTOR_CST_NELTS (op0);
>tree *elts =  XALLOCAVEC (tree, count);
> @@ -14517,12 +14517,32 @@ test_arithmetic_folding ()
>  x);
>  }
>  
> +/* Verify that various binary operations on vectors are folded
> +   correctly.  */
> +
> +static void
> +test_vector_folding ()
> +{
> +  tree inner_type = integer_type_node;
> +  tree type = build_vector_type (inner_type, 4);
> +  tree zero = build_zero_cst (type);
> +  tree one = build_one_cst (type);
> +
> +  /* Verify equality tests that return a scalar boolean result.  */
> +  tree res_type = boolean_type_node;
> +  ASSERT_FALSE (integer_nonzerop (fold_build2 (EQ_EXPR, res_type,
> zero, one)));
> +  ASSERT_TRUE (integer_nonzerop (fold_build2 (EQ_EXPR, res_type,
> zero, zero)));
> +  ASSERT_TRUE (integer_nonzerop (fold_build2 (NE_EXPR, res_type,
> zero, one)));
> +  ASSERT_FALSE (integer_nonzerop (fold_build2 (NE_EXPR, res_type,
> one, one)));
> +}
> +
>  /* Run all of the selftests within this file.  */
>  
>  void
>  fold_const_c_tests ()
>  {
>test_arithmetic_folding ();
> +  test_vector_folding ();
>  }
>  
>  } // namespace selftest


Re: [PATCH] Restrict jump threading statement simplifier to scalar types (PR71077)

2016-08-19 Thread Patrick Palka
On Fri, 19 Aug 2016, David Malcolm wrote:

> On Fri, 2016-08-19 at 19:25 -0400, Patrick Palka wrote:
> > On Fri, 19 Aug 2016, Yuri Rumyantsev wrote:
> > 
> > > Hi,
> > > 
> > > Here is a simple test-case to reproduce 176.gcc failure (I run it
> > > on
> > > Haswell machine).
> > > Using 20160819 compiler build we get:
> > > gcc -O3 -m32 -mavx2 test.c -o test.ref.exe
> > > /users/ysrumyan/isse_6866$ ./test.ref.exe
> > > Aborted (core dumped)
> > > 
> > > If I apply patch proposed by Patrick test runs properly
> > > Instead of running we can check number of .jump thread.
> > 
> > Thanks!  With this test case I was able to identify the cause of the
> > wrong code generation.
> > 
> > > Turns out that the problem lies in fold-const.c.  It does not
> > > correctly fold VECTOR_CST comparisons that have a scalar boolean
> > > result type.  In particular fold_binary folds the comparison
> > > {1,1,1} != {0,0,0} to false which causes the threader to register
> > > an incorrect jump thread.  In general VEC1 != VEC2 gets folded as
> > > if it were VEC1 == VEC2.
> > 
> > This patch fixes the problematic folding logic.
> > 
> > > Does this look OK to commit after bootstrap + regtesting?  The faulty
> > > logic was introduced by the fix for PR68542 (r232518) so I think it
> > > is present on the 6 branch as well.
> > 
> > gcc/ChangeLog:
> > 
> > PR tree-optimization/71044
> > PR tree-optimization/68542
> > * fold-const.c (fold_relational_const): Fix folding of
> > VECTOR_CST comparisons that have a scalar boolean result type.
> > (test_vector_folding): New static function.
> > (fold_const_c_tests): Call it.
> 
> FWIW I don't know if we have any policy about this, but I've been
> spelling out namespaces such as "selftest" in ChangeLog entries, in the
> hope that it makes it easier to distinguish the "real" code from the
> selftests.  So the above might read:
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/71044
>   PR tree-optimization/68542
>   * fold-const.c (fold_relational_const): Fix folding of
>   VECTOR_CST comparisons that have a scalar boolean result type.
>   (selftest::test_vector_folding): New static function.
>   (selftest::fold_const_c_tests): Call it.

Will do.

> 
> Also, if there's already a source file that reproduces the issue, is it
> worth turning it into a DejaGnu test to complement the selftests?  (for
> both end-to-end testing *and* unit-testing, a "belt and braces"
> approach).

Turning it into a compile test that counts the number of jumps threaded
seems potentially flaky but I'm not against it.  And I'm not sure how to
reliably turn it into an execution test.  Would the directives

/* { dg-do run }  */
/* { dg-require-effective-target avx2 }  */
/* { dg-require-effective-target ia32 }  */
/* { dg-options "-O3 -mavx2" }  */

work?

> 
> Hope this is constructive
> Dave
> 
> > ---
> >  gcc/fold-const.c | 30 +-
> >  1 file changed, 25 insertions(+), 5 deletions(-)
> > 
> > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > index 30c1e0d..0271ac3 100644
> > --- a/gcc/fold-const.c
> > +++ b/gcc/fold-const.c
> > @@ -13897,7 +13897,6 @@ fold_relational_const (enum tree_code code,
> > tree type, tree op0, tree op1)
> >if (!VECTOR_TYPE_P (type))
> > {
> >   /* Have vector comparison with scalar boolean result.  */
> > - bool result = true;
> >   gcc_assert ((code == EQ_EXPR || code == NE_EXPR)
> >   && VECTOR_CST_NELTS (op0) == VECTOR_CST_NELTS
> > (op1));
> >   for (unsigned i = 0; i < VECTOR_CST_NELTS (op0); i++)
> > @@ -13905,11 +13904,12 @@ fold_relational_const (enum tree_code code,
> > tree type, tree op0, tree op1)
> >   tree elem0 = VECTOR_CST_ELT (op0, i);
> >   tree elem1 = VECTOR_CST_ELT (op1, i);
> >   tree tmp = fold_relational_const (code, type, elem0,
> > elem1);
> > - result &= integer_onep (tmp);
> > + if (tmp == NULL_TREE)
> > +   return NULL_TREE;
> > + if (integer_zerop (tmp))
> > +   return constant_boolean_node (false, type);
> > }
> > - if (code == NE_EXPR)
> > -   result = !result;
> > - return constant_boolean_node (result, type);
> > + return constant_boolean_node (true, type);
> > }
> >unsigned count = VECTOR_CST_NELT
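
For readers following the hunk above, here is a minimal standalone sketch
of why the removed lines misfold NE_EXPR.  It assumes plain int vectors
stand in for VECTOR_CST operands; cmp_code, fold_elem, old_vector_fold and
expected_vector_compare are hypothetical names, not GCC functions.  It
models only the '-' lines and compares them with whole-vector semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for EQ_EXPR / NE_EXPR (hypothetical, for illustration only).
enum cmp_code { EQ, NE };

// Element-wise comparison, standing in for the recursive scalar fold.
static bool fold_elem (cmp_code code, int a, int b)
{
  return code == EQ ? a == b : a != b;
}

// Model of the logic removed by the patch: AND together the per-element
// results of CODE, then negate the accumulated result for NE.
static bool old_vector_fold (cmp_code code, const std::vector<int> &v0,
                             const std::vector<int> &v1)
{
  assert (v0.size () == v1.size ());
  bool result = true;
  for (std::size_t i = 0; i < v0.size (); i++)
    result &= fold_elem (code, v0[i], v1[i]);
  if (code == NE)
    result = !result;
  return result;
}

// Whole-vector semantics for a scalar boolean result: EQ holds when every
// element matches, NE holds when at least one element differs.
static bool expected_vector_compare (cmp_code code, const std::vector<int> &v0,
                                     const std::vector<int> &v1)
{
  bool all_equal = (v0 == v1);
  return code == EQ ? all_equal : !all_equal;
}
```

With these definitions, old_vector_fold (NE, {1,1,1}, {0,0,0}) yields
false: every element compares as "different", the AND stays true, and the
final negation flips it.  The expected answer is true.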

Re: [PATCH] Make clone materialization an IPA pass

2016-08-19 Thread Andrew Pinski
On Thu, Aug 18, 2016 at 1:14 AM, Richard Biener  wrote:
>
> The following patch makes it possible to add statistic counters to
> update-ssa.  Clone materialization ends up updating SSA form from
> a context with current_pass being NULL - wrapping materialize_all_clones
> into a pass fixes this.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for trunk?
>
> (simple-ipa-pass was the simplest but not sure if it is the most efficient
> given I remember we somehow pull in all bodies for this?)

This patch introduced an ICE with LTO because update_ssa ends up being
called without being inside a pass.
I filed this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77305.

Thanks,
Andrew

>
> Thanks,
> Richard.
>
> 2016-08-18  Richard Biener  
>
> * tree-pass.h (make_pass_materialize_all_clones): Declare.
> * ipa.c (pass_data_materialize_all_clones, 
> pass_materialize_all_clones,
> make_pass_materialize_all_clones): New simple IPA pass encapsulating
> clone materialization.
> * passes.def (all_late_ipa_passes): Start with
> pass_materialize_all_clones.
> * cgraphunit.c (symbol_table::compile): Remove call to
> materialize_all_clones.
> * tree-into-ssa.c: Include statistics.h.
> (update_ssa): Count number of times we do incremental/rewrite
> SSA update.
>
> Index: gcc/tree-pass.h
> ===
> *** gcc/tree-pass.h (revision 239560)
> --- gcc/tree-pass.h (working copy)
> *** extern ipa_opt_pass_d *make_pass_ipa_pro
> *** 504,509 
> --- 504,511 
>   extern ipa_opt_pass_d *make_pass_ipa_cdtor_merge (gcc::context *ctxt);
>   extern ipa_opt_pass_d *make_pass_ipa_single_use (gcc::context *ctxt);
>   extern ipa_opt_pass_d *make_pass_ipa_comdats (gcc::context *ctxt);
> + extern simple_ipa_opt_pass *make_pass_materialize_all_clones (gcc::context *
> + ctxt);
>
>   extern gimple_opt_pass *make_pass_cleanup_cfg_post_optimizing (gcc::context
>*ctxt);
> Index: gcc/ipa.c
> ===
> *** gcc/ipa.c   (revision 239560)
> --- gcc/ipa.c   (working copy)
> *** make_pass_ipa_single_use (gcc::context *
> *** 1443,1445 
> --- 1443,1486 
>   {
> return new pass_ipa_single_use (ctxt);
>   }
> +
> + /* Materialize all clones.  */
> +
> + namespace {
> +
> + const pass_data pass_data_materialize_all_clones =
> + {
> +   SIMPLE_IPA_PASS, /* type */
> +   "materialize-all-clones", /* name */
> +   OPTGROUP_NONE, /* optinfo_flags */
> +   TV_IPA_OPT, /* tv_id */
> +   0, /* properties_required */
> +   0, /* properties_provided */
> +   0, /* properties_destroyed */
> +   0, /* todo_flags_start */
> +   0, /* todo_flags_finish */
> + };
> +
> + class pass_materialize_all_clones : public simple_ipa_opt_pass
> + {
> + public:
> +   pass_materialize_all_clones (gcc::context *ctxt)
> + : simple_ipa_opt_pass (pass_data_materialize_all_clones, ctxt)
> +   {}
> +
> +   /* opt_pass methods: */
> +   virtual unsigned int execute (function *)
> + {
> +   symtab->materialize_all_clones ();
> +   return 0;
> + }
> +
> + }; // class pass_materialize_all_clones
> +
> + } // anon namespace
> +
> + simple_ipa_opt_pass *
> + make_pass_materialize_all_clones (gcc::context *ctxt)
> + {
> +   return new pass_materialize_all_clones (ctxt);
> + }
> Index: gcc/passes.def
> ===
> *** gcc/passes.def  (revision 239560)
> --- gcc/passes.def  (working copy)
> *** along with GCC; see the file COPYING3.
> *** 167,172 
> --- 167,173 
>passes are executed after partitioning and thus see just parts of the
>compiled unit.  */
> INSERT_PASSES_AFTER (all_late_ipa_passes)
> +   NEXT_PASS (pass_materialize_all_clones);
> NEXT_PASS (pass_ipa_pta);
> NEXT_PASS (pass_dispatcher_calls);
> NEXT_PASS (pass_omp_simd_clone);
> Index: gcc/cgraphunit.c
> ===
> *** gcc/cgraphunit.c(revision 239560)
> --- gcc/cgraphunit.c(working copy)
> *** symbol_table::compile (void)
> *** 2435,2441 
>   fprintf (stderr, "Assembling functions:\n");
> symtab_node::checking_verify_symtab_nodes ();
>
> -   materialize_all_clones ();
> bitmap_obstack_initialize (NULL);
> execute_ipa_pass_list (g->get_passes ()->all_late_ipa_passes);
> bitmap_obstack_release (NULL);
> --- 2438,2443 
> Index: gcc/tree-into-ssa.c
> ===
> *** gcc/tree-into-ssa.c (revision 239560)
> --- gcc/tree-into-ssa.c (working copy)
> *** along with GCC; see the file COPYING3.
> *** 37,42 
> --- 37,43 
>   #include "tree-dfa.h"
>

Re: [PATCH] newlib-stdint.h: Remove 32 bit longs

2016-08-19 Thread Joel Sherrill
RTEMS uses the PRI constants and we don't see warnings. Is there a specific
test case which would demonstrate this is actually broken? The file
newlib-stdint.h will impact more targets than Zephyr, and I think they owe a
demo case.
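
As a point of reference, a minimal sketch of what using the PRI constants
looks like, assuming a hosted C++11 environment with <cinttypes>.  The
helpers format_i32 and format_u32 are hypothetical names used only for
illustration; they are not part of newlib, RTEMS or Zephyr.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format an int32_t without assuming whether the
// typedef is "int" or "long".  PRId32 expands to the matching conversion.
static std::string format_i32 (std::int32_t value)
{
  char buf[16];  // "-2147483648" needs 12 bytes including the NUL
  std::snprintf (buf, sizeof buf, "%" PRId32, value);
  return buf;
}

// Same idea for uint32_t, using PRIu32.
static std::string format_u32 (std::uint32_t value)
{
  char buf[16];  // "4294967295" needs 11 bytes including the NUL
  std::snprintf (buf, sizeof buf, "%" PRIu32, value);
  return buf;
}
```

Because the conversion comes from the macro, the format string stays
correct whichever underlying type the toolchain picks for int32_t.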

On August 19, 2016 7:37:22 PM EDT, Andrew Pinski  wrote:
>On Fri, Aug 19, 2016 at 12:16 PM, Andy Ross wrote:
>> We ran into this issue in the Zephyr project with our toolchain (gcc
>> built with --enable-newlib).  Basically GCC appears to be honoring a
>> legacy requirement to give newlib a "long" instead of "int" for
>> __INT32_TYPE__, which then leaks out through the current newlib
>> headers as a long-valued int32_t, which produces gcc warnings when a
>> uint32_t is passed to an unqualified printf format specifier like
>> "%d".
>>
>> But the newlib headers, if __INT32_TYPE__ is *not* defined by the
>> compiler, have code to choose an "int" over "long" immediately
>> thereafter.  It seems whatever requirement this was honoring isn't
>> valid anymore.
>
>Couple of things missing here.  A changelog is the first thing.
>The second thing is it seems like Zephyr project should be using the
>PRIdx, etc. instead of just %d for int32_t to be portable.
>
>Thanks,
>Andrew
>
>
>>
>> From 784fb1760a930d0309f878bbae7bfd38137f5689 Mon Sep 17 00:00:00
>2001
>> From: Andy Ross 
>> Date: Fri, 19 Aug 2016 09:40:42 -0700
>> Subject: [PATCH] newlib-stdint.h: Remove 32 bit longs
>>
>> This would make __INT32_TYPE__ a "long" instead of an "int", which
>> would then percolate down in newlib's own headers into a typedef for
>> int32_t.  Which is wrong.  Newlib's headers, if __INT32_TYPE__ were
>> not defined, actually would chose an int for this type.  The comment
>> that newlib uses a 32 bit long appears to be a lie, perhaps
>> historical.
>>
>> Signed-off-by: Andy Ross 
>> ---
>>  gcc/config/newlib-stdint.h | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/config/newlib-stdint.h b/gcc/config/newlib-stdint.h
>> index eb99556..0275948 100644
>> --- a/gcc/config/newlib-stdint.h
>> +++ b/gcc/config/newlib-stdint.h
>> @@ -22,10 +22,9 @@ a copy of the GCC Runtime Library Exception along
>with this program;
>>  see the files COPYING3 and COPYING.RUNTIME respectively.  If not,
>see
>>  .  */
>>
>> -/* newlib uses 32-bit long in certain cases for all non-SPU
>> -   targets.  */
>> +/* newlib used to use a 32-bit long, no longer */
>>  #ifndef STDINT_LONG32
>> -#define STDINT_LONG32 (LONG_TYPE_SIZE == 32)
>> +#define STDINT_LONG32 0
>>  #endif
>>
>>  #define SIG_ATOMIC_TYPE "int"
>> --
>> 2.7.4
>>

--joel


Re: [Patch, testsuite] Skip tests that expect 4 byte alignment for avr

2016-08-19 Thread Mike Stump
On Aug 11, 2016, at 12:40 AM, Senthil Kumar Selvaraj wrote:
> 
>  The below patch adds the AVR target to the list of targets that don't
>  have natural_alignment_32. It also skips ipa/propalign-*.c
>  tests (which expect 4 byte alignment), if both
>  natural_alignment_32 and natural_alignment_64 are false.
> 
>  Is this the right way to fix this? Ok to commit?

LGTM.  Ok.

If anyone else has a reason why it would be bad, don't let me discourage you 
from speaking up.

> +/* { dg-skip-if "No alignment restrictions" { { ! natural_alignment_32 } && { ! natural_alignment_64 } } } */


Re: protected alloca class for malloc fallback

2016-08-19 Thread Mike Stump
On Aug 10, 2016, at 10:03 AM, Oleg Endo  wrote:
> 
> Or just wait until people have agreed to switch to C++11 or C++14.  I
> don't think in practice anybody uses an C++11-incapable GCC to build a
> newer GCC these days.

I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I could
bootstrap a newer native compiler, if I had to.

Re: [PATCH] - improve sprintf buffer overflow detection (middle-end/49905)

2016-08-19 Thread Trevor Saunders
> > > Patch #5 and beyond: Further optimization work.
> > 
> > As one of the next steps I'd like to make this feature available
> > to user-defined sprintf-like functions decorated with attribute
> > format.  To do that, I'm thinking of adding either a fourth
> > (optional) argument to attribute format printf indicating which
> > of the function arguments is the destination buffer (to compute
> > its size), or perhaps a new attribute under its own name.  I'm
> > actually leaning toward latter since I think it could be used
> > in other contexts as well.  I welcome comments and suggestions
> > on this idea.
> Whichever we think will be easier to use and thus encourage folks to
> annotate their code properly :-)

So, sort of related, I've been thinking lately about writing C++ string
building classes, where you'd have a fixed-length buffer and a pointer
to the next buffer (so a degenerate rope with only one side of the
tree).  One use case for such a thing is building up strings of
assembly to output in gcc.  However, one somewhat awkward bit is that we
often format assembly with printf, which isn't really great for this use
case, because you need to deal with the case where the current buffer
only has space for part of the string you are formatting.  So it would
be nice to have something like an interruptible printf that tells you
where in the string it stopped formatting, and allows you to continue
from there with a new buffer.
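
To make the idea concrete, here is a rough sketch of such a chained
fixed-buffer builder.  Everything in it (chunk, chunked_string, CHUNK_SIZE,
appendf) is a hypothetical name, and it is not an interruptible printf:
since the standard vsnprintf cannot be resumed, this version formats into
a temporary first and then splits the result across buffers.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Deliberately tiny so that splitting across buffers is easy to see.
const std::size_t CHUNK_SIZE = 16;

// One fixed-length buffer plus a pointer to the next one.
struct chunk
{
  char data[CHUNK_SIZE];
  std::size_t used = 0;
  std::unique_ptr<chunk> next;
};

class chunked_string
{
public:
  chunked_string () : head_ (new chunk), tail_ (head_.get ()) {}

  // Append LEN bytes, starting a new chunk whenever the tail is full.
  void append (const char *s, std::size_t len)
  {
    while (len > 0)
      {
        if (tail_->used == CHUNK_SIZE)
          {
            tail_->next.reset (new chunk);
            tail_ = tail_->next.get ();
          }
        std::size_t room = CHUNK_SIZE - tail_->used;
        std::size_t n = len < room ? len : room;
        std::memcpy (tail_->data + tail_->used, s, n);
        tail_->used += n;
        s += n;
        len -= n;
      }
  }

  // printf-style append: measure, format into a temporary, then split.
  void appendf (const char *fmt, ...)
  {
    va_list ap, ap2;
    va_start (ap, fmt);
    va_copy (ap2, ap);
    int len = std::vsnprintf (nullptr, 0, fmt, ap);
    va_end (ap);
    if (len > 0)
      {
        std::vector<char> tmp (static_cast<std::size_t> (len) + 1);
        std::vsnprintf (tmp.data (), tmp.size (), fmt, ap2);
        append (tmp.data (), static_cast<std::size_t> (len));
      }
    va_end (ap2);
  }

  // Concatenate all chunks into one string.
  std::string str () const
  {
    std::string out;
    for (const chunk *c = head_.get (); c != nullptr; c = c->next.get ())
      out.append (c->data, c->used);
    return out;
  }

  // Number of chunks currently in the chain.
  std::size_t chunk_count () const
  {
    std::size_t n = 0;
    for (const chunk *c = head_.get (); c != nullptr; c = c->next.get ())
      n++;
    return n;
  }

private:
  std::unique_ptr<chunk> head_;
  chunk *tail_;
};
```

The temporary is exactly the cost a resumable formatter would avoid, which
is the awkward bit described above.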
 
It also seems worth noting that in C++11 you can actually write a
printf that is typesafe without the format attributes; for example, Tom
has one here: https://github.com/tromey/typesafe-printf.  I'm not sure
putting such a thing in libc for C++ programs is a great idea because
of compile time costs, but it's tempting.
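
As a rough illustration of the general idea only (this is not taken from
the linked project, and the names checked_format and format_checked are
made up), a variadic-template formatter can take each argument's type from
the argument itself rather than from a conversion letter.  This sketch uses
a bare '%' as the placeholder, "%%" for a literal percent sign, and checks
the argument count at run time.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Base case: no arguments left, so any remaining placeholder is an error.
inline void checked_format (std::string &out, const char *fmt)
{
  for (; *fmt; ++fmt)
    {
      if (*fmt == '%')
        {
          if (fmt[1] == '%')
            ++fmt;  // "%%" collapses to a single '%', appended below
          else
            throw std::invalid_argument ("placeholder without argument");
        }
      out += *fmt;
    }
}

// Recursive case: substitute VALUE at the first placeholder, then handle
// the rest of the format string with the remaining arguments.
template <typename T, typename... Rest>
void checked_format (std::string &out, const char *fmt, const T &value,
                     const Rest &... rest)
{
  for (; *fmt; ++fmt)
    {
      if (*fmt == '%')
        {
          if (fmt[1] == '%')
            {
              out += '%';
              ++fmt;
              continue;
            }
          std::ostringstream os;
          os << value;  // the type comes from the argument, not the format
          out += os.str ();
          checked_format (out, fmt + 1, rest...);
          return;
        }
      out += *fmt;
    }
  throw std::invalid_argument ("argument without placeholder");
}

// Convenience wrapper returning the formatted string.
template <typename... Args>
std::string format_checked (const char *fmt, const Args &... args)
{
  std::string out;
  checked_format (out, fmt, args...);
  return out;
}
```

For example, format_checked ("% + % = %", 1, 2, 3) produces "1 + 2 = 3".
Every call instantiates a chain of templates, which hints at the compile
time cost mentioned above.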

 Trev



[PATCH] Fix sporadic failure of gcc.dg/cpp/pr66415-1.c

2016-08-19 Thread Bernd Edlinger
Hi,

it turns out that in this test case the expected output depends on the
COLUMNS setting of the terminal where the test suite is started.  With
fewer than 82 columns the multiline output is not as expected.

The reason for that seems to be in diagnostic_set_caret_max_width, where
whether get_terminal_width gets called depends on isatty(fileno(stderr)).
And looking at get_terminal_width it is clear that the different COLUMNS
settings make the difference.

With the following patch the test case succeeds independently of the terminal
settings.
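
The dependency described above can be modelled roughly as follows.  This
is a simplified sketch, not the GCC source: sketch_terminal_width and
sketch_caret_max_width are hypothetical names, the environment value is
passed in as a parameter, and any other way of determining the width is
left out.

```cpp
#include <climits>
#include <cstdlib>

// Model of a COLUMNS-driven width lookup: a positive number in the
// environment value wins, anything else means "unknown / unlimited".
static int sketch_terminal_width (const char *columns_env)
{
  if (columns_env != nullptr)
    {
      int n = std::atoi (columns_env);
      if (n > 0)
        return n;
    }
  return INT_MAX;
}

// Model of the caret width choice: the terminal width is only consulted
// when stderr is a terminal, otherwise the width is unlimited.
static int sketch_caret_max_width (bool stderr_is_tty, const char *columns_env)
{
  int value = stderr_is_tty ? sketch_terminal_width (columns_env) : INT_MAX;
  return value > 0 ? value : INT_MAX;
}
```

In this model a run from a narrow terminal and a run with output
redirected give different widths, which is why pinning COLUMNS in the
test makes the expected output stable.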


Is it OK for trunk?


Thanks
Bernd.

2016-08-20  Bernd Edlinger

	PR c/52952
	* gcc.dg/cpp/pr66415-1.c: Fix sporadic failure.

Index: gcc/testsuite/gcc.dg/cpp/pr66415-1.c
===
--- gcc/testsuite/gcc.dg/cpp/pr66415-1.c	(revision 239624)
+++ gcc/testsuite/gcc.dg/cpp/pr66415-1.c	(working copy)
@@ -1,6 +1,7 @@
 /* PR c/66415 */
 /* { dg-do compile } */
 /* { dg-options "-Wformat -fdiagnostics-show-caret" } */
+/* { dg-set-compiler-env-var COLUMNS "82" } */
 
 void
 fn1 (void)