[patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Eric Botcazou
Hi,

we recently ran into build failures on Windows systems using a somewhat old 
grep, coming from a syntax error in the libstdc++-symbols.ver version file:

# Symbol versioning for shared libraries.
if ENABLE_SYMVERS
libstdc++-symbols.ver:  ${glibcxx_srcdir}/$(SYMVER_FILE) \
$(port_specific_symbol_files)
cp ${glibcxx_srcdir}/$(SYMVER_FILE) $@.tmp
chmod +w $@.tmp
if test "x$(port_specific_symbol_files)" != x; then \
  if grep '^# Appended to version file.' \
   $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
cat $(port_specific_symbol_files) >> $@.tmp; \
  else \
sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
rm tmp.top tmp.bottom; \
  fi; \
fi

Note the double /dev/null on the grep command line.  The first one causes the 
grep to fail when the command is invoked on these systems.  That's old code, 
but it is now invoked for config/abi/pre/float128.ver on the mainline and 5 
branch and this breaks the build on these systems (4.9 builds fine).

This first /dev/null doesn't serve any useful purpose and seems to be a typo, 
so the attached patch gets rid of it.

Tested on x86/Windows and x86-64/Linux, OK for mainline and 5 branch?


2016-04-06  Eric Botcazou 

libstdc++-v3/
* src/Makefile.am (libstdc++-symbols.ver): Remove useless /dev/null.
* src/Makefile.in: Regenerate.

-- 
Eric Botcazou

Index: src/Makefile.am
===
--- src/Makefile.am	(revision 234695)
+++ src/Makefile.am	(working copy)
@@ -228,7 +228,7 @@ libstdc++-symbols.ver:  ${glibcxx_srcdir
 	chmod +w $@.tmp
 	if test "x$(port_specific_symbol_files)" != x; then \
 	  if grep '^# Appended to version file.' \
-	   $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
+	   $(port_specific_symbol_files) > /dev/null 2>&1; then \
 	cat $(port_specific_symbol_files) >> $@.tmp; \
 	  else \
 	sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \


Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-06 Thread Prathamesh Kulkarni
On 5 April 2016 at 18:28, Richard Biener  wrote:
> On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 5 April 2016 at 16:58, Richard Biener  wrote:
>> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
>> >> >
>> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> >> index 9eb63c2..bc0c612 100644
>> >> >> --- a/gcc/lto/lto-partition.c
>> >> >> +++ b/gcc/lto/lto-partition.c
>> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >> >>varpool_order.qsort (varpool_node_cmp);
>> >> >>
>> >> >>/* Compute partition size and create the first partition.  */
>> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> +fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> >> +
>> >> >>partition_size = total_size / n_lto_partitions;
>> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> +{
>> >> >> +  n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> + n_lto_partitions++;
>> >> >> +  partition_size = total_size / n_lto_partitions;
>> >> >> +}
>> >> >
>> >> > lto_balanced_map actually works in a way that looks for the cheapest
>> >> > cutpoint in the range 3/4*partition_size to 2*partition_size and picks
>> >> > the cheapest range.  Setting partition_size to this value will thus not
>> >> > cause the partitioner to produce only smaller partitions.  I suppose
>> >> > you should modify the conditional:
>> >> >
>> >> >   /* Partition is too large, unwind into step when best cost was reached and
>> >> >  start new partition.  */
>> >> >   if (partition->insns > 2 * partition_size)
>> >> >
>> >> > and/or in the code above set the partition_size to half of
>> >> > total_size/max_size.
>> >> >
>> >> > I know this is somewhat sloppy.  This was really just a first-cut
>> >> > implementation many years ago.  I expected to reimplement it smarter
>> >> > soon, but then there was never really a need for it (I am trying to
>> >> > avoid late IPA optimizations so the partitioning decisions should
>> >> > mostly affect compile time performance only).
>> >> > If ARM is more sensitive to partitioning, perhaps it would make sense
>> >> > to try to look for something smarter.
>> >> >
>> >> >> +
>> >> >>npartitions = 1;
>> >> >>partition = new_partition ("");
>> >> >>if (symtab->dump_file)
>> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> >> index 9dd513f..294b8a4 100644
>> >> >> --- a/gcc/lto/lto.c
>> >> >> +++ b/gcc/lto/lto.c
>> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >> >>timevar_pop (TV_WHOPR_WPA);
>> >> >>
>> >> >>timevar_push (TV_WHOPR_PARTITIONING);
>> >> >> +
>> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> >> +fatal_error (input_location, "--param max-lto-partition should only"
>> >> >> +  " be used with balanced partitioning\n");
>> >> >> +
>> >> >
>> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default.  The
>> >> > value you found experimentally may be a good start.  For that reason we
>> >> > can't really refuse a value when !LTO_PARTITION_BALANCED.  Just document
>> >> > it as a parameter for balanced partitioning only and add a parameter to
>> >> > lto_balanced_map specifying whether this param should be honored
>> >> > (because the same path is used for partitioning to one partition).
>> >> >
>> >> > Otherwise the patch looks good to me modulo missing documentation.
>> >> Thanks for the review. I have updated the patch.
>> >> Does this version look OK ?
>> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> >> for default.
>> >
>> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> > (thus roughly the number of instructions).  So with say an 8 byte
>> > instruction format it is on the order of 80kB.  You'd want to have a
>> > default of at least several ten times of large-unit-insns (also 10000).
>> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> > number quite small as well (and up it by a factor of 10).
>> Done in this version.
>
> I'd do that separately.
>
> Please no default parameter for lto_balanced_map (), instead change
> all callers.
>
>> Is it OK after bootstrap+test ?
>
> Note that this is for stage1 only.  I'll leave approval to Honza
> (also verification of the default max param - not sure if for example
> chromium or firefox should/will be split to more than 32 partitions
> with the patch)
Removed d
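
For reference, the partition-size clamping in the lto_balanced_map hunk
quoted above can be tried in isolation.  The following is only a minimal
sketch under stated assumptions: plan_partitions and struct partition_plan
are hypothetical names that do not exist in GCC, and min_size/max_size
stand in for PARAM_VALUE (MIN_PARTITION_SIZE) and
PARAM_VALUE (MAX_PARTITION_SIZE).  It mirrors only the arithmetic of the
hunk; as Honza points out, the partitioner may still place a cutpoint
anywhere up to 2*partition_size, so this alone is not a hard upper bound.

```c
/* Hypothetical result type; not part of GCC.  */
struct partition_plan
{
  int n_partitions;    /* Possibly increased partition count.  */
  int partition_size;  /* Target size per partition.  */
};

/* Sketch of the clamping arithmetic from the quoted hunk.
   Returns 0 on success and -1 on invalid input (including the
   min_size > max_size case the patch reports via fatal_error).  */
int
plan_partitions (int total_size, int n_partitions,
                 int min_size, int max_size,
                 struct partition_plan *plan)
{
  if (n_partitions <= 0 || max_size <= 0 || min_size > max_size)
    return -1;

  int size = total_size / n_partitions;
  if (size < min_size)
    size = min_size;
  else if (size > max_size)
    {
      /* Round the partition count up so that no partition needs to
         exceed max_size, then rebalance the target size.  */
      n_partitions = total_size / max_size;
      if (total_size % max_size)
        n_partitions++;
      size = total_size / n_partitions;
    }

  plan->n_partitions = n_partitions;
  plan->partition_size = size;
  return 0;
}
```

With a total size of 1000, two requested partitions and a maximum of 300,
the sketch arrives at 4 partitions with a target size of 250.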

Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-06 Thread Richard Biener
On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:

> On 5 April 2016 at 18:28, Richard Biener  wrote:
> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 5 April 2016 at 16:58, Richard Biener  wrote:
> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
> >> >> >
> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> >>varpool_order.qsort (varpool_node_cmp);
> >> >> >>
> >> >> >>/* Compute partition size and create the first partition.  */
> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> +fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> >> +
> >> >> >>partition_size = total_size / n_lto_partitions;
> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> +{
> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> + n_lto_partitions++;
> >> >> >> +  partition_size = total_size / n_lto_partitions;
> >> >> >> +}
> >> >> >
> >> >> > lto_balanced_map actually works in a way that looks for the cheapest
> >> >> > cutpoint in the range 3/4*partition_size to 2*partition_size and picks
> >> >> > the cheapest range.  Setting partition_size to this value will thus not
> >> >> > cause the partitioner to produce only smaller partitions.  I suppose
> >> >> > you should modify the conditional:
> >> >> >
> >> >> >   /* Partition is too large, unwind into step when best cost was reached and
> >> >> >  start new partition.  */
> >> >> >   if (partition->insns > 2 * partition_size)
> >> >> >
> >> >> > and/or in the code above set the partition_size to half of
> >> >> > total_size/max_size.
> >> >> >
> >> >> > I know this is somewhat sloppy.  This was really just a first-cut
> >> >> > implementation many years ago.  I expected to reimplement it smarter
> >> >> > soon, but then there was never really a need for it (I am trying to
> >> >> > avoid late IPA optimizations so the partitioning decisions should
> >> >> > mostly affect compile time performance only).
> >> >> > If ARM is more sensitive to partitioning, perhaps it would make sense
> >> >> > to try to look for something smarter.
> >> >> >
> >> >> >> +
> >> >> >>npartitions = 1;
> >> >> >>partition = new_partition ("");
> >> >> >>if (symtab->dump_file)
> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> >> index 9dd513f..294b8a4 100644
> >> >> >> --- a/gcc/lto/lto.c
> >> >> >> +++ b/gcc/lto/lto.c
> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> >>timevar_pop (TV_WHOPR_WPA);
> >> >> >>
> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
> >> >> >> +
> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> >> +fatal_error (input_location, "--param max-lto-partition should only"
> >> >> >> +  " be used with balanced partitioning\n");
> >> >> >> +
> >> >> >
> >> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default.  The
> >> >> > value you found experimentally may be a good start.  For that reason we
> >> >> > can't really refuse a value when !LTO_PARTITION_BALANCED.  Just document
> >> >> > it as a parameter for balanced partitioning only and add a parameter to
> >> >> > lto_balanced_map specifying whether this param should be honored
> >> >> > (because the same path is used for partitioning to one partition).
> >> >> >
> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> >> Thanks for the review. I have updated the patch.
> >> >> Does this version look OK ?
> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> >> for default.
> >> >
> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
> >> > (thus roughly the number of instructions).  So with say an 8 byte
> >> > instruction format it is on the order of 80kB.  You'd want to have a
> >> > default of at least several ten times of large-unit-insns (also 10000).
> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
> >> > number quite small as well (and up it by a factor of 10).
> >> Done in this version.
> >
> > I'd do that separately.
> >
> > Please no default parameter for lto_balanced_map (), instead change
> > all callers.
> >
> >> 

[PATCH] PR70117, ppc long double isinf

2016-04-06 Thread Alan Modra
On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
> In general the patch looks like a good approach to me but can we
> hide that
> 
> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
> 
> in a function somewhere in real.[ch]?

On looking in real.h, I see there is already a macro to do it.

Here's the revised version that properly tests the long double
subnormal limit.  Bootstrapped and regression tested
powerpc64le-linux.
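
For context, the generic folds touched by the patch rewrite the
classification built-ins as comparisons against the largest and smallest
normal values, as the comments in the diff state.  A minimal sketch of
those identities for plain double follows.  The sketch_* names are
hypothetical, and the sketch deliberately does not model the IBM extended
(double-double) case, where per the patch only the high-order double is
tested for Inf and NaN.

```c
#include <float.h>
#include <math.h>

/* isinf(x) -> isgreater(fabs(x), DBL_MAX).
   The quiet comparison macros return 0 for NaN operands.  */
int
sketch_isinf (double x)
{
  return isgreater (fabs (x), DBL_MAX) != 0;
}

/* isfinite(x) -> islessequal(fabs(x), DBL_MAX).  */
int
sketch_isfinite (double x)
{
  return islessequal (fabs (x), DBL_MAX) != 0;
}

/* isnormal(x) -> isgreaterequal(fabs(x), DBL_MIN)
                  & islessequal(fabs(x), DBL_MAX).
   The fold combines the two tests with a bitwise AND on integer
   results; a logical AND gives the same answer here.  */
int
sketch_isnormal (double x)
{
  double ax = fabs (x);
  return isgreaterequal (ax, DBL_MIN) && islessequal (ax, DBL_MAX);
}
```

Zero, subnormals, infinities and NaN all fall outside the
[DBL_MIN, DBL_MAX] range test, which is why a single pair of comparisons
is enough for isnormal on an ordinary IEEE double.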

gcc/
PR target/70117
* builtins.c (fold_builtin_classify): For IBM extended precision,
look at just the high-order double to test for NaN.
(fold_builtin_interclass_mathfn): Similarly for Inf.  For isnormal
test just the high double for Inf but both doubles for subnormal
limit.
gcc/testsuite/
* gcc.target/powerpc/pr70117.c: New.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9368ed0..9162838 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7529,6 +7529,8 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
 
   mode = TYPE_MODE (TREE_TYPE (arg));
 
+  bool is_ibm_extended = MODE_COMPOSITE_P (mode);
+
   /* If there is no optab, try generic code.  */
   switch (DECL_FUNCTION_CODE (fndecl))
 {
@@ -7538,10 +7540,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
   {
/* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
-   tree const type = TREE_TYPE (arg);
+   tree type = TREE_TYPE (arg);
REAL_VALUE_TYPE r;
char buf[128];
 
+   if (is_ibm_extended)
+ {
+   /* NaN and Inf are encoded in the high-order double value
+  only.  The low-order value is not significant.  */
+   type = double_type_node;
+   mode = DFmode;
+   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+ }
get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
real_from_string (&r, buf);
result = build_call_expr (isgr_fn, 2,
@@ -7554,10 +7564,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
   {
/* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-   tree const type = TREE_TYPE (arg);
+   tree type = TREE_TYPE (arg);
REAL_VALUE_TYPE r;
char buf[128];
 
+   if (is_ibm_extended)
+ {
+   /* NaN and Inf are encoded in the high-order double value
+  only.  The low-order value is not significant.  */
+   type = double_type_node;
+   mode = DFmode;
+   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+ }
get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
real_from_string (&r, buf);
result = build_call_expr (isle_fn, 2,
@@ -7577,21 +7595,72 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
/* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) &
   islessequal(fabs(x),DBL_MAX).  */
tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
-   tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
-   tree const type = TREE_TYPE (arg);
+   tree type = TREE_TYPE (arg);
+   tree orig_arg, max_exp, min_exp;
+   machine_mode orig_mode = mode;
REAL_VALUE_TYPE rmax, rmin;
char buf[128];
 
+   orig_arg = arg = builtin_save_expr (arg);
+   if (is_ibm_extended)
+ {
+   /* Use double to test the normal range of IBM extended
+  precision.  Emin for IBM extended precision is
+  different to emin for IEEE double, being 53 higher
+  since the low double exponent is at least 53 lower
+  than the high double exponent.  */
+   type = double_type_node;
+   mode = DFmode;
+   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
+ }
+   arg = fold_build1_loc (loc, ABS_EXPR, type, arg);
+
get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
real_from_string (&rmax, buf);
-   sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (mode)->emin - 1);
+   sprintf (buf, "0x1p%d", REAL_MODE_FORMAT (orig_mode)->emin - 1);
real_from_string (&rmin, buf);
-   arg = builtin_save_expr (fold_build1_loc (loc, ABS_EXPR, type, arg));
-   result = build_call_expr (isle_fn, 2, arg,
- build_real (type, rmax));
-   result = fold_build2 (BIT_AND_EXPR, integer_type_node, result,
- build_call_expr (isge_fn, 2, arg,
-  build_real (type, rmin)));
+   max_exp = build_real (type, rmax);
+   min_exp = build_real (type, rmin);
+
+   max_exp = build_call_expr (isle_fn, 2, arg, max_exp);
+   if (is_ibm_extende

[wwwdocs] Simplify gcc-4.8/cxx0x_status.html (and convert to global CSS)

2016-04-06 Thread Gerald Pfeifer
This is similar to what Jason reported last week for the main 
cxxstatus.html page.  

Things already improved by virtue of the global styles I introduced, 
this now completes things, simplifying the page (removing all those 
align="center"s and s) and using the global CSS sheet, so any 
changes can be made in one place going forward.

I plan on tackling the other C++ status pages in the coming days
as well.

Gerald

Simplify, using global CSS styles for C++ status tables, avoiding
duplication and manual markup.

Index: gcc-4.8/cxx0x_status.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/cxx0x_status.html,v
retrieving revision 1.8
diff -u -r1.8 cxx0x_status.html
--- gcc-4.8/cxx0x_status.html   24 Apr 2013 15:14:26 -  1.8
+++ gcc-4.8/cxx0x_status.html   5 Apr 2016 16:03:22 -
@@ -2,13 +2,6 @@
 
 
   Status of Experimental C++11 Support in GCC 4.8
-  
-/*  */
-  
 
 
 
@@ -38,7 +31,7 @@
 
 
 
-  
+  
 
   Language Feature
   Proposal
@@ -47,323 +40,324 @@
 
   Rvalue references
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html";>N2118
-   Yes
+   Yes
 
 
   Rvalue references for *this
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm";>N2439
-  4.8.1
+  4.8.1
 
 
   Initialization of class objects by rvalues
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html";>N1610
-  Yes
+  Yes
 
 
   Non-static data member initializers
   http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2756.htm";>N2756
-  Yes
+  Yes
 
 
   Variadic templates
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf";>N2242
-   Yes
+   Yes
 
 
   Extending variadic template template parameters
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2555.pdf";>N2555
-   Yes
+   Yes
 
 
   Initializer lists
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm";>N2672
-   Yes
+   Yes
 
 
   Static assertions
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html";>N1720
-   Yes
+   Yes
 
 
   auto-typed variables
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf";>N1984
-   Yes
+   Yes
 
 
   Multi-declarator auto
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf";>N1737
-   Yes
+   Yes
 
 
   Removal of auto as a storage-class specifier
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2546.htm";>N2546
-   Yes
+   Yes
 
 
   New function declarator syntax
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm";>N2541
-   Yes
+   Yes
 
 
   New wording for C++11 lambdas
   http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf";>N2927
-  Yes
+  Yes
 
 
   Declared type of an expression
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf";>N2343
-   Yes
+   Yes
 
 
   decltype and call expressions
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3276.pdf";>N3276
-  4.8.1
+  4.8.1
 
 
   Right angle brackets
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html";>N1757
-   Yes
+   Yes
 
 
   Default template arguments for function templates
   http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#226";>DR226
-   Yes
+   Yes
 
 
   Solving the SFINAE problem for expressions
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2634.html";>DR339
-   Yes
+   Yes
 
 
   Template aliases
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2258.pdf";>N2258
-   Yes
+   Yes
 
 
   Extern templates
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1987.htm";>N1987
-   Yes
+   Yes
 
 
   Null pointer constant
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2431.pdf";>N2431
-  Yes
+  Yes
 
 
   Strongly-typed enums
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2347.pdf";>N2347
-   Yes
+   Yes
 
 
   Forward declarations for enums
   
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2764.pdf";>N2764
   
-  Yes
+  Yes
 
 
   Generalized attributes
   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2761.pdf";>N2761
-  Yes
+  Yes
 
 
   Generalized constant expressions
   http://www.open-std.org/jtc1/sc22/wg21/doc

Re: [PATCH] PR70117, ppc long double isinf

2016-04-06 Thread Richard Biener
On Wed, Apr 6, 2016 at 10:31 AM, Alan Modra  wrote:
> On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
>> In general the patch looks like a good approach to me but can we
>> hide that
>>
>> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
>> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
>>
>> in a function somewhere in real.[ch]?
>
> On looking in real.h, I see there is already a macro to do it.
>
> Here's the revised version that properly tests the long double
> subnormal limit.  Bootstrapped and regression tested
> powerpc64le-linux.

Can you add a testcase or two for the isnormal () case?

I wonder whether the isnormal tests are too excessive to put in
inline code and thus libgcc code wouldn't be better to handle this...

At least the glibc implementation looks a lot simpler to me ...
(if ./sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c is the correct one).

Thus an alternative is to inline sth similar via the folding or via
an optab and not folding (I'd prefer the latter).

That said, did you inspect the generated code for a isnormal (x)
call for non-constant x?  What does XLC do here?

Richard.

> gcc/
> PR target/70117
> * builtins.c (fold_builtin_classify): For IBM extended precision,
> look at just the high-order double to test for NaN.
> (fold_builtin_interclass_mathfn): Similarly for Inf.  For isnormal
> test just the high double for Inf but both doubles for subnormal
> limit.
> gcc/testsuite/
> * gcc.target/powerpc/pr70117.c: New.
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 9368ed0..9162838 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -7529,6 +7529,8 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>
>mode = TYPE_MODE (TREE_TYPE (arg));
>
> +  bool is_ibm_extended = MODE_COMPOSITE_P (mode);
> +
>/* If there is no optab, try generic code.  */
>switch (DECL_FUNCTION_CODE (fndecl))
>  {
> @@ -7538,10 +7540,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>{
> /* isinf(x) -> isgreater(fabs(x),DBL_MAX).  */
> tree const isgr_fn = builtin_decl_explicit (BUILT_IN_ISGREATER);
> -   tree const type = TREE_TYPE (arg);
> +   tree type = TREE_TYPE (arg);
> REAL_VALUE_TYPE r;
> char buf[128];
>
> +   if (is_ibm_extended)
> + {
> +   /* NaN and Inf are encoded in the high-order double value
> +  only.  The low-order value is not significant.  */
> +   type = double_type_node;
> +   mode = DFmode;
> +   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> + }
> get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
> real_from_string (&r, buf);
> result = build_call_expr (isgr_fn, 2,
> @@ -7554,10 +7564,18 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
>{
> /* isfinite(x) -> islessequal(fabs(x),DBL_MAX).  */
> tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
> -   tree const type = TREE_TYPE (arg);
> +   tree type = TREE_TYPE (arg);
> REAL_VALUE_TYPE r;
> char buf[128];
>
> +   if (is_ibm_extended)
> + {
> +   /* NaN and Inf are encoded in the high-order double value
> +  only.  The low-order value is not significant.  */
> +   type = double_type_node;
> +   mode = DFmode;
> +   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> + }
> get_max_float (REAL_MODE_FORMAT (mode), buf, sizeof (buf));
> real_from_string (&r, buf);
> result = build_call_expr (isle_fn, 2,
> @@ -7577,21 +7595,72 @@ fold_builtin_interclass_mathfn (location_t loc, tree fndecl, tree arg)
> /* isnormal(x) -> isgreaterequal(fabs(x),DBL_MIN) &
>islessequal(fabs(x),DBL_MAX).  */
> tree const isle_fn = builtin_decl_explicit (BUILT_IN_ISLESSEQUAL);
> -   tree const isge_fn = builtin_decl_explicit (BUILT_IN_ISGREATEREQUAL);
> -   tree const type = TREE_TYPE (arg);
> +   tree type = TREE_TYPE (arg);
> +   tree orig_arg, max_exp, min_exp;
> +   machine_mode orig_mode = mode;
> REAL_VALUE_TYPE rmax, rmin;
> char buf[128];
>
> +   orig_arg = arg = builtin_save_expr (arg);
> +   if (is_ibm_extended)
> + {
> +   /* Use double to test the normal range of IBM extended
> +  precision.  Emin for IBM extended precision is
> +  different to emin for IEEE double, being 53 higher
> +  since the low double exponent is at least 53 lower
> +  than the high double exponent.  */
> +   type = double_type_node;
> +   mode = DFmode;
> +   arg = fold_build1_loc (loc, NOP_EXPR, type, arg);
> + }
> +   arg = fold_build1_loc (loc, ABS_EXPR, type, arg);
> +
> get_max_float (REAL_MODE_FORMAT (mo

Re: [patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Jonathan Wakely

On 06/04/16 09:39 +0200, Eric Botcazou wrote:

> Hi,
>
> we recently ran into build failures on Windows systems using a somewhat old
> grep, coming from a syntax error in the libstdc++-symbols.ver version file:
>
> # Symbol versioning for shared libraries.
> if ENABLE_SYMVERS
> libstdc++-symbols.ver:  ${glibcxx_srcdir}/$(SYMVER_FILE) \
> $(port_specific_symbol_files)
> cp ${glibcxx_srcdir}/$(SYMVER_FILE) $@.tmp
> chmod +w $@.tmp
> if test "x$(port_specific_symbol_files)" != x; then \
>   if grep '^# Appended to version file.' \
>    $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
> cat $(port_specific_symbol_files) >> $@.tmp; \
>   else \
> sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
> sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
> cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
> rm tmp.top tmp.bottom; \
>   fi; \
> fi
>
> Note the double /dev/null on the grep command line.  The first one causes the
> grep to fail when the command is invoked on these systems.  That's old code,
> but it is now invoked for config/abi/pre/float128.ver on the mainline and 5
> branch and this breaks the build on these systems (4.9 builds fine).
>
> This first /dev/null doesn't serve any useful purpose and seems to be a typo,


Doesn't it mean that if $port_specific_symbol_files contains only
whitespace we don't hang waiting for input from stdin? The 'if' above
it will be true when "x$port_specific_symbol_files" = "x " or similar.

I don't see any way for that to happen in the FSF tree, so it should
be safe. I'm a bit concerned about making that change this late in
stage 4 though. There isn't much time to find out if it breaks an
obscure target.




Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-06 Thread Prathamesh Kulkarni
On 6 April 2016 at 13:44, Richard Biener  wrote:
> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
>
>> On 5 April 2016 at 18:28, Richard Biener  wrote:
>> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >
>> >> On 5 April 2016 at 16:58, Richard Biener  wrote:
>> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
>> >> >
>> >> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
>> >> >> >
>> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
>> >> >> >> index 9eb63c2..bc0c612 100644
>> >> >> >> --- a/gcc/lto/lto-partition.c
>> >> >> >> +++ b/gcc/lto/lto-partition.c
>> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
>> >> >> >>varpool_order.qsort (varpool_node_cmp);
>> >> >> >>
>> >> >> >>/* Compute partition size and create the first partition.  */
>> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> +fatal_error (input_location, "min partition size cannot be greater than max partition size");
>> >> >> >> +
>> >> >> >>partition_size = total_size / n_lto_partitions;
>> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
>> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
>> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> +{
>> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
>> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
>> >> >> >> + n_lto_partitions++;
>> >> >> >> +  partition_size = total_size / n_lto_partitions;
>> >> >> >> +}
>> >> >> >
>> >> >> > lto_balanced_map actually works in a way that looks for the cheapest
>> >> >> > cutpoint in the range 3/4*partition_size to 2*partition_size and picks
>> >> >> > the cheapest range.  Setting partition_size to this value will thus not
>> >> >> > cause the partitioner to produce only smaller partitions.  I suppose
>> >> >> > you should modify the conditional:
>> >> >> >
>> >> >> >   /* Partition is too large, unwind into step when best cost was reached and
>> >> >> >  start new partition.  */
>> >> >> >   if (partition->insns > 2 * partition_size)
>> >> >> >
>> >> >> > and/or in the code above set the partition_size to half of
>> >> >> > total_size/max_size.
>> >> >> >
>> >> >> > I know this is somewhat sloppy.  This was really just a first-cut
>> >> >> > implementation many years ago.  I expected to reimplement it smarter
>> >> >> > soon, but then there was never really a need for it (I am trying to
>> >> >> > avoid late IPA optimizations so the partitioning decisions should
>> >> >> > mostly affect compile time performance only).
>> >> >> > If ARM is more sensitive to partitioning, perhaps it would make sense
>> >> >> > to try to look for something smarter.
>> >> >> >
>> >> >> >> +
>> >> >> >>npartitions = 1;
>> >> >> >>partition = new_partition ("");
>> >> >> >>if (symtab->dump_file)
>> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
>> >> >> >> index 9dd513f..294b8a4 100644
>> >> >> >> --- a/gcc/lto/lto.c
>> >> >> >> +++ b/gcc/lto/lto.c
>> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
>> >> >> >>timevar_pop (TV_WHOPR_WPA);
>> >> >> >>
>> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
>> >> >> >> +
>> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
>> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
>> >> >> >> +fatal_error (input_location, "--param max-lto-partition should only"
>> >> >> >> +  " be used with balanced partitioning\n");
>> >> >> >> +
>> >> >> >
>> >> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default.  The
>> >> >> > value you found experimentally may be a good start.  For that reason we
>> >> >> > can't really refuse a value when !LTO_PARTITION_BALANCED.  Just document
>> >> >> > it as a parameter for balanced partitioning only and add a parameter to
>> >> >> > lto_balanced_map specifying whether this param should be honored
>> >> >> > (because the same path is used for partitioning to one partition).
>> >> >> >
>> >> >> > Otherwise the patch looks good to me modulo missing documentation.
>> >> >> Thanks for the review. I have updated the patch.
>> >> >> Does this version look OK ?
>> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
>> >> >> for default.
>> >> >
>> >> > I think it's way too small.  This is roughly the number of GIMPLE stmts
>> >> > (thus roughly the number of instructions).  So with say an 8 byte
>> >> > instruction format it is on the order of 80kB.  You'd want to have a
>> >> > default of at least several ten times of large-unit-insns (also 10000).
>> >> > I'd choose sth like 1000000 (one million).  I find the lto-min-partition
>> >> > number quite small as well (and up it by a factor of 10).
>> >> 

Re: [patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Jakub Jelinek
On Wed, Apr 06, 2016 at 09:50:48AM +0100, Jonathan Wakely wrote:
> On 06/04/16 09:39 +0200, Eric Botcazou wrote:
> >we recently ran into build failures on Windows systems using a somewhat old
> >grep, coming from a syntax error in the libstdc++-symbols.ver version file:
> >
> ># Symbol versioning for shared libraries.
> >if ENABLE_SYMVERS
> >libstdc++-symbols.ver:  ${glibcxx_srcdir}/$(SYMVER_FILE) \
> > $(port_specific_symbol_files)
> > cp ${glibcxx_srcdir}/$(SYMVER_FILE) $@.tmp
> > chmod +w $@.tmp
> > if test "x$(port_specific_symbol_files)" != x; then \
> >   if grep '^# Appended to version file.' \
> > >$(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
> > cat $(port_specific_symbol_files) >> $@.tmp; \
> >   else \
> > sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
> > sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
> > cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
> > rm tmp.top tmp.bottom; \
> >   fi; \
> > fi
> >
> >Note the double /dev/null on the grep command line.  The first one causes the
> >grep to fail when the command is invoked on these systems.  That's old code,
> >but it is now invoked for config/abi/pre/float128.ver on the mainline and 5
> >branch and this breaks the build on these systems (4.9 builds fine).
> >
> >This first /dev/null doesn't serve any useful purpose and seems to be a typo,
> 
> Doesn't it mean that if $port_specific_symbol_files contains only
> whitespace we don't hang waiting for input from stdin? The 'if' above
> it will be true when "x$port_specific_symbol_files" = "x " or similar.
> 
> I don't see any way for that to happen in the FSF tree, so it should
> be safe. I'm a bit concerned about making that change this late in
> stage 4 though. There isn't much time to find out if it breaks an
> obscure target.

As it is a make variable, can't make be used to test this?
So perhaps
chmod +w $@.tmp
ifneq ($(port_specific_symbol_files),)
  if grep '^# Appended to version file.' \
   $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
cat $(port_specific_symbol_files) >> $@.tmp; \
  else \
sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
rm tmp.top tmp.bottom; \
  fi;
endif
?  Though, I think the initial and trailing whitespace is removed during
expansion (or already parsing of the vars), so even the
test "x$(port_specific_symbol_files)" != x
check should work right.

Jakub


Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-06 Thread Richard Biener
On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:

> On 6 April 2016 at 13:44, Richard Biener  wrote:
> > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 5 April 2016 at 18:28, Richard Biener  wrote:
> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >
> >> >> On 5 April 2016 at 16:58, Richard Biener  wrote:
> >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >> >> >
> >> >> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
> >> >> >> >
> >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> >> >> index 9eb63c2..bc0c612 100644
> >> >> >> >> --- a/gcc/lto/lto-partition.c
> >> >> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> >> >>varpool_order.qsort (varpool_node_cmp);
> >> >> >> >>
> >> >> >> >>/* Compute partition size and create the first partition.  */
> >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> +fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> >> >> +
> >> >> >> >>partition_size = total_size / n_lto_partitions;
> >> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> +{
> >> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> >> >> + n_lto_partitions++;
> >> >> >> >> +  partition_size = total_size / n_lto_partitions;
> >> >> >> >> +}
> >> >> >> >
> >> >> >> > lto_balanced_map actually works in a way that looks for the cheapest
> >> >> >> > cutpoint in the range 3/4*partition_size to 2*partition_size and picks
> >> >> >> > the cheapest range.  Setting partition_size to this value will thus not
> >> >> >> > cause the partitioner to produce smaller partitions only.  I suppose
> >> >> >> > you could modify the conditional:
> >> >> >> >
> >> >> >> >   /* Partition is too large, unwind into step when best cost was reached and
> >> >> >> >  start new partition.  */
> >> >> >> >   if (partition->insns > 2 * partition_size)
> >> >> >> >
> >> >> >> > and/or in the code above set the partition_size to half of
> >> >> >> > total_size/max_size.
> >> >> >> >
> >> >> >> > I know this is somewhat sloppy.  This was really just a first-cut
> >> >> >> > implementation many years ago.  I expected to reimplement it smarter
> >> >> >> > soon, but then there was never really a need for it (I am trying to
> >> >> >> > avoid late IPA optimizations so the partitioning decisions should mostly
> >> >> >> > affect compile time performance only).  If ARM is more sensitive to
> >> >> >> > partitioning, perhaps it would make sense to try to look for something
> >> >> >> > smarter.
> >> >> >> >
> >> >> >> >> +
> >> >> >> >>npartitions = 1;
> >> >> >> >>partition = new_partition ("");
> >> >> >> >>if (symtab->dump_file)
> >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> >> >> index 9dd513f..294b8a4 100644
> >> >> >> >> --- a/gcc/lto/lto.c
> >> >> >> >> +++ b/gcc/lto/lto.c
> >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> >> >>timevar_pop (TV_WHOPR_WPA);
> >> >> >> >>
> >> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
> >> >> >> >> +
> >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> >> >> +fatal_error (input_location, "--param max-lto-partition should only"
> >> >> >> >> +  " be used with balanced partitioning\n");
> >> >> >> >> +
> >> >> >> >
> >> >> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default.  The
> >> >> >> > value you found experimentally may be a good start.  For that reason we
> >> >> >> > can't really refuse a value when !LTO_PARTITION_BALANCED.  Just document
> >> >> >> > it as a parameter for balanced partitioning only and add a parameter to
> >> >> >> > lto_balanced_map specifying whether this param should be honored
> >> >> >> > (because the same path is used for partitioning to one partition).
> >> >> >> >
> >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> >> >> Thanks for the review. I have updated the patch.
> >> >> >> Does this version look OK ?
> >> >> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> >> >> for default.
> >> >> >
> >> >> > I think it's way too small.  This is roughly the number of GIMPLE 
> >> >> > stmts
> >> >> > (thus roughly the number of instructions).  So with say a 8 byte
> >> >> > instruction format it i

Re: [patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Jonathan Wakely

On 06/04/16 11:01 +0200, Jakub Jelinek wrote:
>On Wed, Apr 06, 2016 at 09:50:48AM +0100, Jonathan Wakely wrote:
>>On 06/04/16 09:39 +0200, Eric Botcazou wrote:
>>>we recently ran into build failures on Windows systems using a somewhat old
>>>grep, coming from a syntax error in the libstdc++-symbols.ver version file:
>>>
>>># Symbol versioning for shared libraries.
>>>if ENABLE_SYMVERS
>>>libstdc++-symbols.ver:  ${glibcxx_srcdir}/$(SYMVER_FILE) \
>>>$(port_specific_symbol_files)
>>>cp ${glibcxx_srcdir}/$(SYMVER_FILE) $@.tmp
>>>chmod +w $@.tmp
>>>if test "x$(port_specific_symbol_files)" != x; then \
>>>  if grep '^# Appended to version file.' \
>>>   $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
>>>cat $(port_specific_symbol_files) >> $@.tmp; \
>>>  else \
>>>sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
>>>sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
>>>cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
>>>rm tmp.top tmp.bottom; \
>>>  fi; \
>>>fi
>>>
>>>Note the double /dev/null on the grep command line.  The first one causes the
>>>grep to fail when the command is invoked on these systems.  That's old code,
>>>but it is now invoked for config/abi/pre/float128.ver on the mainline and 5
>>>branch and this breaks the build on these systems (4.9 builds fine).
>>>
>>>This first /dev/null doesn't serve any useful purpose and seems to be a typo,
>>
>>Doesn't it mean that if $port_specific_symbol_files contains only
>>whitespace we don't hang waiting for input from stdin? The 'if' above
>>it will be true when "x$port_specific_symbol_files" = "x " or similar.
>>
>>I don't see any way for that to happen in the FSF tree, so it should
>>be safe. I'm a bit concerned about making that change this late in
>>stage 4 though. There isn't much time to find out if it breaks an
>>obscure target.
>
>As it is a make variable, can't make be used to test this?
>So perhaps
>chmod +w $@.tmp
>ifneq ($(port_specific_symbol_files),)
>  if grep '^# Appended to version file.' \
>   $(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
>cat $(port_specific_symbol_files) >> $@.tmp; \
>  else \
>sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
>sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
>cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
>rm tmp.top tmp.bottom; \
>  fi;
>endif
>?  Though, I think the initial and trailing whitespace is removed during
>expansion (or already parsing of the vars), so even the
>test "x$(port_specific_symbol_files)" != x
>check should work right.

OK, I have no objection to the original patch then.



Re: [patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Jakub Jelinek
On Wed, Apr 06, 2016 at 10:12:18AM +0100, Jonathan Wakely wrote:
> >As it is a make variable, can't make be used to test this?
> >So perhaps
> > chmod +w $@.tmp
> >ifneq ($(port_specific_symbol_files),)
> >   if grep '^# Appended to version file.' \
> >$(port_specific_symbol_files) /dev/null > /dev/null 2>&1; then \
> > cat $(port_specific_symbol_files) >> $@.tmp; \
> >   else \
> > sed -n '1,/DO NOT DELETE/p' $@.tmp > tmp.top; \
> > sed -n '/DO NOT DELETE/,$$p' $@.tmp > tmp.bottom; \
> > cat tmp.top $(port_specific_symbol_files) tmp.bottom > $@.tmp; \
> > rm tmp.top tmp.bottom; \
> >   fi;
> >endif
> >?  Though, I think the initial and trailing whitespace is removed during
> >expansion (or already parsing of the vars), so even the
> >test "x$(port_specific_symbol_files)" != x
> >check should work right.
> 
> OK, I have no objection to the original patch then.

To correct myself, only leading whitespace is removed, trailing is not,
but when we do care about vars containing only whitespace, that means
removing everything.

Jakub


Re: [PATCH] PR70117, ppc long double isinf

2016-04-06 Thread Alan Modra
On Wed, Apr 06, 2016 at 10:46:48AM +0200, Richard Biener wrote:
> On Wed, Apr 6, 2016 at 10:31 AM, Alan Modra  wrote:
> > On Tue, Apr 05, 2016 at 11:29:30AM +0200, Richard Biener wrote:
> >> In general the patch looks like a good approach to me but can we
> >> hide that
> >>
> >> > +  const struct real_format *fmt = FLOAT_MODE_FORMAT (mode);
> >> > +  bool is_ibm_extended = fmt->pnan < fmt->p;
> >>
> >> in a function somewhere in real.[ch]?
> >
> > On looking in real.h, I see there is already a macro to do it.
> >
> > Here's the revised version that properly tests the long double
> > subnormal limit.  Bootstrapped and regression tested
> > powerpc64le-linux.
> 
> Can you add a testcase or two for the isnormal () case?

Sure.  I'll adapt the testcase I was using to verify the output,
attached in case you're interested.

> I wonder whether the isnormal tests are too excessive to put in
> inline code and thus libgcc code wouldn't be better to handle this...

Out-of-line would be better for -Os at least.

> At least the glibc implementation looks a lot simpler to me ...
> (if ./sysdeps/ieee754/ldbl-128ibm/s_fpclassifyl.c is the correct one).

It looks more or less the same to me, except done by bit twiddling on
integers.  :)

> Thus an alternative is to inline sth similar via the folding or via
> an optab and not folding (I'd prefer the latter).
> 
> That said, did you inspect the generated code for a isnormal (x)
> call for non-constant x?

Yes, I spent quite a bit of time fiddling trying to get optimal code.
I'm not claiming I succeeded..

>  What does XLC do here?

Not sure, sorry.  I don't have xlc handy.  Will try later.

-- 
Alan Modra
Australia Development Lab, IBM
int __attribute__ ((noclone, noinline))
isnormal (double x)
{
  return __builtin_isnormal (x);
}

int __attribute__ ((noclone, noinline))
isnormal_ld (long double x)
{
  return __builtin_isnormal (x);
}

double min_norm = 0x1p-1022;
double min_denorm = 0x1p-1074;
double ld_low = 0x1p-969;

int
main (void)
{
  static union { long double ld; unsigned long l[2]; } x;

  __builtin_printf ("%a %d\n", min_norm, isnormal (min_norm));
  __builtin_printf ("%a %d\n", min_norm * 0.5, isnormal (min_norm * 0.5));

  x.ld = ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = -ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = -min_norm * 0.5;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = min_norm * 0.5;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = -min_norm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = min_norm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = -min_denorm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = min_denorm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = min_denorm;
  x.ld += ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  x.ld = -min_denorm;
  x.ld -= ld_low;
  __builtin_printf ("%La (%016lx %016lx) %d\n", x.ld, x.l[0], x.l[1],
		isnormal_ld (x.ld));
  return 0;
}
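
For comparison with the "bit twiddling on integers" style mentioned above, here is a minimal sketch for a plain IEEE 754 binary64 double. It is an illustration only: isnormal_bits is a hypothetical helper, it is not the glibc ldbl-128ibm code, and it does not handle the IBM extended long double case that the patch is about.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from GCC or glibc: report whether an
   IEEE 754 binary64 value is normal by inspecting its exponent field.  */
static int
isnormal_bits (double x)
{
  uint64_t bits;

  /* Assumes double is the 64-bit IEEE format.  */
  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);

  /* Bits 52..62 hold the biased exponent.  Zero means zero or subnormal,
     0x7ff means infinity or NaN; everything else is a normal number.  */
  unsigned int exponent = (unsigned int) ((bits >> 52) & 0x7ff);
  return exponent != 0 && exponent != 0x7ff;
}
```

With this helper, min_norm from the test above (0x1p-1022) is classified as normal, while min_norm * 0.5 is not.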


Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size

2016-04-06 Thread Richard Biener
On Wed, 6 Apr 2016, Richard Biener wrote:

> On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> 
> > On 6 April 2016 at 13:44, Richard Biener  wrote:
> > > On Wed, 6 Apr 2016, Prathamesh Kulkarni wrote:
> > >
> > >> On 5 April 2016 at 18:28, Richard Biener  wrote:
> > >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> > >> >
> > >> >> On 5 April 2016 at 16:58, Richard Biener  wrote:
> > >> >> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> > >> >> >
> > >> >> >> On 4 April 2016 at 19:44, Jan Hubicka  wrote:
> > >> >> >> >
> > >> >> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> > >> >> >> >> index 9eb63c2..bc0c612 100644
> > >> >> >> >> --- a/gcc/lto/lto-partition.c
> > >> >> >> >> +++ b/gcc/lto/lto-partition.c
> > >> >> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> > >> >> >> >>varpool_order.qsort (varpool_node_cmp);
> > >> >> >> >>
> > >> >> >> >>/* Compute partition size and create the first partition.  */
> > >> >> >> >> +  if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> +fatal_error (input_location, "min partition size cannot be greater than max partition size");
> > >> >> >> >> +
> > >> >> >> >>partition_size = total_size / n_lto_partitions;
> > >> >> >> >>if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> > >> >> >> >>  partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> > >> >> >> >> +  else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> +{
> > >> >> >> >> +  n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> > >> >> >> >> +  if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> > >> >> >> >> + n_lto_partitions++;
> > >> >> >> >> +  partition_size = total_size / n_lto_partitions;
> > >> >> >> >> +}
> > >> >> >> >
> > >> >> >> > lto_balanced_map actually works in a way that looks for the cheapest
> > >> >> >> > cutpoint in the range 3/4*partition_size to 2*partition_size and picks
> > >> >> >> > the cheapest range.  Setting partition_size to this value will thus not
> > >> >> >> > cause the partitioner to produce smaller partitions only.  I suppose
> > >> >> >> > you could modify the conditional:
> > >> >> >> >
> > >> >> >> >   /* Partition is too large, unwind into step when best cost was reached and
> > >> >> >> >  start new partition.  */
> > >> >> >> >   if (partition->insns > 2 * partition_size)
> > >> >> >> >
> > >> >> >> > and/or in the code above set the partition_size to half of
> > >> >> >> > total_size/max_size.
> > >> >> >> >
> > >> >> >> > I know this is somewhat sloppy.  This was really just a first-cut
> > >> >> >> > implementation many years ago.  I expected to reimplement it smarter
> > >> >> >> > soon, but then there was never really a need for it (I am trying to
> > >> >> >> > avoid late IPA optimizations so the partitioning decisions should mostly
> > >> >> >> > affect compile time performance only).  If ARM is more sensitive to
> > >> >> >> > partitioning, perhaps it would make sense to try to look for something
> > >> >> >> > smarter.
> > >> >> >> >
> > >> >> >> >> +
> > >> >> >> >>npartitions = 1;
> > >> >> >> >>partition = new_partition ("");
> > >> >> >> >>if (symtab->dump_file)
> > >> >> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> > >> >> >> >> index 9dd513f..294b8a4 100644
> > >> >> >> >> --- a/gcc/lto/lto.c
> > >> >> >> >> +++ b/gcc/lto/lto.c
> > >> >> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> > >> >> >> >>timevar_pop (TV_WHOPR_WPA);
> > >> >> >> >>
> > >> >> >> >>timevar_push (TV_WHOPR_PARTITIONING);
> > >> >> >> >> +
> > >> >> >> >> +  if (flag_lto_partition != LTO_PARTITION_BALANCED
> > >> >> >> >> +  && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> > >> >> >> >> +fatal_error (input_location, "--param max-lto-partition should only"
> > >> >> >> >> +  " be used with balanced partitioning\n");
> > >> >> >> >> +
> > >> >> >> >
> > >> >> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default.  The
> > >> >> >> > value you found experimentally may be a good start.  For that reason we
> > >> >> >> > can't really refuse a value when !LTO_PARTITION_BALANCED.  Just document
> > >> >> >> > it as a parameter for balanced partitioning only and add a parameter to
> > >> >> >> > lto_balanced_map specifying whether this param should be honored
> > >> >> >> > (because the same path is used for partitioning to one partition).
> > >> >> >> >
> > >> >> >> > Otherwise the patch looks good to me modulo missing documentation.
> > >> >> >> Thanks for the review. I have updated the patch.
> > >> >> >> Does this version look OK ?
> > >> >> >> I had randomly chosen 1, not sure if th

[Patch AArch64 1/3] Enable CRC by default for armv8.1-a

2016-04-06 Thread James Greenhalgh

Hi,

This change reflects binutils support for CRC, where it is always enabled
for armv8.1-a.

OK?

Thanks,
James

---
2016-04-06  James Greenhalgh  

* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8_1): Also add
AARCH64_FL_CRC.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 7750d1c..15d7e40 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -145,7 +145,7 @@ extern unsigned aarch64_architecture_version;
 /* Architecture flags that effect instruction selection.  */
 #define AARCH64_FL_FOR_ARCH8   (AARCH64_FL_FPSIMD)
 #define AARCH64_FL_FOR_ARCH8_1			   \
-  (AARCH64_FL_FOR_ARCH8 | AARCH64_FL_LSE | AARCH64_FL_V8_1)
+  (AARCH64_FL_FOR_ARCH8 | AARCH64_FL_LSE | AARCH64_FL_CRC | AARCH64_FL_V8_1)
 
 /* Macros to test ISA flags.  */
 


[Patch AArch64 0/3] Fix PR70133

2016-04-06 Thread James Greenhalgh
Hi,

This patch set fixes PR70133, which is a bug in the way we handle extension
strings after using -march or -mcpu=native. In investigating this, I found
other bugs in the way we communicate architecture intention between the
compiler and the assembler.

This patch set cleans this up somewhat.

Tested on a Cortex-A57 based board, a Cortex-A57/Cortex-A53 big.LITTLE
based board, a Cortex-A72/Cortex-A53 big.LITTLE based board and an xgene1
based board. I don't have access to the board in the bug report, but I fed
representative data to the detection code to check that worked too.

The patch set goes as follows...

[Patch AArch64 1/3] Enable CRC by default for armv8.1-a

  The assembler will enable CRC by default for -march=armv8.1-a, and we should
  follow that expectation in GCC.

[Patch AArch64 2/3] Rework the code to print extension strings (pr70133)

  There are a number of bugs that come from the way we enable and disable
  extension strings. Rework this code so we always put out a safe and minimal
  set of flags for a -march/-mcpu input.

[Patch AArch64 3/3] Fix up for pr70133

  Use the infrastructure added in 2/3 to fix the PR.

OK for trunk?

Thanks,
James

---
[Patch AArch64 1/3] Enable CRC by default for armv8.1-a

2016-04-06  James Greenhalgh  

* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8_1): Also add
AARCH64_FL_CRC.


[Patch AArch64 2/3] Rework the code to print extension strings (pr70133)

gcc/

2016-04-06  James Greenhalgh  

PR target/70133
* common/config/aarch64/aarch64-common.c (aarch64_option_extension): Keep
track of a canonical flag name.
(all_extensions): Likewise.
(arch_to_arch_name): Also track extension flags enabled by the arch.
(all_architectures): Likewise.
(aarch64_parse_extension): Move to here.
(aarch64_get_extension_string_for_isa_flags): Take a new argument,
rework.
(aarch64_rewrite_selected_cpu): Update for above change.
* config/aarch64/aarch64-option-extensions.def: Rework the way flags
are handled, such that the single explicit value enabled by an
extension is kept separate from the implicit values it also enables.
* config/aarch64/aarch64-protos.h (aarch64_parse_opt_result): Move
to here.
(aarch64_parse_extension): New.
* config/aarch64/aarch64.c (aarch64_parse_opt_result): Move from
here to config/aarch64/aarch64-protos.h.
(aarch64_parse_extension): Move from here to
common/config/aarch64/aarch64-common.c.
(aarch64_option_print): Update.
(aarch64_declare_function_name): Likewise.
(aarch64_start_file): Likewise.
* config/aarch64/driver-aarch64.c (arch_extension): Keep track of
the canonical flag for extensions.
* config.gcc (aarch64*-*-*): Extend regex for capturing extension
flags.

gcc/testsuite/

2016-04-06  James Greenhalgh  

PR target/70133
* gcc.target/aarch64/mgeneral-regs_4.c: Fix expected output.
* gcc.target/aarch64/target_attr_15.c: Likewise.

[Patch AArch64 3/3] Fix up for pr70133

2016-04-06  James Greenhalgh  

PR target/70133

* config/aarch64/driver-aarch64.c
(aarch64_get_extension_string_for_isa_flags): New.
(arch_extension): Rename to...
(aarch64_arch_extension): ...This.
(ext_to_feat_string): Rename to...
(aarch64_extensions): ...This.
(aarch64_core_data): Keep track of architecture extension flags.
(cpu_data): Rename to...
(aarch64_cpu_data): ...This.
(aarch64_arch_driver_info): Keep track of architecture extension
flags.
(get_arch_name_from_id): Rename to...
(get_arch_from_id): ...This, change return type.
(host_detect_local_cpu): Update and reformat for renames, handle
extensions through common infrastructure.



[Patch AArch64 2/3] Rework the code to print extension strings (pr70133)

2016-04-06 Thread James Greenhalgh

Hi,

This patch aims to ensure that when we say:

  -march=armv8-a+nosimd

We communicate that to the assembler in a way it understands.

On trunk, we'll put out a directive that says

  .arch armv8-a+fp

Which has two issues; it is not enough to disable the simd extensions, and
it adds a +fp that is already implied by the .arch.  Rather, we should
be emitting:

  .arch armv8-a+nosimd

Fixing this is reasonably easy. The size of this patch comes from some
refactoring to move aarch64_parse_extension from config/aarch64/aarch64.c
to common/config/aarch64/aarch64-common.c.  We also need to modify
config/aarch64/aarch64-option-extensions.def to start keeping track of
a canonical name for each flag. Doing this means updating config.gcc with
a new regex.

We also need to fix up two testcases that look for an incorrect .arch
directive in the assembler output.

We need this to rationalise the way we put out .arch strings in
preparation for the next patch in the series, and to fix the existing bugs
and deficiencies in our .arch handling.

As an exception +crc is only sometimes guaranteed to be enabled by default
for ARMv8.1-A, so we always have to put this out.
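
The idea of emitting a minimal extension suffix relative to what the architecture already implies can be sketched as follows. This is a simplified model under assumptions, not the code in the patch: the flag bits, the exts table and extension_suffix are all hypothetical, and the real implementation also has to deal with flags that one extension implies for another and with the +crc exception described above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag bits; the real values come from
   aarch64-option-extensions.def and are not reproduced here.  */
enum { FL_FP = 1 << 0, FL_SIMD = 1 << 1, FL_CRYPTO = 1 << 2, FL_CRC = 1 << 3 };

struct ext
{
  const char *name;
  unsigned long flag_canonical;
};

static const struct ext exts[] = {
  { "fp", FL_FP }, { "simd", FL_SIMD }, { "crypto", FL_CRYPTO }, { "crc", FL_CRC },
};

/* Write into BUF the suffix that turns DEFAULT_FLAGS (what the architecture
   enables on its own) into ISA_FLAGS: "+name" for each extension that was
   added and "+noname" for each default that was removed.  */
static void
extension_suffix (unsigned long isa_flags, unsigned long default_flags,
                  char *buf, size_t len)
{
  assert (len > 0);
  buf[0] = '\0';
  for (size_t i = 0; i < sizeof exts / sizeof exts[0]; i++)
    {
      unsigned long f = exts[i].flag_canonical;

      /* Same state as the architecture default: nothing to print.  */
      if ((isa_flags & f) == (default_flags & f))
        continue;

      size_t used = strlen (buf);
      snprintf (buf + used, len - used, "+%s%s",
                (isa_flags & f) ? "" : "no", exts[i].name);
    }
}
```

In this model, an architecture whose defaults are FL_FP | FL_SIMD combined with ISA flags of just FL_FP gives "+nosimd" rather than "+fp", which mirrors the -march=armv8-a+nosimd example above.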

Bootstrapped on aarch64-none-linux-gnu and tested for the defaults, and
with an explicit -march=native passed to dejagnu.

OK?

Thanks,
James

---
gcc/

2016-04-06  James Greenhalgh  

PR target/70133
* common/config/aarch64/aarch64-common.c (aarch64_option_extension): Keep
track of a canonical flag name.
(all_extensions): Likewise.
(arch_to_arch_name): Also track extension flags enabled by the arch.
(all_architectures): Likewise.
(aarch64_parse_extension): Move to here.
(aarch64_get_extension_string_for_isa_flags): Take a new argument,
rework.
(aarch64_rewrite_selected_cpu): Update for above change.
* config/aarch64/aarch64-option-extensions.def: Rework the way flags
are handled, such that the single explicit value enabled by an
extension is kept separate from the implicit values it also enables.
* config/aarch64/aarch64-protos.h (aarch64_parse_opt_result): Move
to here.
(aarch64_parse_extension): New.
* config/aarch64/aarch64.c (aarch64_parse_opt_result): Move from
here to config/aarch64/aarch64-protos.h.
(aarch64_parse_extension): Move from here to
common/config/aarch64/aarch64-common.c.
(aarch64_option_print): Update.
(aarch64_declare_function_name): Likewise.
(aarch64_start_file): Likewise.
* config/aarch64/driver-aarch64.c (arch_extension): Keep track of
the canonical flag for extensions.
* config.gcc (aarch64*-*-*): Extend regex for capturing extension
flags.

gcc/testsuite/

2016-04-06  James Greenhalgh  

PR target/70133
* gcc.target/aarch64/mgeneral-regs_4.c: Fix expected output.
* gcc.target/aarch64/target_attr_15.c: Likewise.

diff --git a/gcc/common/config/aarch64/aarch64-common.c b/gcc/common/config/aarch64/aarch64-common.c
index 4969f07..08e7959 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -112,6 +112,7 @@ struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
 struct aarch64_option_extension
 {
   const char *const name;
+  const unsigned long flag_canonical;
   const unsigned long flags_on;
   const unsigned long flags_off;
 };
@@ -119,11 +120,11 @@ struct aarch64_option_extension
 /* ISA extensions in AArch64.  */
 static const struct aarch64_option_extension all_extensions[] =
 {
-#define AARCH64_OPT_EXTENSION(NAME, FLAGS_ON, FLAGS_OFF, FEATURE_STRING) \
-  {NAME, FLAGS_ON, FLAGS_OFF},
+#define AARCH64_OPT_EXTENSION(NAME, FLAG_CANONICAL, FLAGS_ON, FLAGS_OFF, Z) \
+  {NAME, FLAG_CANONICAL, FLAGS_ON, FLAGS_OFF},
 #include "config/aarch64/aarch64-option-extensions.def"
 #undef AARCH64_OPT_EXTENSION
-  {NULL, 0, 0}
+  {NULL, 0, 0, 0}
 };
 
 struct processor_name_to_arch
@@ -137,6 +138,7 @@ struct arch_to_arch_name
 {
   const enum aarch64_arch arch;
   const std::string arch_name;
+  const unsigned long flags;
 };
 
 /* Map processor names to the architecture revision they implement and
@@ -155,26 +157,111 @@ static const struct processor_name_to_arch all_cores[] =
 static const struct arch_to_arch_name all_architectures[] =
 {
 #define AARCH64_ARCH(NAME, CORE, ARCH_IDENT, ARCH, FLAGS) \
-  {AARCH64_ARCH_##ARCH_IDENT, NAME},
+  {AARCH64_ARCH_##ARCH_IDENT, NAME, FLAGS},
 #include "config/aarch64/aarch64-arches.def"
 #undef AARCH64_ARCH
-  {aarch64_no_arch, ""}
+  {aarch64_no_arch, "", 0}
 };
 
-/* Return a string representation of ISA_FLAGS.  */
+/* Parse the architecture extension string STR and update ISA_FLAGS
+   with the architecture features turned on or off.  Return a
+   aarch64_parse_opt_result describing the result.  */
+
+enum aarch64_parse_opt_result
+aarch64_parse_extension (const char *str, unsigned long *isa_flags)
+{
+

[Patch AArch64 3/3] Fix up for pr70133

2016-04-06 Thread James Greenhalgh

Hi,

Having updated the way we parse and output extension strings, now we just
need to wire up the native detection to use these new features.

In doing some cleanup and renaming I ended up converting 8 spaces to tabs in
about half the file. I've done the rest while I'm here to save us from
a mixed-style file.

Bootstrapped on aarch64-none-linux-gnu, then tested with defaults and
an explicit -march=native passed (on a system detected as
cortex-a57+crypto, and again on a system detected as
cortex-a72.cortex-a53+crypto). I also set up a dummy /proc/cpuinfo and
used that to manually check the input data in pr70133.

OK?

Thanks,
James

---
2016-04-06  James Greenhalgh  

PR target/70133

* config/aarch64/driver-aarch64.c
(aarch64_get_extension_string_for_isa_flags): New.
(arch_extension): Rename to...
(aarch64_arch_extension): ...This.
(ext_to_feat_string): Rename to...
(aarch64_extensions): ...This.
(aarch64_core_data): Keep track of architecture extension flags.
(cpu_data): Rename to...
(aarch64_cpu_data): ...This.
(aarch64_arch_driver_info): Keep track of architecture extension
flags.
(get_arch_name_from_id): Rename to...
(get_arch_from_id): ...This, change return type.
(host_detect_local_cpu): Update and reformat for renames, handle
extensions through common infrastructure.

diff --git a/gcc/config/aarch64/driver-aarch64.c b/gcc/config/aarch64/driver-aarch64.c
index 8925ec1..ce771ec 100644
--- a/gcc/config/aarch64/driver-aarch64.c
+++ b/gcc/config/aarch64/driver-aarch64.c
@@ -18,9 +18,16 @@
.  */
 
 #include "config.h"
+#define INCLUDE_STRING
 #include "system.h"
+#include "coretypes.h"
+#include "tm.h"
 
-struct arch_extension
+/* Defined in common/config/aarch64/aarch64-common.c.  */
+std::string aarch64_get_extension_string_for_isa_flags (unsigned long,
+			unsigned long);
+
+struct aarch64_arch_extension
 {
   const char *ext;
   unsigned int flag;
@@ -29,7 +36,7 @@ struct arch_extension
 
 #define AARCH64_OPT_EXTENSION(EXT_NAME, FLAG_CANONICAL, FLAGS_ON, FLAGS_OFF, FEATURE_STRING) \
   { EXT_NAME, FLAG_CANONICAL, FEATURE_STRING },
-static struct arch_extension ext_to_feat_string[] =
+static struct aarch64_arch_extension aarch64_extensions[] =
 {
 #include "aarch64-option-extensions.def"
 };
@@ -42,15 +49,16 @@ struct aarch64_core_data
   const char* arch;
   const char* implementer_id;
   const char* part_no;
+  const unsigned long flags;
 };
 
 #define AARCH64_CORE(CORE_NAME, CORE_IDENT, SCHED, ARCH, FLAGS, COSTS, IMP, PART) \
-  { CORE_NAME, #ARCH, IMP, PART },
+  { CORE_NAME, #ARCH, IMP, PART, FLAGS },
 
-static struct aarch64_core_data cpu_data [] =
+static struct aarch64_core_data aarch64_cpu_data[] =
 {
 #include "aarch64-cores.def"
-  { NULL, NULL, NULL, NULL }
+  { NULL, NULL, NULL, NULL, 0 }
 };
 
 #undef AARCH64_CORE
@@ -59,37 +67,37 @@ struct aarch64_arch_driver_info
 {
   const char* id;
   const char* name;
+  const unsigned long flags;
 };
 
 #define AARCH64_ARCH(NAME, CORE, ARCH_IDENT, ARCH_REV, FLAGS) \
-  { #ARCH_IDENT, NAME  },
+  { #ARCH_IDENT, NAME, FLAGS },
 
-static struct aarch64_arch_driver_info aarch64_arches [] =
+static struct aarch64_arch_driver_info aarch64_arches[] =
 {
 #include "aarch64-arches.def"
-  {NULL, NULL}
+  {NULL, NULL, 0}
 };
 
 #undef AARCH64_ARCH
 
-/* Return the full architecture name string corresponding to the
-   identifier ID.  */
+/* Return an aarch64_arch_driver_info for the architecture described
+   by ID, or NULL if ID describes something we don't know about.  */
 
-static const char*
-get_arch_name_from_id (const char* id)
+static struct aarch64_arch_driver_info*
+get_arch_from_id (const char* id)
 {
   unsigned int i = 0;
 
   for (i = 0; aarch64_arches[i].id != NULL; i++)
 {
   if (strcmp (id, aarch64_arches[i].id) == 0)
-return aarch64_arches[i].name;
+	return &aarch64_arches[i];
 }
 
   return NULL;
 }
 
-
 /* Check wether the string CORE contains the same CPU part numbers
as BL_STRING.  For example CORE="{0xd03, 0xd07}" and BL_STRING="0xd07.0xd03"
should return true.  */
@@ -98,7 +106,7 @@ static bool
 valid_bL_string_p (const char** core, const char* bL_string)
 {
   return strstr (bL_string, core[0]) != NULL
- && strstr (bL_string, core[1]) != NULL;
+&& strstr (bL_string, core[1]) != NULL;
 }
 
 /*  Return true iff ARR contains STR in one of its two elements.  */
@@ -142,7 +150,7 @@ host_detect_local_cpu (int argc, const char **argv)
 {
   const char *arch_id = NULL;
   const char *res = NULL;
-  static const int num_exts = ARRAY_SIZE (ext_to_feat_string);
+  static const int num_exts = ARRAY_SIZE (aarch64_extensions);
   char buf[128];
   FILE *f = NULL;
   bool arch = false;
@@ -156,6 +164,8 @@ host_detect_local_cpu (int argc, const char **argv)
   unsigned int n_imps = 0;
   bool processed_exts = false;
   const char *ext_string = ""
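
The change to get_arch_from_id (returning a pointer to the whole table entry instead of just the name, so that callers can also reach the new flags field) can be sketched in isolation.  This is only an illustration: the struct, the table entries and the function name lookup_arch are hypothetical stand-ins, not the contents of aarch64-arches.def.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for aarch64_arch_driver_info: an id, a name,
   and the extension flags that the patch starts tracking.  */
struct arch_info
{
  const char *id;
  const char *name;
  unsigned long flags;
};

/* Illustrative entries only; the real table is generated from a .def
   file.  The all-NULL entry terminates the table.  */
static struct arch_info arches[] =
{
  { "A", "example-arch-a", 0x1UL },
  { "B", "example-arch-b", 0x3UL },
  { NULL, NULL, 0 }
};

/* Return the entry whose id matches ID, or NULL if ID is unknown.  */
static struct arch_info *
lookup_arch (const char *id)
{
  for (size_t i = 0; arches[i].id != NULL; i++)
    if (strcmp (id, arches[i].id) == 0)
      return &arches[i];

  return NULL;
}
```

A caller would check the result against NULL and then read both name and flags from the returned entry.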

Re: [PATCH] PR70117, ppc long double isinf

2016-04-06 Thread Andreas Schwab
Alan Modra  writes:

> diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c 
> b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> new file mode 100644
> index 000..99e6f19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } 
> || { powerpc*-*-linux* && lp64 } } } } */

Any reason why it is restricted to lp64?

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Unreviewed patch

2016-04-06 Thread Rainer Orth
The following patch has remained unreviewed for a week:

[testsuite, sparcv9] Fix gcc.dg/ifcvt-4.c on 64-bit SPARC (PR 
rtl-optimization/68749)
https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01631.html

Although it's testsuite-only, I'm quite reluctant to make potential
semantic changes to a testcase without approval from the subject-matter
expert.

Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR 31531: A microoptimization of isnegative of signed integer

2016-04-06 Thread Richard Biener
On Tue, Feb 16, 2016 at 5:50 AM, Hurugalawadi, Naveen
 wrote:
> Hi,
>
>>> I'm also failing to see why you can't enhance the existing
>
> Please find attached the patch that enhances the existing pattern.
> Please review the patch and let me know if any further modifications
> are required.

What's the motivation for splitting this into an equal-type case (broken
for the vector case) and a non-equal-type case?  Simply allow only
nop-conversions here (tree_nop_conversion_p) and unconditionally emit

 (scmp (view_convert:newtype @0) (bit_not @1))

?  The conversion will be omitted if it turns out not to be necessary,
and a view_convert will be turned into a regular conversion for
non-vector cases.
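
For plain signed integers, the identity behind such a rewrite can be checked with a short C sketch.  It assumes that scmp denotes the swapped comparison, so that "~x < c" becomes "x > ~c", and it says nothing about the conversion or vector cases discussed above; the function names are made up for the illustration.

```c
#include <stdbool.h>

/* Original form: compare the complement of X against C.  */
static bool
not_less_than (int x, int c)
{
  return ~x < c;
}

/* Rewritten form: swap the comparison and complement the constant.  */
static bool
swapped_greater_than (int x, int c)
{
  return x > ~c;
}

/* Return true if both forms agree for every pair in [-LIMIT, LIMIT].  */
static bool
forms_agree (int limit)
{
  for (int x = -limit; x <= limit; x++)
    for (int c = -limit; c <= limit; c++)
      if (not_less_than (x, c) != swapped_greater_than (x, c))
        return false;

  return true;
}
```

On a two's complement target ~x equals -x - 1, so "~x < c" is the same as "x > -c - 1", which is "x > ~c".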

Richard.

> Thanks,
> Naveen


[PATCH, testsuite/ARM] Skip pr70496.c for cortex-m devices

2016-04-06 Thread Thomas Preudhomme
Hi,

Testcase in gcc.target/arm/pr70496.c uses an .arm directive so assumes the 
target has an ARM execution state. This patch adds a dg-skip-if directive to 
skip that test on Cortex-M targets since they don't have such an execution 
state.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2016-04-06  Thomas Preud'homme  

PR testsuite/70553
* gcc.target/arm/pr70496.c: Skip for ARM Cortex-M targets.


diff --git a/gcc/testsuite/gcc.target/arm/pr70496.c b/gcc/testsuite/gcc.target/arm/pr70496.c
index 89957e2c7a75cb89153b3e3fc34d8051b6a997d1..548a8243059ddaec63ed897dc67f4751d806a065 100644
--- a/gcc/testsuite/gcc.target/arm/pr70496.c
+++ b/gcc/testsuite/gcc.target/arm/pr70496.c
@@ -1,6 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-options "-mthumb -O2" } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "does not have ARM state" { arm_cortex_m } } */
 
 int i;
 void




Is this ok for trunk?

Best regards,

Thomas


Re: [PATCH, testsuite/ARM] Skip pr70496.c for cortex-m devices

2016-04-06 Thread Kyrill Tkachov

Hi Thomas,

On 06/04/16 12:03, Thomas Preudhomme wrote:

Hi,

Testcase in gcc.target/arm/pr70496.c uses an .arm directive so assumes the
target has an ARM execution state. This patch adds a dg-skip-if directive to
skip that test on Cortex-M targets since they don't have such an execution
state.

ChangeLog entry is as follows:


*** gcc/testsuite/ChangeLog ***

2016-04-06  Thomas Preud'homme  

 PR testsuite/70553
 * gcc.target/arm/pr70496.c: Skip for ARM Cortex-M targets.


diff --git a/gcc/testsuite/gcc.target/arm/pr70496.c
b/gcc/testsuite/gcc.target/arm/pr70496.c
index
89957e2c7a75cb89153b3e3fc34d8051b6a997d1..548a8243059ddaec63ed897dc67f4751d806a065
100644
--- a/gcc/testsuite/gcc.target/arm/pr70496.c
+++ b/gcc/testsuite/gcc.target/arm/pr70496.c
@@ -1,6 +1,7 @@
  /* { dg-do assemble } */
  /* { dg-options "-mthumb -O2" } */
  /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-skip-if "does not have ARM state" { arm_cortex_m } } */
  


Would it be better to just require the arm_arm_ok effective target?
That should try to compile a test with -marm added to the command,
which should fail for Cortex-M targets.

Thanks,
Kyrill


  int i;
  void




Is this ok for trunk?

Best regards,

Thomas





Compile libcilkrts with -funwind-tables (PR target/60290)

2016-04-06 Thread Rainer Orth
I've finally gotten around to analyzing this testsuite failure on 32-bit
Solaris/x86:

FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test
FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -fcilkplus execution test

The testcase aborts like this:

Thread 2 received signal SIGABRT, Aborted.
[Switching to Thread 1 (LWP 1)]
0xfe3ba3c5 in __lwp_sigqueue () from /lib/libc.so.1
(gdb) where
#0  0xfe3ba3c5 in __lwp_sigqueue () from /lib/libc.so.1
#1  0xfe3b2d4f in thr_kill () from /lib/libc.so.1
#2  0xfe2f64da in raise () from /lib/libc.so.1
#3  0xfe2c93ee in abort () from /lib/libc.so.1
#4  0xfe525b37 in _Unwind_Resume (exc=0x80a75a0)
at /vol/gcc/src/hg/trunk/local/libgcc/unwind.inc:234
#5  0xfe783b85 in __cilkrts_gcc_rethrow (sf=0xfeffdb00)
at /vol/gcc/src/hg/trunk/local/libcilkrts/runtime/except-gcc.cpp:589
#6  0xfe77f0ea in __cilkrts_rethrow (sf=0xfeffdb00)
at /vol/gcc/src/hg/trunk/local/libcilkrts/runtime/cilk-abi.c:548
#7  0x080513a3 in my_test ()
at /vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc:38
#8  0x080515bd in main ()
at /vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc:62

The gcc_assert in _Unwind_Resume triggers since
_Unwind_RaiseException_Phase2 returned _URC_FATAL_PHASE2_ERROR.  I found
that x86_fallback_frame_state had been invoked for this pc:

   0xfe77218e <__cilkrts_rethrow+30>:   add    $0x10,%esp

and returned _URC_END_OF_STACK, which is totally unexpected since
_Unwind_Find_FDE should have found it.  __cilkrts_rethrow is defined in
libcilkrts/cilk-abi.o, but in the 32-bit case EH info is missing:

32-bit:

ro@fuego 339 > elfdump -u .libs/libcilkrts.so|grep rethrow
   0x18def  0x3558c  __cilkrts_gcc_rethrow
  [0x11fc]  initloc:   0x18def [ sdata4 pcrel ]  __cilkrts_gcc_rethrow

64-bit:

ro@fuego 341 > elfdump -u amd64/libcilkrts/.libs/libcilkrts.so|grep rethrow
   0x1c639  0x2010  __cilkrts_rethrow
   0x2388b  0x5510  __cilkrts_gcc_rethrow
   [0x488]  initloc:   0x1c639 [ sdata4 pcrel ]  __cilkrts_rethrow
  [0x3988]  initloc:   0x2388b [ sdata4 pcrel ]  __cilkrts_gcc_rethrow

I traced this to -funwind-tables being set on 32-bit Linux/x86, while it
is unset on 32-bit Solaris/x86 due to

i386/i386.c (ix86_option_override_internal):

  if (opts->x_flag_asynchronous_unwind_tables == 2)
    opts->x_flag_asynchronous_unwind_tables = !USE_IX86_FRAME_POINTER;

where i386/sol2.h has

#define USE_IX86_FRAME_POINTER 1

while the default is 0.
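
The effect of that default can be modelled with a small standalone sketch.  It assumes that the value 2 means "not set on the command line", as the quoted code suggests; the function name and the boolean parameter are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of the tri-state option: 0 = off, 1 = on,
   2 = not set explicitly.  USE_FRAME_POINTER stands in for the
   target's USE_IX86_FRAME_POINTER setting.  */
static int
resolve_async_unwind_tables (int flag, bool use_frame_pointer)
{
  if (flag == 2)
    flag = !use_frame_pointer;

  return flag;
}
```

With the frame pointer setting at 1, as in i386/sol2.h, an unset option resolves to 0; with the default of 0 it resolves to 1.  An explicit setting is left alone.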

As expected, compiling libcilkrts with -funwind-tables (which is a no-op
on Linux/x86, Linux/x86_64, and Solaris/amd64) makes the failure go
away.

I'm uncertain if this is ok for mainline at this stage or has to wait
for gcc-7.  Once it goes into mainline, it's probably worth a backport
to all active release branches.

Thoughts?

Rainer


2016-04-04  Rainer Orth  

PR target/60290
* Makefile.am (GENERAL_FLAGS): Add -funwind-tables.
* Makefile.in: Regenerate.

# HG changeset patch
# Parent  5ffdfe8da23c38390b9520b92cb76f53dfc3da0b
Compile libcilkrts with -funwind-tables (PR target/60290)

diff --git a/libcilkrts/Makefile.am b/libcilkrts/Makefile.am
--- a/libcilkrts/Makefile.am
+++ b/libcilkrts/Makefile.am
@@ -43,6 +43,9 @@ GENERAL_FLAGS = -I$(top_srcdir)/include 
 # Enable Intel Cilk Plus extension
 GENERAL_FLAGS += -fcilkplus
 
+# Always generate unwind tables
+GENERAL_FLAGS += -funwind-tables
+
 AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99
 AM_CPPFLAGS = $(GENERAL_FLAGS)
 AM_LDFLAGS = $(XLDFLAGS)

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, testsuite/ARM] Skip pr70496.c for cortex-m devices

2016-04-06 Thread Ramana Radhakrishnan
On Wed, Apr 6, 2016 at 12:03 PM, Thomas Preudhomme
 wrote:
> Hi,
>
> Testcase in gcc.target/arm/pr70496.c uses an .arm directive so assumes the
> target has an ARM execution state. This patch adds a dg-skip-if directive to
> skip that test on Cortex-M targets since they don't have such an execution
> state.
>
> ChangeLog entry is as follows:
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2016-04-06  Thomas Preud'homme  
>
> PR testsuite/70553
> * gcc.target/arm/pr70496.c: Skip for ARM Cortex-M targets.
>
>
> diff --git a/gcc/testsuite/gcc.target/arm/pr70496.c
> b/gcc/testsuite/gcc.target/arm/pr70496.c
> index
> 89957e2c7a75cb89153b3e3fc34d8051b6a997d1..548a8243059ddaec63ed897dc67f4751d806a065
> 100644
> --- a/gcc/testsuite/gcc.target/arm/pr70496.c
> +++ b/gcc/testsuite/gcc.target/arm/pr70496.c
> @@ -1,6 +1,7 @@
>  /* { dg-do assemble } */
>  /* { dg-options "-mthumb -O2" } */
>  /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-skip-if "does not have ARM state" { arm_cortex_m } } */
>
>  int i;
>  void
>


Ok.

Looks obvious and sorry about the inadvertent breakage.

Ramana
>
>
>
> Is this ok for trunk?
>
> Best regards,
>
> Thomas


Re: [PATCH] PR70117, ppc long double isinf

2016-04-06 Thread Alan Modra
On Wed, Apr 06, 2016 at 12:27:36PM +0200, Andreas Schwab wrote:
> Alan Modra  writes:
> 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c 
> > b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > new file mode 100644
> > index 000..99e6f19
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
> > @@ -0,0 +1,22 @@
> > +/* { dg-do run { target { { powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* 
> > } || { powerpc*-*-linux* && lp64 } } } } */
> 
> Any reason why it is restricted to lp64?

No, that was me copying from rs6000-ldouble-1.c without thinking.
We've had double-double on powerpc-linux 32-bit for quite a while.

-- 
Alan Modra
Australia Development Lab, IBM


[committed] Avoid -Wuninitialized warnings in certain OpenMP cases (PR middle-end/70550)

2016-04-06 Thread Jakub Jelinek
Hi!

I've committed my counterpart of the recently posted OpenACC
-Wuninitialized patch.
omp-low already makes sure -Wuninitialized doesn't warn for variables used
in a shared clause (explicit or implicit).  This patch extends that to avoid
the warning for mapping of vars in target constructs (if they aren't
addressable, we could warn), and for implicit firstprivate clauses on
task/taskloop and target constructs.
IMHO if somebody uses an explicit firstprivate clause, then the warning is
desirable: if the var is uninitialized, one should use a private clause
instead of firstprivate.
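
A minimal sketch of the kind of code affected, assuming an orphaned task in which a local variable is not listed in any clause and therefore becomes implicitly firstprivate.  This is not one of the committed tests, and the function name is made up.

```c
/* X is deliberately left uninitialized before the task.  It is not
   named in any clause, so inside the task it is implicitly
   firstprivate; the task assigns it before reading it.  */
static int
compute (void)
{
  int x;
  int result = 0;

#pragma omp task shared (result)
  {
    x = 21;
    result = x * 2;
  }
#pragma omp taskwait

  return result;
}
```

Creating the task copies the still uninitialized x, which is the copy the patch stops warning about.  Writing firstprivate (x) explicitly would keep the warning, and private (x) would express the intent better.  Without -fopenmp the pragmas are ignored and the function behaves the same way.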

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2016-04-06  Jakub Jelinek  

PR middle-end/70550
* tree.h (OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT): Define.
* gimplify.c (gimplify_adjust_omp_clauses_1): Set it for implicit
firstprivate clauses.
* omp-low.c (lower_send_clauses): Set TREE_NO_WARNING for
OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT !by_ref vars in task contexts.
(lower_omp_target): Set TREE_NO_WARNING for
non-addressable possibly uninitialized vars which are copied into
addressable temporaries or copied for GOMP_MAP_FIRSTPRIVATE_INT.

* c-c++-common/gomp/pr70550-1.c: New test.
* c-c++-common/gomp/pr70550-2.c: New test.

--- gcc/tree.h.jj   2016-04-05 18:56:59.908992839 +0200
+++ gcc/tree.h  2016-04-06 10:04:46.616725165 +0200
@@ -1430,6 +1430,10 @@ extern void protected_set_expr_location
 #define OMP_CLAUSE_PRIVATE_TASKLOOP_IV(NODE) \
   TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PRIVATE))
 
+/* True on a FIRSTPRIVATE clause if it has been added implicitly.  */
+#define OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT(NODE) \
+  (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_FIRSTPRIVATE)->base.public_flag)
+
 /* True on a LASTPRIVATE clause if a FIRSTPRIVATE clause for the same
decl is present in the chain.  */
 #define OMP_CLAUSE_LASTPRIVATE_FIRSTPRIVATE(NODE) \
--- gcc/gimplify.c.jj   2016-04-05 18:57:00.038991093 +0200
+++ gcc/gimplify.c  2016-04-06 10:04:46.619725124 +0200
@@ -7742,6 +7742,8 @@ gimplify_adjust_omp_clauses_1 (splay_tre
   && (flags & GOVD_WRITTEN) == 0
   && omp_shared_to_firstprivate_optimizable_decl_p (decl))
 OMP_CLAUSE_SHARED_READONLY (clause) = 1;
+  else if (code == OMP_CLAUSE_FIRSTPRIVATE && (flags & GOVD_EXPLICIT) == 0)
+OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT (clause) = 1;
   else if (code == OMP_CLAUSE_MAP && (flags & GOVD_MAP_0LEN_ARRAY) != 0)
 {
   tree nc = build_omp_clause (input_location, OMP_CLAUSE_MAP);
--- gcc/omp-low.c.jj   2016-04-05 18:57:00.092990368 +0200
+++ gcc/omp-low.c   2016-04-06 10:23:57.344978709 +0200
@@ -6107,8 +6107,15 @@ lower_send_clauses (tree clauses, gimple
 
   switch (OMP_CLAUSE_CODE (c))
{
-   case OMP_CLAUSE_PRIVATE:
case OMP_CLAUSE_FIRSTPRIVATE:
+ if (OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT (c)
+ && !by_ref
+ && is_task_ctx (ctx))
+   TREE_NO_WARNING (var) = 1;
+ do_in = true;
+ break;
+
+   case OMP_CLAUSE_PRIVATE:
case OMP_CLAUSE_COPYIN:
case OMP_CLAUSE__LOOPTEMP_:
  do_in = true;
@@ -16083,7 +16090,16 @@ lower_omp_target (gimple_stmt_iterator *
|| map_kind == GOMP_MAP_POINTER
|| map_kind == GOMP_MAP_TO_PSET
|| map_kind == GOMP_MAP_FORCE_DEVICEPTR)
- gimplify_assign (avar, var, &ilist);
+ {
+   /* If we need to initialize a temporary
+  with VAR because it is not addressable, and
+  the variable hasn't been initialized yet, then
+  we'll get a warning for the store to avar.
+  Don't warn in that case, the mapping might
+  be implicit.  */
+   TREE_NO_WARNING (var) = 1;
+   gimplify_assign (avar, var, &ilist);
+ }
avar = build_fold_addr_expr (avar);
gimplify_assign (x, avar, &ilist);
if ((GOMP_MAP_COPY_FROM_P (map_kind)
@@ -16252,6 +16268,8 @@ lower_omp_target (gimple_stmt_iterator *
tree t = var;
if (is_reference (var))
  t = build_simple_mem_ref (var);
+   else if (OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT (c))
+ TREE_NO_WARNING (var) = 1;
if (TREE_CODE (type) != POINTER_TYPE)
  t = fold_convert (pointer_sized_int_node, t);
t = fold_convert (TREE_TYPE (x), t);
@@ -16263,6 +16281,8 @@ lower_omp_target (gimple_stmt_iterator *
  {
tree avar = create_tmp_var (TREE_TYPE (var));
mark_addressable (avar);
+   if (OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT (c))
+ TREE_NO_WARNING (var) = 1;

[committed] OpenMP declare simd ABI changes on x86_64/i686

2016-04-06 Thread Jakub Jelinek
Hi!

As we already have some declare simd ABI changes in GCC 6 (mangling of
the various linear clause kinds), and as glibc uses the AVX512F stuff,
I've decided to commit the following ABI changes right now rather than
waiting for GCC 7.
The changes are:
1) for AVX2, functions with {{un,}signed ,}char characteristic type without
   simdlen clause are now using simdlen of 32 rather than 16
2) there are new AVX512F entrypoints, with mangling character e, that
   pass/return values in __attribute__((vector_size (64))) vectors
   (both integer and floating point)
3) the fuzzy part is what to do with functions with explicit rather than
   implicit simdlen clause if it is > 16; we certainly need some upper
   bound, the spec doesn't say anything, and I don't have recent enough
   ICC to check what is used right now

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
I'll try to coordinate with Intel about 3) as well as the default alignment
if aligned clause is used on declare simd without any explicit alignment.
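
The arithmetic behind 1) and 2) can be sketched as follows.  The helpers are hypothetical and only model what is described here: the default simdlen is the vector width divided by the width of the characteristic type, and an explicit simdlen must be a power of two between 2 and 128.  The additional per-ISA limits for explicit values above 16 are not modelled.

```c
#include <stdbool.h>

/* Default number of lanes when no simdlen clause is given: vector
   width divided by the width of the characteristic type, in bits.  */
static unsigned
default_simdlen (unsigned vecsize_bits, unsigned elem_bits)
{
  return vecsize_bits / elem_bits;
}

/* General range check for an explicit simdlen: a power of two
   between 2 and 128 inclusive.  */
static bool
explicit_simdlen_ok (unsigned simdlen)
{
  return simdlen >= 2
         && simdlen <= 128
         && (simdlen & (simdlen - 1)) == 0;
}
```

For example, a char characteristic type with 256-bit AVX2 vectors gives 32 lanes, and an int characteristic type with 512-bit AVX512F vectors gives 16.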

2016-04-06  Jakub Jelinek  

* config/i386/i386.c (ix86_simd_clone_compute_vecsize_and_simdlen):
Add support for AVX512F clones, include them by default for
exported OpenMP declare simd functions.  For AVX2 allow simdlen 32
and use it if characteristic type is 8-bit, for AVX512F allow simdlen
up to 128.

* lib/target-supports.exp (check_effective_target_vect_simd_clones):
Check for avx512f effective targets instead of avx2.
* gcc.dg/gomp/declare-simd-1.c: Add scan-assembler-times directives
for AVX512F clones.
* gcc.dg/gomp/declare-simd-3.c: Likewise.
* g++.dg/gomp/declare-simd-1.C: Likewise.
* g++.dg/gomp/declare-simd-3.C: Likewise.
* g++.dg/gomp/declare-simd-4.C: Likewise.

--- gcc/config/i386/i386.c.jj   2016-04-01 17:21:31.0 +0200
+++ gcc/config/i386/i386.c  2016-04-06 12:05:30.381913719 +0200
@@ -53761,7 +53761,7 @@ ix86_simd_clone_compute_vecsize_and_simd
 
   if (clonei->simdlen
   && (clonei->simdlen < 2
- || clonei->simdlen > 16
+ || clonei->simdlen > 128
  || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
 {
   warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
@@ -53819,7 +53819,9 @@ ix86_simd_clone_compute_vecsize_and_simd
 {
   /* If the function isn't exported, we can pick up just one ISA
 for the clones.  */
-  if (TARGET_AVX2)
+  if (TARGET_AVX512F)
+   clonei->vecsize_mangle = 'e';
+  else if (TARGET_AVX2)
clonei->vecsize_mangle = 'd';
   else if (TARGET_AVX)
clonei->vecsize_mangle = 'c';
@@ -53829,8 +53831,8 @@ ix86_simd_clone_compute_vecsize_and_simd
 }
   else
 {
-  clonei->vecsize_mangle = "bcd"[num];
-  ret = 3;
+  clonei->vecsize_mangle = "bcde"[num];
+  ret = 4;
 }
   switch (clonei->vecsize_mangle)
 {
@@ -53846,6 +53848,10 @@ ix86_simd_clone_compute_vecsize_and_simd
   clonei->vecsize_int = 256;
   clonei->vecsize_float = 256;
   break;
+case 'e':
+  clonei->vecsize_int = 512;
+  clonei->vecsize_float = 512;
+  break;
 }
   if (clonei->simdlen == 0)
 {
@@ -53854,9 +53860,24 @@ ix86_simd_clone_compute_vecsize_and_simd
   else
clonei->simdlen = clonei->vecsize_float;
   clonei->simdlen /= GET_MODE_BITSIZE (TYPE_MODE (base_type));
-  if (clonei->simdlen > 16)
-   clonei->simdlen = 16;
 }
+  else if (clonei->simdlen > 16)
+switch (clonei->vecsize_int)
+  {
+  case 512:
+   /* For AVX512-F, support VLEN up to 128.  */
+   break;
+  case 256:
+   /* For AVX2, support VLEN up to 32.  */
+   if (clonei->simdlen <= 32)
+ break;
+   /* FALLTHRU */
+  default:
+   /* Otherwise, support VLEN up to 16.  */
+   warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+   "unsupported simdlen %d", clonei->simdlen);
+   return 0;
+  }
   return ret;
 }
 
@@ -53881,6 +53902,10 @@ ix86_simd_clone_adjust (struct cgraph_no
   if (!TARGET_AVX2)
str = "avx2";
   break;
+case 'e':
+  if (!TARGET_AVX512F)
+   str = "avx512f";
+  break;
 default:
   gcc_unreachable ();
 }
@@ -53920,6 +53945,10 @@ ix86_simd_clone_usable (struct cgraph_no
   if (!TARGET_AVX2)
return -1;
   return 0;
+case 'e':
+  if (!TARGET_AVX512F)
+   return -1;
+  return 0;
 default:
   gcc_unreachable ();
 }
--- gcc/testsuite/lib/target-supports.exp.jj   2016-03-23 19:25:53.0 +0100
+++ gcc/testsuite/lib/target-supports.exp   2016-04-06 11:19:06.398694815 +0200
@@ -2603,7 +2603,7 @@ proc check_effective_target_vect_simd_cl
# avx2 clone.  Only the right clone for the specified arch will be
# chosen, but still we need to at least be able to assemble
# avx2.
-   if { [check_effective_target_avx2] } {
+ 

Re: Add some C++17 items to gcc-6/changes.html

2016-04-06 Thread Jason Merrill

On 04/04/2016 06:22 AM, Jonathan Wakely wrote:

I plan to commit this to wwwdocs CVS. Have I missed anything that
should be listed?


I'd mention that TM requires the -fgnu-tm flag.

Jason




Re: openacc reference reductions

2016-04-06 Thread Jakub Jelinek
On Tue, Apr 05, 2016 at 06:53:47PM -0700, Cesar Philippidis wrote:
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -309,6 +309,25 @@ is_oacc_kernels (omp_context *ctx)
> == GF_OMP_TARGET_KIND_OACC_KERNELS));
>  }
>  
> +/* Return true if CTX corresponds to an oacc parallel region and if
> +   VAR is used in a reduction.  */
> +
> +static bool
> +is_oacc_parallel_reduction (tree var, omp_context *ctx)
> +{
> +  if (!is_oacc_parallel (ctx))
> +return false;
> +
> +  tree clauses = gimple_omp_target_clauses (ctx->stmt);
> +
> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
> + && OMP_CLAUSE_DECL (c) == var)
> +  return true;
> +
> +  return false;
> +}
> +
>  /* If DECL is the artificial dummy VAR_DECL created for non-static
> data member privatization, return the underlying "this" parameter,
> otherwise return NULL.  */
> @@ -2122,7 +2141,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
> else
>   install_var_field (decl, true, 3, ctx,
>  base_pointers_restrict);
> -   if (is_gimple_omp_offloaded (ctx->stmt))
> +   if (is_gimple_omp_offloaded (ctx->stmt)
> +   && !is_oacc_parallel_reduction (decl, ctx))
>   install_var_local (decl, ctx);
>   }
>   }

The above is O(n^2) in the number of clauses on the construct.
Perhaps better define some OMP_CLAUSE_MAP_IN_REDUCTION macro (e.g. the
TREE_PRIVATE bit is unused on OMP_CLAUSE_MAP right now), make sure to set it
(e.g. during gimplification, where you can see all GOVD_* flags for a
particular decl), and then use this flag here?
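
The difference between the two approaches can be sketched on a simplified clause list.  Everything here is hypothetical: the struct is not GCC's tree representation, decls are modelled as small integers below MAX_DECLS, and the in_reduction field merely plays the role of the proposed flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DECLS 64

enum clause_kind { CLAUSE_MAP, CLAUSE_REDUCTION };

/* Simplified clause node; DECL must be in [0, MAX_DECLS).  */
struct clause
{
  enum clause_kind kind;
  int decl;
  bool in_reduction;
  struct clause *next;
};

/* Linear scan per query, as in the quoted patch: O(n) per call,
   hence O(n^2) when called once for every clause.  */
static bool
decl_in_reduction_scan (const struct clause *clauses, int decl)
{
  for (const struct clause *c = clauses; c; c = c->next)
    if (c->kind == CLAUSE_REDUCTION && c->decl == decl)
      return true;

  return false;
}

/* Flag-based alternative: two linear passes record which decls appear
   in a reduction and mark the matching map clauses, after which each
   query is a single field read.  */
static void
mark_map_clauses (struct clause *clauses)
{
  bool reduced[MAX_DECLS] = { false };

  for (struct clause *c = clauses; c; c = c->next)
    if (c->kind == CLAUSE_REDUCTION)
      reduced[c->decl] = true;

  for (struct clause *c = clauses; c; c = c->next)
    if (c->kind == CLAUSE_MAP)
      c->in_reduction = reduced[c->decl];
}
```

In the compiler the marking would happen where per-decl information is already at hand, which is what the suggestion about gimplification refers to; the local array here is only a stand-in for that.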


Jakub


Re: [PATCH 2/2] Fix C++ side of PR c/70436 (missing -Wparentheses warnings)

2016-04-06 Thread Jason Merrill

OK.

Jason


Re: [PATCH, cpp] Fix pr61817 and 69391

2016-04-06 Thread Jakub Jelinek
On Tue, Apr 05, 2016 at 09:22:37AM -0700, Richard Henderson wrote:
> These two related PRs are all about remembering where a macro is expanded.
> Worse, we've got two competing goals -- the real location of the expansion,
> for __LINE__, and the virtual location of the expansion, for diagnostics.
> 
> There seems to be no way to unify the two competing goals.  If we simply
> "fix" the first, we break the second.  Therefore, I resort to passing down
> both locations.
> 
> Ok?
> 
> 
> r~


Missing PR numbers here.

>   * internal.h (_cpp_builtin_macro_text): Update decl.
>   * macro.c (_cpp_builtin_macro_text): Accept location for __LINE__.
>   (builtin_macro): Accept a second location for __LINE__.
>   (enter_macro_context): Compute both virtual and real expansion
>   locations for the macro.
> 
>   * gcc.dg/pr61817.c: New test.
>   * gcc.dg/pr69391-1.c: New test.
>   * gcc.dg/pr69391-2.c: New test.
> 
> 
> diff --git a/gcc/testsuite/gcc.dg/pr61817.c b/gcc/testsuite/gcc.dg/pr61817.c
> new file mode 100644
> index 000..4230485
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr61817.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=c11 -ftrack-macro-expansion=0" } */

Wouldn't it make sense to provide this test also as -1.c and -2.c, one
with -ftrack-macro-expansion=0 and one with -ftrack-macro-expansion=1?

Otherwise LGTM.
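
As a small illustration of why the expansion location matters for __LINE__ (this is not the committed test, and the macro and function names are made up): when a macro invocation fits on one line the answer is unambiguous, but when it spans several lines the value depends on which location the preprocessor picks.

```c
#define WHERE() __LINE__
#define IDENTITY(x) x

/* Renumber the following lines so the values are easy to reason about;
   the line after the directive is line 100.  */
#line 100
static int single_line (void) { return WHERE (); }
static int
multi_line (void)
{
  return IDENTITY (
    __LINE__
  );
}
```

single_line returns 100.  multi_line returns some line of the invocation, between the line with the macro name and the line with the closing parenthesis; which one is exactly the kind of choice the patch is about.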

Jakub


Re: [C++ PATCH] PR 70501, ICE in verify ctor sanity

2016-04-06 Thread Jason Merrill

On 04/05/2016 05:21 PM, Nathan Sidwell wrote:

On 04/05/16 12:40, Jason Merrill wrote:


It's not clear to me that we really need a TARGET_EXPR for vector values.  Since
one element of a vector can't refer to another, we don't need the ctx->ctor
handling.  Perhaps we should handle vectors like we do PMF types in
cxx_eval_bare_aggregate?


That may be abstractly better, but we do currently wrap constructors in
target_exprs for vector compound_literals (which is what I was
following).  See the get_target_expr_sfinae  calls in
finish_compound_literal for instance.  That happens for  the '(v4si){(0,
0)}' subexpression of the testcase.


Sure, but that also seems unnecessary; vector rvalues don't have object 
identity the way class and array rvalues do.


Jason



[Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Yvan Roux
Hi,

we are facing a deadlock situation when doing native validation
on an armv8l target.

When gcc/testsuite/gcc.dg/guality/example.c is executed it spawns gdb
and makes it attach to its parent, but during the test execution gdb
receives a SIGSEGV, which is handled as a stop signal.  The test then
times out and dejagnu tries to terminate it but doesn't manage to.
This was raised and discussed on the dejagnu list:

https://lists.gnu.org/archive/html/dejagnu/2016-03/msg00048.html

The dejagnu cleanup mechanism needs to be enhanced, but I think it
would also be better if guality tests didn't get stuck and/or could be
killed easily.  This patch changes GDB signal handling to nostop for
SIGSEGV, SIGINT, SIGTERM and SIGBUS.  I am not sure whether we need to
extend the list of signals to all the stop ones (which are not used
by GDB) or to restrict it to just SIGSEGV.

Tested without regression on native x86_64, i386 and aarch64 targets; it
also unblocks the native armv8l one.  Is it OK for trunk?  (I don't know the
rules for this kind of testsuite fix during stage 4.)

Cheers,
Yvan

2016-04-06  Yvan Roux  
Pedro Alves  

* gcc.dg/guality/guality.h (main): Avoid GDB being blocked on signals.
diff --git a/gcc/testsuite/gcc.dg/guality/guality.h 
b/gcc/testsuite/gcc.dg/guality/guality.h
index 52fa706..d5867d8 100644
--- a/gcc/testsuite/gcc.dg/guality/guality.h
+++ b/gcc/testsuite/gcc.dg/guality/guality.h
@@ -252,6 +252,10 @@ main (int argc, char *argv[])
   if (!guality_gdb_input
  || fprintf (guality_gdb_input, "\
 set height 0\n\
+handle SIGINT pass nostop\n\
+handle SIGTERM pass nostop\n\
+handle SIGSEGV pass nostop\n\
+handle SIGBUS pass nostop\n\
 attach %i\n\
 set guality_attached = 1\n\
 b %i\n\
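
Outside the test harness, the command preamble sent to GDB can be sketched as a standalone helper.  The function name and the buffer-based interface are assumptions made for the illustration (guality.h writes to a pipe with fprintf), and the guality-specific "set guality_attached" line is left out.

```c
#include <stdio.h>

/* Format a GDB command preamble into BUF.  PID is the process to
   attach to and LINE the breakpoint line; both are placeholders.
   Returns what snprintf returns.  */
static int
format_gdb_preamble (char *buf, size_t size, int pid, int line)
{
  return snprintf (buf, size,
                   "set height 0\n"
                   "handle SIGINT pass nostop\n"
                   "handle SIGTERM pass nostop\n"
                   "handle SIGSEGV pass nostop\n"
                   "handle SIGBUS pass nostop\n"
                   "attach %i\n"
                   "b %i\n",
                   pid, line);
}
```

The handle lines come before the attach, so the settings are in place by the time GDB controls the process.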


Re: [Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Jakub Jelinek
On Wed, Apr 06, 2016 at 04:53:47PM +0200, Yvan Roux wrote:
> 2016-04-06  Yvan Roux  
> Pedro Alves  
> 
> * gcc.dg/guality/guality.h (main): Avoid GDB being blocked on signals.

Ok.

> diff --git a/gcc/testsuite/gcc.dg/guality/guality.h 
> b/gcc/testsuite/gcc.dg/guality/guality.h
> index 52fa706..d5867d8 100644
> --- a/gcc/testsuite/gcc.dg/guality/guality.h
> +++ b/gcc/testsuite/gcc.dg/guality/guality.h
> @@ -252,6 +252,10 @@ main (int argc, char *argv[])
>if (!guality_gdb_input
> || fprintf (guality_gdb_input, "\
>  set height 0\n\
> +handle SIGINT pass nostop\n\
> +handle SIGTERM pass nostop\n\
> +handle SIGSEGV pass nostop\n\
> +handle SIGBUS pass nostop\n\
>  attach %i\n\
>  set guality_attached = 1\n\
>  b %i\n\


Jakub


Re: [C++ PATCH] PR 70501, ICE in verify ctor sanity

2016-04-06 Thread Nathan Sidwell

On 04/06/16 07:49, Jason Merrill wrote:

On 04/05/2016 05:21 PM, Nathan Sidwell wrote:

On 04/05/16 12:40, Jason Merrill wrote:


It's not clear to me that we really need a TARGET_EXPR for vector values.  Since
one element of a vector can't refer to another, we don't need the ctx->ctor
handling.  Perhaps we should handle vectors like we do PMF types in
cxx_eval_bare_aggregate?


That may be abstractly better, but we do currently wrap constructors in
target_exprs for vector compound_literals (which is what I was
following).  See the get_target_expr_sfinae  calls in
finish_compound_literal for instance.  That happens for  the '(v4si){(0,
0)}' subexpression of the testcase.


Sure, but that also seems unnecessary; vector rvalues don't have object identity
the way class and array rvalues do.


I'll investigate further.  At least we have a fallback now.

nathan


Re: [patch] Remove superfluous /dev/null on grep line

2016-04-06 Thread Eric Botcazou
> OK, I have no objection to the original patch then.

Thanks, applied.  FWIW I verified that the library still builds after the 
change with an empty port_specific_symbol_files variable.

-- 
Eric Botcazou


Re: [Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Pedro Alves
On 04/06/2016 03:53 PM, Yvan Roux wrote:
> Dejagnu cleanup mechanism needs to be enhanced, but I think that it
> would also be better if guality tests don't get stuck and/or can be
> killed easily.  This patch changes GDB signals handling to nostop for
> SIGSEGV, SIGINT, SIGTERM and SIGBUS.  I am not sure if we need to
> increase the list of signals to all the stop ones (which are not used
> by GDB) or to restrict it just to SIGSEGV.

I'd suggest:

 handle all pass nostop
 handle SIGINT pass nostop

That would make gdb pass _all_ signals except SIGTRAP

Thanks,
Pedro Alves



[PATCH, i386] Add some missing modes to mode attributes

2016-04-06 Thread Uros Bizjak
... so there are no unresolved attributes in insn-output.c.

2016-04-06  Uros Bizjak  

* config/i386/sse.md (shuffletype): Add V32HI and V4TI modes.
(ssescalarsize): Add V8SF, V4SF, V4DF and V2DF modes.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
Committed to mainline SVN and gcc-5 branch.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 55f1ae7..5132955 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -493,8 +493,9 @@
   [(V16SF "f") (V16SI "i") (V8DF "f") (V8DI "i")
   (V8SF "f") (V8SI "i") (V4DF "f") (V4DI "i")
   (V4SF "f") (V4SI "i") (V2DF "f") (V2DI "i")
-  (V32QI "i") (V16HI "i") (V16QI "i") (V8HI "i")
-  (V64QI "i") (V1TI "i") (V2TI "i")])
+  (V32HI "i") (V16HI "i") (V8HI "i")
+  (V64QI "i") (V32QI "i") (V16QI "i")
+  (V4TI "i") (V2TI "i") (V1TI "i")])
 
 (define_mode_attr ssequartermode
   [(V16SF "V4SF") (V8DF "V2DF") (V16SI "V4SI") (V8DI "V2DI")])
@@ -733,7 +734,8 @@
(V64QI "8") (V32QI "8") (V16QI "8")
(V32HI "16") (V16HI "16") (V8HI "16")
(V16SI "32") (V8SI "32") (V4SI "32")
-   (V16SF "32") (V8DF "64")])
+   (V16SF "32") (V8SF "32") (V4SF "32")
+   (V8DF "64") (V4DF "64") (V2DF "64")])
 
 ;; SSE prefix for integer vector modes
 (define_mode_attr sseintprefix


Re: [Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Yvan Roux
On 6 April 2016 at 17:09, Pedro Alves  wrote:
> On 04/06/2016 03:53 PM, Yvan Roux wrote:
>> Dejagnu cleanup mechanism needs to be enhanced, but I think that it
>> would also be better if guality tests don't get stuck and/or can be
>> killed easily.  This patch changes GDB signals handling to nostop for
>> SIGSEGV, SIGINT, SIGTERM and SIGBUS.  I am not sure if we need to
>> increase the list of signals to all the stop ones (which are not used
>> by GDB) or to restrict it just to SIGSEGV.
>
> I'd suggest:
>
>  handle all pass nostop
>  handle SIGINT pass nostop
>
> That would make gdb pass _all_ signals except SIGTRAP

I've committed it already :/

I can make the change, but aren't there cases where SIGILL is used for
breakpoints in GDB?  (I think I've seen that somewhere.)


Yvan


Re: [Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Pedro Alves
On 04/06/2016 04:13 PM, Yvan Roux wrote:
> On 6 April 2016 at 17:09, Pedro Alves  wrote:
>> On 04/06/2016 03:53 PM, Yvan Roux wrote:
>>> Dejagnu cleanup mechanism needs to be enhanced, but I think that it
>>> would also be better if guality tests don't get stuck and/or can be
>>> killed easily.  This patch changes GDB signals handling to nostop for
>>> SIGSEGV, SIGINT, SIGTERM and SIGBUS.  I am not sure if we need to
>>> increase the list of signals to all the stop ones (which are not used
>>> by GDB) or to restrict it just to SIGSEGV.
>>
>> I'd suggest:
>>
>>  handle all pass nostop
>>  handle SIGINT pass nostop
>>
>> That would make gdb pass _all_ signals except SIGTRAP
> 
> I've committed it already :/
> 
> I can make the change, but aren't there cases where SIGILL is used for
> breakpoints in GDB?  (I think I've seen that somewhere.)

True, and SIGSEGV and SIGEMT too.  But GDB handles that transparently
and won't pass such a breakpoint signal to the program, even with
"handle pass".  Only "handle SIGTRAP pass" passes a
breakpoint/step/etc. trap to the program.

Thanks,
Pedro Alves



[testsuite] Skip gcc.c-torture/execute/20101011-1.c on Visium

2016-04-06 Thread Eric Botcazou
I don't understand how this one slipped through the cracks last year, but the 
test needs to be disabled on Visium as on a bunch of other architectures.

Tested on visium-elf, applied on the mainline and 5 branch.


2016-04-06  Eric Botcazou  

* gcc.c-torture/execute/20101011-1.c (__VISIUM__): Set DO_TEST to 0.

-- 
Eric Botcazou
diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index e7157c5..744763f 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -30,6 +30,9 @@
 #elif defined (__TMS320C6X__)
   /* On TI C6X division by zero does not trap.  */
 # define DO_TEST 0
+#elif defined (__VISIUM__)
+  /* On Visium division by zero does not trap.  */
+# define DO_TEST 0
 #elif defined (__mips__) && !defined(__linux__)
   /* MIPS divisions do trap by default, but libgloss targets do not
  intercept the trap and raise a SIGFPE.  The same is probably


Re: [Patch] Avoid deadlock in guality tests.

2016-04-06 Thread Yvan Roux
On 6 April 2016 at 17:24, Pedro Alves  wrote:
> On 04/06/2016 04:13 PM, Yvan Roux wrote:
>> On 6 April 2016 at 17:09, Pedro Alves  wrote:
>>> On 04/06/2016 03:53 PM, Yvan Roux wrote:
>>>> Dejagnu cleanup mechanism needs to be enhanced, but I think that it
>>>> would also be better if guality tests don't get stuck and/or can be
>>>> killed easily.  This patch changes GDB signals handling to nostop for
>>>> SIGSEGV, SIGINT, SIGTERM and SIGBUS.  I am not sure if we need to
>>>> increase the list of signals to all the stop ones (which are not used
>>>> by GDB) or to restrict it just to SIGSEGV.
>>>
>>> I'd suggest:
>>>
>>>  handle all pass nostop
>>>  handle SIGINT pass nostop
>>>
>>> That would make gdb pass _all_ signals except SIGTRAP
>>
>> I've committed it already :/
>>
>> I can make the change, but aren't there cases where SIGILL is used for
>> breakpoints in GDB?  (I think I've seen that somewhere.)
>
> True, and SIGSEGV and SIGEMT too.  But GDB handles that transparently
> and won't pass such a breakpoint signal to the program, even with
> "handle pass".  Only "handle SIGTRAP pass" passes a
> breakpoint/step/etc. trap to the program.

Ah ok, thanks for the explanations Pedro, I'll prepare a new patch and
validate it.

Cheers,
Yvan


[PATCH] PR47040 - Make error message for empty array constructor more helpful/correct

2016-04-06 Thread Dominique d'Humières
Is the following patch OK (regtested on x86_64-apple-darwin15)? Should it be
backported to the gcc-5 branch?

TIA

Dominique

Index: gcc/fortran/ChangeLog
===
--- gcc/fortran/ChangeLog   (revision 234788)
+++ gcc/fortran/ChangeLog   (working copy)
@@ -1,3 +1,10 @@
+2016-04-06  Tobias Burnus  
+   Dominique d'Humieres 
+
+   PR fortran/47040
+   * array.c (gfc_match_array_constructor): add "without type-spec"
+   to the error message.
+
 2016-04-04  Andre Vehreschild  
 
PR fortran/67538
Index: gcc/fortran/array.c
===
--- gcc/fortran/array.c (revision 234788)
+++ gcc/fortran/array.c (working copy)
@@ -1136,7 +1136,8 @@
goto done;
   else
{
- gfc_error ("Empty array constructor at %C is not allowed");
+ gfc_error ("Empty array constructor without type-spec at %C "
+"is not allowed");
  goto cleanup;
}
 }
Index: gcc/testsuite/ChangeLog
===
--- gcc/testsuite/ChangeLog (revision 234788)
+++ gcc/testsuite/ChangeLog (working copy)
@@ -1,3 +1,7 @@
+2016-04-06  Dominique d'Humieres  
+
+   * gfortran.dg/empty_constructor.f90: New test.
+
 2016-04-06  Eric Botcazou  
 
* gcc.c-torture/execute/20101011-1.c (__VISIUM__): Set DO_TEST to 0.
Index: gcc/testsuite/gfortran.dg/empty_constructor.f90
===
--- gcc/testsuite/gfortran.dg/empty_constructor.f90 (nonexistent)
+++ gcc/testsuite/gfortran.dg/empty_constructor.f90 (working copy)
@@ -0,0 +1,17 @@
+! { dg-do compile }
+! PR 47040
+! Contributed by Tobias Burnus 
+!
+program test_print 
+  implicit none
+  integer :: i 
+  call print( [ ] ) ! { dg-error "Empty array constructor without type-spec" }
+  call print( [integer :: ] )
+  call print( pack( [ 1 ], [ 2 ] == 3 ) )
+  call print( [ ( i, i = 1, 0 ) ] )
+contains 
+  subroutine print( array ) 
+integer, dimension(:) :: array 
+write(*,*) size(array) 
+  end subroutine print 
+end program test_print 



Re: [PATCH, i386] Add some missing modes to mode attributes

2016-04-06 Thread H.J. Lu
On Wed, Apr 6, 2016 at 8:11 AM, Uros Bizjak  wrote:
> ... so there are no unresolved attributes in insn-output.c.
>
> 2016-04-06  Uros Bizjak  
>
> * config/i386/sse.md (shuffletype): Add V32HI and V4TI modes.
> (ssescalarsize): Add V8SF, V4SF, V4DF and V2DF modes.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> Committed to mainline SVN and gcc-5 branch.
>

Could you also add V4TI, V2TI and V1TI to ssescalarsize?

Thanks.

-- 
H.J.


C++ PATCH for implicit abi_tag on template member function

2016-04-06 Thread Jason Merrill
We were incorrectly omitting the ABI tag on the instantiation of this 
member function because we were setting the tags on the instantiation 
and looking for them on the temploid.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 32a5b18940f37dd97c2735f24a8353d1db457d86
Author: Jason Merrill 
Date:   Wed Apr 6 07:45:02 2016 -0400

	* class.c (check_abi_tags): Fix function template handling.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 937e41f..02a992f 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1604,6 +1604,15 @@ check_abi_tags (tree t, tree subob)
 void
 check_abi_tags (tree decl)
 {
+  tree t;
+  if (abi_version_at_least (10)
+  && DECL_LANG_SPECIFIC (decl)
+  && DECL_USE_TEMPLATE (decl)
+  && (t = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl)),
+	  t != decl))
+/* Make sure that our template has the appropriate tags, since
+   write_unqualified_name looks for them there.  */
+check_abi_tags (t);
   if (VAR_P (decl))
 check_abi_tags (decl, TREE_TYPE (decl));
   else if (TREE_CODE (decl) == FUNCTION_DECL
diff --git a/gcc/testsuite/g++.dg/abi/abi-tag19.C b/gcc/testsuite/g++.dg/abi/abi-tag19.C
new file mode 100644
index 000..e21d7b1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/abi-tag19.C
@@ -0,0 +1,4 @@
+struct __attribute__((abi_tag("a"))) X { };
+template <class T> struct Y { X f() { return X(); } };
+template struct Y<int>;
+// { dg-final { scan-assembler "_ZN1YIiE1fB1aEv" } }


[PATCH] Avoid needless unsharing during constexpr evaluation (PR c++/70452)

2016-04-06 Thread Patrick Palka
During constexpr evaluation we unconditionally call unshare_expr in a
bunch of places to ensure that CONSTRUCTORs (and their CONSTRUCTOR_ELTS)
don't get shared.  But as far as I can tell, we don't have any reason to
call unshare_expr on non-CONSTRUCTORs, and a CONSTRUCTOR will never be
an operand of a non-CONSTRUCTOR tree.  Assuming these two things are
true, then I think we can safely restrict the calls to unshare_expr to
only CONSTRUCTOR trees. Doing so saves 50MB of peak memory usage in the
test case in the PR (bringing memory usage down to 4.9 levels).

This patch takes this idea a bit further and implements a custom
unshare_constructor procedure that recursively unshares just
CONSTRUCTORs and their CONSTRUCTOR elements.  This is in contrast to
unshare_expr which unshares even non-CONSTRUCTOR elements of a
CONSTRUCTOR.  unshare_constructor also has an assert which verifies that
there really is no CONSTRUCTOR subexpression inside a non-CONSTRUCTOR
tree.  So far I haven't been able to get this assert to trigger which
makes me reasonably confident about this optimization.
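
To make the idea in the two paragraphs above concrete, here is a minimal toy sketch. It is not GCC code: Node, Kind and unshare_constructor_sketch are hypothetical stand-ins for tree nodes, tree codes and the proposed unshare_constructor, and the sketch leaves out the walk_tree assert from the patch. It only shows the copying policy: CONSTRUCTOR-like nodes are copied recursively, everything else stays shared.

```cpp
// Toy model only: Node, Kind and unshare_constructor_sketch are hypothetical
// stand-ins and do not use GCC's tree API.
#include <cassert>
#include <memory>
#include <vector>

enum class Kind { Constructor, Other };

struct Node {
  Kind kind;
  int payload;                              // stands in for non-CONSTRUCTOR data
  std::vector<std::shared_ptr<Node>> elts;  // stands in for CONSTRUCTOR_ELTS
};

using NodePtr = std::shared_ptr<Node>;

// Return a fresh copy of T if it is a Constructor, recursing into its
// elements so that nested Constructors are copied as well.  Any other
// node is returned as-is, i.e. it remains shared with the original.
NodePtr unshare_constructor_sketch(const NodePtr &t) {
  assert(t && "null node");
  if (t->kind != Kind::Constructor)
    return t;
  // New node with its own element vector; elements still point at the
  // original children until the loop below replaces nested Constructors.
  auto r = std::make_shared<Node>(*t);
  for (auto &elt : r->elts)
    elt = unshare_constructor_sketch(elt);
  return r;
}
```

In this model a Constructor holding another Constructor and a leaf yields two new Constructor nodes, while the leaf is the same object in both trees, which is where the memory saving over copying everything would come from.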

Does this look OK to commit after bootstrap + regtesting?

gcc/cp/ChangeLog:

PR c++/70452
* constexpr.c (not_a_constructor): New function.
(unshare_constructor): New function.
(cxx_eval_call_expression): Use unshare_constructor instead of
unshare_expr.
(find_array_ctor_elt): Likewise.
(cxx_eval_vec_init_1): Likewise.
(cxx_eval_store_expression): Likewise.
(cxx_eval_constant_expression): Likewise.
---
 gcc/cp/constexpr.c | 55 ++
 1 file changed, 47 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 1c2701b..5d33a11 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1151,6 +1151,45 @@ adjust_temp_type (tree type, tree temp)
   return cp_fold_convert (type, temp);
 }
 
+/* Callback for walk_tree used by unshare_constructor.  */
+
+static tree
+not_a_constructor (tree *tp, int *walk_subtrees, void *)
+{
+  if (TYPE_P (*tp))
+*walk_subtrees = 0;
+  gcc_assert (TREE_CODE (*tp) != CONSTRUCTOR);
+  return NULL_TREE;
+}
+
+/* If T is a CONSTRUCTOR, return an unshared copy of T.  Otherwise
+   return T.  */
+
+static tree
+unshare_constructor (tree t)
+{
+  if (TREE_CODE (t) == CONSTRUCTOR)
+{
+  tree r;
+
+  r = copy_node (t);
+  CONSTRUCTOR_ELTS (r) = vec_safe_copy (CONSTRUCTOR_ELTS (t));
+
+  /* Unshare any of its elements that also happen to be CONSTRUCTORs.  */
+  for (unsigned idx = 0;
+  idx < vec_safe_length (CONSTRUCTOR_ELTS (r)); idx++)
+   CONSTRUCTOR_ELT (r, idx)->value
+ = unshare_constructor (CONSTRUCTOR_ELT (r, idx)->value);
+
+  return r;
+}
+
+  /* If T is not itself a CONSTRUCTOR then we don't expect it to contain
+ any CONSTRUCTOR subexpressions.  */
+  walk_tree_without_duplicates (&t, not_a_constructor, NULL);
+  return t;
+}
+
 /* Subroutine of cxx_eval_call_expression.
We are processing a call expression (either CALL_EXPR or
AGGR_INIT_EXPR) in the context of CTX.  Evaluate
@@ -1454,7 +1493,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  tree arg = TREE_VALUE (bound);
  gcc_assert (DECL_NAME (remapped) == DECL_NAME (oparm));
  /* Don't share a CONSTRUCTOR that might be changed.  */
- arg = unshare_expr (arg);
+ arg = unshare_constructor (arg);
  ctx->values->put (remapped, arg);
  bound = TREE_CHAIN (bound);
  remapped = DECL_CHAIN (remapped);
@@ -1534,7 +1573,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 }
 
   pop_cx_call_context ();
-  return unshare_expr (result);
+  return unshare_constructor (result);
 }
 
 /* FIXME speed this up, it's taking 16% of compile time on sieve testcase.  */
@@ -1880,7 +1919,7 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
  /* Append the element we want to insert.  */
  ++middle;
  e.index = dindex;
- e.value = unshare_expr (elt.value);
+ e.value = unshare_constructor (elt.value);
  vec_safe_insert (CONSTRUCTOR_ELTS (ary), middle, e);
}
  else
@@ -1896,7 +1935,7 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
e.index = hi;
  else
e.index = build2 (RANGE_EXPR, sizetype, new_lo, hi);
- e.value = unshare_expr (elt.value);
+ e.value = unshare_constructor (elt.value);
  vec_safe_insert (CONSTRUCTOR_ELTS (ary), middle+1, e);
}
}
@@ -2565,7 +2604,7 @@ cxx_eval_vec_init_1 (const constexpr_ctx *ctx, tree atype, tree init,
  for (i = 1; i < max; ++i)
{
  idx = build_int_cst (size_type_node, i);
-

patch for PR70398

2016-04-06 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70398

  The patch was successfully tested and bootstrapped on x86_64 and aarch64.

  Committed as rev. 234792



Index: ChangeLog
===
--- ChangeLog	(revision 234791)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-04-06  Vladimir Makarov  
+
+	PR rtl-optimization/70398
+	* lra-constraints.c (process_address_1): Check zero scale and code
+	for reloading with zero scale.
+
 2016-04-06  Uros Bizjak  
 
 	* config/i386/sse.md (shuffletype): Add V32HI and V4TI modes.
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 234791)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2016-04-06  Vladimir Makarov  
+
+	PR rtl-optimization/70398
+	* testsuite/gcc.target/aarch64/pr70398.c: New.
+
 2016-04-06  Eric Botcazou  
 
 	* gcc.c-torture/execute/20101011-1.c (__VISIUM__): Set DO_TEST to 0.
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 234780)
+++ lra-constraints.c	(working copy)
@@ -2914,6 +2914,7 @@ process_address_1 (int nop, bool check_o
 {
   struct address_info ad;
   rtx new_reg;
+  HOST_WIDE_INT scale;
   rtx op = *curr_id->operand_loc[nop];
   const char *constraint = curr_static_id->operand[nop].constraint;
   enum constraint_num cn = lookup_constraint (constraint);
@@ -3161,14 +3162,14 @@ process_address_1 (int nop, bool check_o
   *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg),
    new_reg, *ad.index);
 }
-  else if (get_index_scale (&ad) == 1)
+  else if ((scale = get_index_scale (&ad)) == 1)
 {
   /* The last transformation to one reg will be made in
 	 curr_insn_transform function.  */
   end_sequence ();
   return false;
 }
-  else
+  else if (scale != 0)
 {
   /* base + scale * index => base + new_reg,
 	 case (1) above.
@@ -3180,6 +3181,17 @@ process_address_1 (int nop, bool check_o
   *ad.inner = simplify_gen_binary (PLUS, GET_MODE (new_reg),
    *ad.base_term, new_reg);
 }
+  else
+{
+  enum reg_class cl = base_reg_class (ad.mode, ad.as,
+	  SCRATCH, SCRATCH);
+  rtx addr = *ad.inner;
+  
+  new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "addr");
+  /* addr => new_base.  */
+  lra_emit_move (new_reg, addr);
+  *ad.inner = new_reg;
+}
   *before = get_insns ();
   end_sequence ();
   return true;
Index: testsuite/gcc.target/aarch64/pr70398.c
===
--- testsuite/gcc.target/aarch64/pr70398.c	(revision 0)
+++ testsuite/gcc.target/aarch64/pr70398.c	(working copy)
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-O -fno-tree-loop-optimize -fno-tree-ter -static" } */
+unsigned int in[8 * 8] =
+  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
+45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63 };
+
+unsigned char out[8 * 8];
+
+int
+main (void)
+{
+  int i;
+  for (i = 0; i < 8 * 4; i++)
+{
+  out[i * 2] = (unsigned char) in[i * 2] + 1;
+  out[i * 2 + 1] = (unsigned char) in[i * 2 + 1] + 2;
+}
+  __asm__("":::"memory");
+  for (i = 0; i < 8 * 4; i++)
+{
+  if (out[i * 2] != in[i * 2] + 1
+	  || out[i * 2 + 1] != in[i * 2 + 1] + 2)
+	__builtin_abort ();
+}
+  return 0;
+}


Re: [PATCH] Avoid needless unsharing during constexpr evaluation (PR c++/70452)

2016-04-06 Thread Richard Biener
On April 6, 2016 6:51:40 PM GMT+02:00, Patrick Palka  
wrote:
>During constexpr evaluation we unconditionally call unshare_expr in a
>bunch of places to ensure that CONSTRUCTORs (and their
>CONSTRUCTOR_ELTS)
>don't get shared.  But as far as I can tell, we don't have any reason
>to
>call unshare_expr on non-CONSTRUCTORs, and a CONSTRUCTOR will never be
>an operand of a non-CONSTRUCTOR tree.  Assuming these two things are
>true, then I think we can safely restrict the calls to unshare_expr to
>only CONSTRUCTOR trees. Doing so saves 50MB of peak memory usage in the
>test case in the PR (bringing memory usage down to 4.9 levels).
>
>This patch takes this idea a bit further and implements a custom
>unshare_constructor procedure that recursively unshares just
>CONSTRUCTORs and their CONSTRUCTOR elements.  This is in contrast to
>unshare_expr which unshares even non-CONSTRUCTOR elements of a
>CONSTRUCTOR.  unshare_constructor also has an assert which verifies
>that
>there really is no CONSTRUCTOR subexpression inside a non-CONSTRUCTOR
>tree.  So far I haven't been able to get this assert to trigger which
>makes me reasonably confident about this optimization.

At least the middle end uses CONSTRUCTOR to build vectors from components which 
can then of course be operands of expressions.

Richard.

>Does this look OK to commit after bootstrap + regtesting?
>
>gcc/cp/ChangeLog:
>
>   PR c++/70452
>   * constexpr.c (not_a_constructor): New function.
>   (unshare_constructor): New function.
>   (cxx_eval_call_expression): Use unshare_constructor instead of
>   unshare_expr.
>   (find_array_ctor_elt): Likewise.
>   (cxx_eval_vec_init_1): Likewise.
>   (cxx_eval_store_expression): Likewise.
>   (cxx_eval_constant_expression): Likewise.
>---
>gcc/cp/constexpr.c | 55
>++
> 1 file changed, 47 insertions(+), 8 deletions(-)
>
>diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
>index 1c2701b..5d33a11 100644
>--- a/gcc/cp/constexpr.c
>+++ b/gcc/cp/constexpr.c
>@@ -1151,6 +1151,45 @@ adjust_temp_type (tree type, tree temp)
>   return cp_fold_convert (type, temp);
> }
> 
>+/* Callback for walk_tree used by unshare_constructor.  */
>+
>+static tree
>+not_a_constructor (tree *tp, int *walk_subtrees, void *)
>+{
>+  if (TYPE_P (*tp))
>+*walk_subtrees = 0;
>+  gcc_assert (TREE_CODE (*tp) != CONSTRUCTOR);
>+  return NULL_TREE;
>+}
>+
>+/* If T is a CONSTRUCTOR, return an unshared copy of T.  Otherwise
>+   return T.  */
>+
>+static tree
>+unshare_constructor (tree t)
>+{
>+  if (TREE_CODE (t) == CONSTRUCTOR)
>+{
>+  tree r;
>+
>+  r = copy_node (t);
>+  CONSTRUCTOR_ELTS (r) = vec_safe_copy (CONSTRUCTOR_ELTS (t));
>+
>+  /* Unshare any of its elements that also happen to be
>CONSTRUCTORs.  */
>+  for (unsigned idx = 0;
>+ idx < vec_safe_length (CONSTRUCTOR_ELTS (r)); idx++)
>+  CONSTRUCTOR_ELT (r, idx)->value
>+= unshare_constructor (CONSTRUCTOR_ELT (r, idx)->value);
>+
>+  return r;
>+}
>+
>+  /* If T is not itself a CONSTRUCTOR then we don't expect it to
>contain
>+ any CONSTRUCTOR subexpressions.  */
>+  walk_tree_without_duplicates (&t, not_a_constructor, NULL);
>+  return t;
>+}
>+
> /* Subroutine of cxx_eval_call_expression.
>We are processing a call expression (either CALL_EXPR or
>AGGR_INIT_EXPR) in the context of CTX.  Evaluate
>@@ -1454,7 +1493,7 @@ cxx_eval_call_expression (const constexpr_ctx
>*ctx, tree t,
> tree arg = TREE_VALUE (bound);
> gcc_assert (DECL_NAME (remapped) == DECL_NAME (oparm));
> /* Don't share a CONSTRUCTOR that might be changed.  */
>-arg = unshare_expr (arg);
>+arg = unshare_constructor (arg);
> ctx->values->put (remapped, arg);
> bound = TREE_CHAIN (bound);
> remapped = DECL_CHAIN (remapped);
>@@ -1534,7 +1573,7 @@ cxx_eval_call_expression (const constexpr_ctx
>*ctx, tree t,
> }
> 
>   pop_cx_call_context ();
>-  return unshare_expr (result);
>+  return unshare_constructor (result);
> }
> 
>/* FIXME speed this up, it's taking 16% of compile time on sieve
>testcase.  */
>@@ -1880,7 +1919,7 @@ find_array_ctor_elt (tree ary, tree dindex, bool
>insert = false)
> /* Append the element we want to insert.  */
> ++middle;
> e.index = dindex;
>-e.value = unshare_expr (elt.value);
>+e.value = unshare_constructor (elt.value);
> vec_safe_insert (CONSTRUCTOR_ELTS (ary), middle, e);
>   }
> else
>@@ -1896,7 +1935,7 @@ find_array_ctor_elt (tree ary, tree dindex, bool
>insert = false)
>   e.index = hi;
> else
>   e.index = build2 (RANGE_EXPR, sizetype, new_lo, hi);
>-e.value = unshare_expr (elt.value);
>+e.value = unshare_constructor (elt.value);
> 

Re: [PATCH] Avoid needless unsharing during constexpr evaluation (PR c++/70452)

2016-04-06 Thread Patrick Palka
On Wed, Apr 6, 2016 at 1:17 PM, Richard Biener
 wrote:
> On April 6, 2016 6:51:40 PM GMT+02:00, Patrick Palka  
> wrote:
>>During constexpr evaluation we unconditionally call unshare_expr in a
>>bunch of places to ensure that CONSTRUCTORs (and their
>>CONSTRUCTOR_ELTS)
>>don't get shared.  But as far as I can tell, we don't have any reason
>>to
>>call unshare_expr on non-CONSTRUCTORs, and a CONSTRUCTOR will never be
>>an operand of a non-CONSTRUCTOR tree.  Assuming these two things are
>>true, then I think we can safely restrict the calls to unshare_expr to
>>only CONSTRUCTOR trees. Doing so saves 50MB of peak memory usage in the
>>test case in the PR (bringing memory usage down to 4.9 levels).
>>
>>This patch takes this idea a bit further and implements a custom
>>unshare_constructor procedure that recursively unshares just
>>CONSTRUCTORs and their CONSTRUCTOR elements.  This is in contrast to
>>unshare_expr which unshares even non-CONSTRUCTOR elements of a
>>CONSTRUCTOR.  unshare_constructor also has an assert which verifies
>>that
>>there really is no CONSTRUCTOR subexpression inside a non-CONSTRUCTOR
>>tree.  So far I haven't been able to get this assert to trigger which
>>makes me reasonably confident about this optimization.
>
> At least the middle end uses CONSTRUCTOR to build vectors from components 
> which can then of course be operands of expressions.

I see... I was assuming that the expressions passed to unshare_expr
would be more or less normalized but of course if the expression
involves a non-constant operand then not much normalization can be
done. So the patch ICEs on the following test case because we don't
normalize  = g (X { }) to  = 5 since g is not
constexpr.  So we end up calling unshare_constructor on this CALL_EXPR
whose argument is a CONSTRUCTOR.

struct X { };

constexpr int
foo (int (*f) (X))
{
  return f (X { });
}

int
g (X)
{
  return 5;
}

int a = foo (g);

So the added assert, and probably this patch, are wrong.  Hmm...

>
> Richard.
>
>>Does this look OK to commit after bootstrap + regtesting?
>>
>>gcc/cp/ChangeLog:
>>
>>   PR c++/70452
>>   * constexpr.c (not_a_constructor): New function.
>>   (unshare_constructor): New function.
>>   (cxx_eval_call_expression): Use unshare_constructor instead of
>>   unshare_expr.
>>   (find_array_ctor_elt): Likewise.
>>   (cxx_eval_vec_init_1): Likewise.
>>   (cxx_eval_store_expression): Likewise.
>>   (cxx_eval_constant_expression): Likewise.
>>---
>>gcc/cp/constexpr.c | 55
>>++
>> 1 file changed, 47 insertions(+), 8 deletions(-)
>>
>>diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
>>index 1c2701b..5d33a11 100644
>>--- a/gcc/cp/constexpr.c
>>+++ b/gcc/cp/constexpr.c
>>@@ -1151,6 +1151,45 @@ adjust_temp_type (tree type, tree temp)
>>   return cp_fold_convert (type, temp);
>> }
>>
>>+/* Callback for walk_tree used by unshare_constructor.  */
>>+
>>+static tree
>>+not_a_constructor (tree *tp, int *walk_subtrees, void *)
>>+{
>>+  if (TYPE_P (*tp))
>>+*walk_subtrees = 0;
>>+  gcc_assert (TREE_CODE (*tp) != CONSTRUCTOR);
>>+  return NULL_TREE;
>>+}
>>+
>>+/* If T is a CONSTRUCTOR, return an unshared copy of T.  Otherwise
>>+   return T.  */
>>+
>>+static tree
>>+unshare_constructor (tree t)
>>+{
>>+  if (TREE_CODE (t) == CONSTRUCTOR)
>>+{
>>+  tree r;
>>+
>>+  r = copy_node (t);
>>+  CONSTRUCTOR_ELTS (r) = vec_safe_copy (CONSTRUCTOR_ELTS (t));
>>+
>>+  /* Unshare any of its elements that also happen to be
>>CONSTRUCTORs.  */
>>+  for (unsigned idx = 0;
>>+ idx < vec_safe_length (CONSTRUCTOR_ELTS (r)); idx++)
>>+  CONSTRUCTOR_ELT (r, idx)->value
>>+= unshare_constructor (CONSTRUCTOR_ELT (r, idx)->value);
>>+
>>+  return r;
>>+}
>>+
>>+  /* If T is not itself a CONSTRUCTOR then we don't expect it to
>>contain
>>+ any CONSTRUCTOR subexpressions.  */
>>+  walk_tree_without_duplicates (&t, not_a_constructor, NULL);
>>+  return t;
>>+}
>>+
>> /* Subroutine of cxx_eval_call_expression.
>>We are processing a call expression (either CALL_EXPR or
>>AGGR_INIT_EXPR) in the context of CTX.  Evaluate
>>@@ -1454,7 +1493,7 @@ cxx_eval_call_expression (const constexpr_ctx
>>*ctx, tree t,
>> tree arg = TREE_VALUE (bound);
>> gcc_assert (DECL_NAME (remapped) == DECL_NAME (oparm));
>> /* Don't share a CONSTRUCTOR that might be changed.  */
>>-arg = unshare_expr (arg);
>>+arg = unshare_constructor (arg);
>> ctx->values->put (remapped, arg);
>> bound = TREE_CHAIN (bound);
>> remapped = DECL_CHAIN (remapped);
>>@@ -1534,7 +1573,7 @@ cxx_eval_call_expression (const constexpr_ctx
>>*ctx, tree t,
>> }
>>
>>   pop_cx_call_context ();
>>-  return unshare_expr (result);
>>+  return unshare_constructor (result);
>> }
>>
>>/* FIXME speed this up, it's taking 16% of compile time on sieve
>>testcase.  */
>>@@ -1880,7 

Re: [PATCH: RL78] Optimize libgcc routines using clrw and clrb

2016-04-06 Thread DJ Delorie

Kaushik Phatak  writes:
> 2016-04-06  Kaushik Phatak 
>
>   * config/rl78/bit-count.S: Use clrw/clrb where possible.
>   * config/rl78/cmpsi2.S: Likewise.
>   * config/rl78/divmodhi.S: Likewise.
>   * config/rl78/divmodsi.S: Likewise.
>   * config/rl78/fpbit-sf.S: Likewise.
>   * config/rl78/fpmath-sf.S: Likewise.
>   * config/rl78/mulsi3.S: Likewise.

This patch is fine, please apply once gcc is in stage 1 again.

Thanks!


Re: [PATCH, i386] Add some missing modes to mode attributes

2016-04-06 Thread Uros Bizjak
On Wed, Apr 6, 2016 at 5:55 PM, H.J. Lu  wrote:
> On Wed, Apr 6, 2016 at 8:11 AM, Uros Bizjak  wrote:
>> ... so there are no unresolved attributes in insn-output.c.
>>
>> 2016-04-06  Uros Bizjak  
>>
>> * config/i386/sse.md (shuffletype): Add V32HI and V4TI modes.
>> (ssescalarsize): Add V8SF, V4SF, V4DF and V2DF modes.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>> Committed to mainline SVN and gcc-5 branch.
>>
>
> Could you also add V4TI, V2TI and V1TI to ssescalarsize?

They are not used currently, so there is no pressing need for these
modes. But they can be trivially added at any time.

BTW: It would be nice if the macroization machinery warned when there
are unresolved mode attributes in the source.

Uros.


Re: openacc reference reductions

2016-04-06 Thread Cesar Philippidis
On 04/06/2016 07:23 AM, Jakub Jelinek wrote:
> On Tue, Apr 05, 2016 at 06:53:47PM -0700, Cesar Philippidis wrote:
>> --- a/gcc/omp-low.c
>> +++ b/gcc/omp-low.c
>> @@ -309,6 +309,25 @@ is_oacc_kernels (omp_context *ctx)
>>== GF_OMP_TARGET_KIND_OACC_KERNELS));
>>  }
>>  
>> +/* Return true if CTX corresponds to an oacc parallel region and if
>> +   VAR is used in a reduction.  */
>> +
>> +static bool
>> +is_oacc_parallel_reduction (tree var, omp_context *ctx)
>> +{
>> +  if (!is_oacc_parallel (ctx))
>> +return false;
>> +
>> +  tree clauses = gimple_omp_target_clauses (ctx->stmt);
>> +
>> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
>> +if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
>> +&& OMP_CLAUSE_DECL (c) == var)
>> +  return true;
>> +
>> +  return false;
>> +}
>> +
>>  /* If DECL is the artificial dummy VAR_DECL created for non-static
>> data member privatization, return the underlying "this" parameter,
>> otherwise return NULL.  */
>> @@ -2122,7 +2141,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
>>else
>>  install_var_field (decl, true, 3, ctx,
>> base_pointers_restrict);
>> -  if (is_gimple_omp_offloaded (ctx->stmt))
>> +  if (is_gimple_omp_offloaded (ctx->stmt)
>> +  && !is_oacc_parallel_reduction (decl, ctx))
>>  install_var_local (decl, ctx);
>>  }
>>  }
> 
> The above is O(n^2) in number of clauses on the construct.
> Perhaps better define some OMP_CLAUSE_MAP_IN_REDUCTION macro (e.g.
> TREE_PRIVATE bit is unused on OMP_CLAUSE_MAP right now), make sure to set it
> e.g. during gimplification where you can see all GOVD_* flags for a
> particular decl), and then use this flag here?

That's a good idea. I went ahead and combined this patch with the data
map reduction fix for PR70289 that I posted on Monday,
<https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00202.html>, because I'm
already scanning for parallel reduction data clauses in there. As you
suggested, I introduced an OMP_CLAUSE_MAP_IN_REDUCTION macro to the data
clauses associated with acc parallel reductions.
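
The trade-off discussed above (rescanning the clause list for every decl versus reading a precomputed flag) can be illustrated with a small toy sketch. Nothing here is GCC code: Clause, ClauseCode, mark_maps_in_reductions and is_parallel_reduction_sketch are hypothetical names, and the hash set used for the marking pass is an assumption of this sketch, not what the gimplifier does. The point is only that the flag is computed once and each later query is a constant-time read.

```cpp
// Toy model only: all names are hypothetical and unrelated to the real
// OMP_CLAUSE_* accessors in GCC.
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

enum class ClauseCode { Map, Reduction };

struct Clause {
  ClauseCode code;
  std::string decl;       // stands in for the clause's decl
  bool map_in_reduction;  // stands in for the proposed flag bit
};

// Marking pass: collect the decls named in reduction clauses, then flag
// every map clause whose decl is among them.  Two linear passes instead
// of one full scan per map clause.
void mark_maps_in_reductions(std::vector<Clause> &clauses) {
  std::unordered_set<std::string> reduced;
  for (const Clause &c : clauses)
    if (c.code == ClauseCode::Reduction)
      reduced.insert(c.decl);

  for (Clause &c : clauses) {
    if (c.code != ClauseCode::Map) {
      assert(!c.map_in_reduction);  // the flag is only meaningful on map clauses
      continue;
    }
    c.map_in_reduction = reduced.count(c.decl) != 0;
  }
}

// Later query: a constant-time flag read, with no rescan of the list.
bool is_parallel_reduction_sketch(const Clause &map_clause) {
  return map_clause.map_in_reduction;
}
```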

Is this patch OK for trunk? It fixes PR70289, PR70348, PR70373, PR70533,
PR70535 and PR70537.

Cesar


pr70533-20160406.diff.gz
Description: application/gzip
2016-04-06  Cesar Philippidis  

	PR lto/70289
	gcc/
	* gimplify.c (gimplify_adjust_acc_parallel_reductions): New function.
	(gimplify_omp_workshare): Call it.  Add new data clauses for acc
	parallel reductions as needed.
	* omp-low.c (is_oacc_parallel_reduction): New function.
	(scan_sharing_clauses): Use it to prevent installing local variables
	for those used in acc parallel reductions.
	(lower_rec_input_clauses): Remove dead code.
	(lower_oacc_reductions): Add support for reference reductions.
	(lower_reduction_clauses): Remove dead code.
	(lower_omp_target): Don't remap variables appearing in acc parallel
	reductions.
	* gcc/tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): New macro.

	gcc/testsuite/
	* c-c++-common/goacc/reduction-5.c: New test.
	* c-c++-common/goacc/reduction-promotions.c: New test.
	* gfortran.dg/goacc/reduction-3.f95: New test.
	* gfortran.dg/goacc/reduction-promotions.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gang-np-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gw-np-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-worker-p-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-1.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-2.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c: New
	test.
	* testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Increase
	test coverage.
	* testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise.
	* t

Re: [PATCH] Avoid needless unsharing during constexpr evaluation (PR c++/70452)

2016-04-06 Thread Patrick Palka
On Wed, 6 Apr 2016, Patrick Palka wrote:

> On Wed, Apr 6, 2016 at 1:17 PM, Richard Biener
>  wrote:
> > On April 6, 2016 6:51:40 PM GMT+02:00, Patrick Palka  
> > wrote:
> >>During constexpr evaluation we unconditionally call unshare_expr in a
> >>bunch of places to ensure that CONSTRUCTORs (and their
> >>CONSTRUCTOR_ELTS) don't get shared.  But as far as I can tell, we
> >>don't have any reason to call unshare_expr on non-CONSTRUCTORs, and a
> >>CONSTRUCTOR will never be an operand of a non-CONSTRUCTOR tree.
> >>Assuming these two things are true, then I think we can safely
> >>restrict the calls to unshare_expr to only CONSTRUCTOR trees.  Doing
> >>so saves 50MB of peak memory usage in the test case in the PR
> >>(bringing memory usage down to 4.9 levels).
> >>
> >>This patch takes this idea a bit further and implements a custom
> >>unshare_constructor procedure that recursively unshares just
> >>CONSTRUCTORs and their CONSTRUCTOR elements.  This is in contrast to
> >>unshare_expr which unshares even non-CONSTRUCTOR elements of a
> >>CONSTRUCTOR.  unshare_constructor also has an assert which verifies
> >>that there really is no CONSTRUCTOR subexpression inside a
> >>non-CONSTRUCTOR tree.  So far I haven't been able to get this assert
> >>to trigger which makes me reasonably confident about this
> >>optimization.
> >
> > At least the middle end uses CONSTRUCTOR to build vectors from components 
> > which can then of course be operands of expressions.
> 
> I see... I was assuming that the expressions passed to unshare_expr
> would be more or less normalized but of course if the expression
> involves a non-constant operand then not much normalization can be
> done. So the patch ICEs on the following test case because we don't
> normalize  = g (X { }) to  = 5 since g is not
> constexpr.  So we end up calling unshare_constructor on this CALL_EXPR
> whose argument is a CONSTRUCTOR.
> 
> struct X { };
> 
> constexpr int
> foo (int (*f) (X))
> {
>   return f (X { });
> }
> 
> int
> g (X)
> {
>   return 5;
> }
> 
> int a = foo (g);
> 
> So the added assert, and probably this patch, are wrong.  Hmm...

Here is a safer and simpler approach that just walks the expression
being unshared to try to find a CONSTRUCTOR node.  If it finds one, then
we unshare the whole expression.  Otherwise we return the original
expression.  It should be completely safe to avoid unsharing an
expression if it contains no CONSTRUCTOR nodes.
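
As a rough illustration of the idea outside of GCC, here is a minimal
self-contained sketch.  Everything in it is hypothetical: Node, Code,
contains_constructor and deep_copy are toy stand-ins, not GCC's tree,
walk_tree or unshare_expr, and deep_copy copies every node, which the
real unshare_expr does not necessarily do.  The only name it borrows
from the patch is unshare_constructor.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy stand-in for GCC's tree codes; purely illustrative.
enum class Code { Constant, Call, Constructor };

// Toy expression node.  Children may be shared between parents.
struct Node {
  Code code = Code::Constant;
  int value = 0;                                // payload for Constant nodes
  std::vector<std::shared_ptr<Node>> operands;  // subexpressions
};

using NodePtr = std::shared_ptr<Node>;

// Return true if T or any subexpression of T is a Constructor node.
bool contains_constructor(const NodePtr &t) {
  if (!t)
    return false;
  if (t->code == Code::Constructor)
    return true;
  for (const NodePtr &op : t->operands)
    if (contains_constructor(op))
      return true;
  return false;
}

// Deep copy of T.  Plays the role of unshare_expr in this toy model.
NodePtr deep_copy(const NodePtr &t) {
  if (!t)
    return nullptr;
  auto copy = std::make_shared<Node>();
  copy->code = t->code;
  copy->value = t->value;
  for (const NodePtr &op : t->operands)
    copy->operands.push_back(deep_copy(op));
  return copy;
}

// Copy T only when it contains a Constructor; otherwise return T itself,
// so expressions without a Constructor cost no extra memory.
NodePtr unshare_constructor(const NodePtr &t) {
  return contains_constructor(t) ? deep_copy(t) : t;
}
```

In this model a Call node whose argument is a Constructor (the shape of
the f (X { }) test case above) is copied as a whole, while a plain
Constant is returned unchanged.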

-- >8 --

gcc/cp/ChangeLog:

	PR c++/70452
	* constexpr.c (find_constructor): New function.
	(unshare_constructor): New function.
	(cxx_eval_call_expression): Use unshare_constructor instead of
	unshare_expr.
	(find_array_ctor_elt): Likewise.
	(cxx_eval_vec_init_1): Likewise.
	(cxx_eval_store_expression): Likewise.
	(cxx_eval_constant_expression): Likewise.
---
 gcc/cp/constexpr.c | 40 
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 1c2701b..5bccdec 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1151,6 +1151,30 @@ adjust_temp_type (tree type, tree temp)
   return cp_fold_convert (type, temp);
 }
 
+/* Callback for walk_tree used by unshare_constructor.  */
+
+static tree
+find_constructor (tree *tp, int *walk_subtrees, void *)
+{
+  if (TYPE_P (*tp))
+    *walk_subtrees = 0;
+  if (TREE_CODE (*tp) == CONSTRUCTOR)
+    return *tp;
+  return NULL_TREE;
+}
+
+/* If T is a CONSTRUCTOR or an expression that has a CONSTRUCTOR node as a
+   subexpression, return an unshared copy of T.  Otherwise return T.  */
+
+static tree
+unshare_constructor (tree t)
+{
+  tree ctor = walk_tree (&t, find_constructor, NULL, NULL);
+  if (ctor != NULL_TREE)
+    return unshare_expr (t);
+  return t;
+}
+
 /* Subroutine of cxx_eval_call_expression.
We are processing a call expression (either CALL_EXPR or
AGGR_INIT_EXPR) in the context of CTX.  Evaluate
@@ -1454,7 +1478,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  tree arg = TREE_VALUE (bound);
  gcc_assert (DECL_NAME (remapped) == DECL_NAME (oparm));
  /* Don't share a CONSTRUCTOR that might be changed.  */
- arg = unshare_expr (arg);
+ arg = unshare_constructor (arg);
  ctx->values->put (remapped, arg);
  bound = TREE_CHAIN (bound);
  remapped = DECL_CHAIN (remapped);
@@ -1534,7 +1558,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 }
 
   pop_cx_call_context ();
-  return unshare_expr (result);
+  return unshare_constructor (result);
 }
 
 /* FIXME speed this up, it's taking 16% of compile time on sieve testcase.  */
@@ -1880,7 +1904,7 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert = false)
  /* Append the element we want to insert.  */
  ++middle;
  e.index = dindex;
- e

Re: [PATCH] PR47040 - Make error message for empty array constructor more helpful/correct

2016-04-06 Thread Steve Kargl
On Wed, Apr 06, 2016 at 05:44:55PM +0200, Dominique d'Humières wrote:
> Is the following patch OK (regtested on x86_64-apple-darwin15)? Should it be
> backported to the gcc-5 branch?

No and No.

-- 
Steve


Re: [PATCH] PR47040 - Make error message for empty array constructor more helpful/correct

2016-04-06 Thread Dominique d'Humières
Could you please elaborate?

Dominique

> On 7 Apr 2016, at 07:48, Steve Kargl wrote:
> 
> On Wed, Apr 06, 2016 at 05:44:55PM +0200, Dominique d'Humières wrote:
>> Is the following patch OK (regtested on x86_64-apple-darwin15)? Should it be
>> backported to the gcc-5 branch?
> 
> No and No.
> 
> -- 
> Steve